I was having performance problems with make-pl. It took a full second to plan a build distributed across ten directories. This was small compared to the time the build actually took, but it would take the whole second to plan even if it determined that it didn't need to build anything. At first I thought that this sloth was the inevitable result of using an interpreted language instead of a compiled one, but I set out to see if I could squeeze a bit more speed out of Perl.
My first angle of attack was to replace hashes as my primary data structure with arrays, because hashes are somewhat slower than arrays, and are known to be one of the main things that slow scripting languages down. However, this made no noticeable difference in speed.
My next candidate was the heavy use of calls to chdir, since each one requires a kernel syscall. So I built a test program that did nothing but change directories. Then I implemented an optimization: whenever the program went to change directories, it would skip the call to chdir if the current working directory was already the new one. Surprisingly, this optimization made the test program multiple orders of magnitude slower!
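A test along these lines can be sketched as follows. This is not the original test program, just a minimal reconstruction of the idea: the target directory, the iteration count, and the helper names (always_chdir, guarded_chdir, time_it) are all assumptions made for illustration. The numbers it prints will vary with the Perl version and platform.

```perl
use strict;
use warnings;
use Cwd qw(cwd);
use File::Spec;
use Time::HiRes qw(time);

# Hypothetical target: any existing directory will do.
my $target = File::Spec->tmpdir;

# Plain version: always make the chdir syscall.
sub always_chdir {
    chdir $target or die "chdir $target: $!";
}

# "Optimized" version: ask Cwd for the current directory first and
# skip the chdir if we already appear to be there.
sub guarded_chdir {
    return if cwd() eq $target;
    chdir $target or die "chdir $target: $!";
}

# Average wall-clock seconds per call over $n calls.
sub time_it {
    my ($n, $code) = @_;
    my $start = time();
    $code->() for 1 .. $n;
    return (time() - $start) / $n;
}

my $n = 200;    # kept small in case cwd is expensive on this system
printf "always chdir:  %.9f s per call\n", time_it($n, \&always_chdir);
printf "guarded chdir: %.9f s per call\n", time_it($n, \&guarded_chdir);
```

The string comparison in guarded_chdir is deliberately naive: if the target path goes through a symlink, cwd may report a different spelling of the same directory and the guard will never fire. That does not matter for the measurement, since the cost of calling cwd is paid either way.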
It turns out that calling Cwd::cwd (the portable way to get the current directory in Perl) is far, far slower than calling chdir. This is because cwd actually spawns a whole new process to find the current directory. The Cwd module also provides a function called fastcwd, with the warning "It might conceivably chdir you out of a directory that it can't chdir you back into." Needless to say, even fastcwd is much slower than a single chdir. With this new information, I made make-pl use its own variable to keep track of the current directory, reducing the number of calls to cwd to one. And thus the time it took to plan a build dropped from one second to a tenth of a second.
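The tracking approach can be sketched roughly like this. It is not make-pl's actual code: the names $current_dir, $chdir_calls, and change_dir are hypothetical, and the comparison is a plain string match on the path, so it assumes nothing else in the program calls chdir behind its back and that the same directory is always spelled the same way (no symlinks or ".." components).

```perl
use strict;
use warnings;
use Cwd qw(cwd);
use File::Spec;

# The one and only call to cwd, made once at startup.
my $current_dir = cwd();

# Counter for illustration only: how many real chdir calls were made.
my $chdir_calls = 0;

# Change to $dir unless the tracked directory says we are already there.
# Returns true on success, false if chdir fails (with $! set).
sub change_dir {
    my ($dir) = @_;

    # Resolve relative paths against the tracked directory, not cwd.
    my $abs = File::Spec->rel2abs($dir, $current_dir);

    return 1 if $abs eq $current_dir;    # already there: no syscall

    chdir $abs or return 0;
    $chdir_calls++;
    $current_dir = $abs;
    return 1;
}

# Usage: the second call is answered from the variable alone.
my $tmp = File::Spec->tmpdir;
change_dir($tmp) or die "cannot chdir to $tmp: $!";
change_dir($tmp) or die "cannot chdir to $tmp: $!";
print "now in $current_dir after $chdir_calls chdir call(s)\n";
```

The design trade-off is that the variable is only as trustworthy as the discipline around it: every directory change in the program has to go through the one wrapper, or the cached value goes stale.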