In a posting to
Jeffrey Mark Siskind offered two benchmarks as a challenge to the
functional programming community:
I have enclosed below two benchmarks that compute the same thing. One is written in a functional style and the other is written in an imperative style. The benchmarks are written in R4RS Scheme and implement a multidimensional clustering algorithm based on the EM algorithm.
Siskind reported that the functional version was 1.44 to 2.69 times as slow as the imperative version, depending on the implementation of Scheme. He continued:
So you may say, OK, all of the Scheme implementations that you tested are broken. Well, here is my challenge. Rewrite these benchmarks faithfully in your favourite language. Any language that supports both a functional and imperative style. And run them in your favourite implementation of that language. I predict that you will get similar results.
In each of the four implementations of Scheme that I tried, the imperative
version of Siskind's EM benchmark (
was at least twice as fast as the functional version (
The main reason for this discrepancy is that the functional version was
written in a very inefficient style that resulted in six times as many
procedure calls to top-level and anonymous procedures as in the imperative
For example, both
a procedure named
is coded using side effects in both
EM-FUN, this is reasonable because
DETERMINANT is being used as if it were a built-in library
procedure. What is not reasonable is that
do not use the same code for
(do ((i 0 (+ i 1))) ((= i n)) (do ((j 0 (+ j 1))) ((= j n)) (matrix-set! b i j (matrix-ref a i j))))the corresponding loops are implemented in
(for-each-n (lambda (i) (for-each-n (lambda (j) (matrix-set! b i j (matrix-ref a i j))) n)) n)Both of these code fragments perform exactly the same assignments, so this has nothing to do with imperative versus functional programming. All it shows is that efficient imperative code is faster than inefficient imperative code.
I cleaned up
EM-FUN to make it more faithful to the
algorithms and coding styles that were used in
As part of the cleanup, I moved primitives like
into a separate file that is shared by both the imperative and functional
versions. In Siskind's original code, the boundary between the primitives
and the code that used those primitives was indicated by the comment
;;; EMSome primitives, such as
DETERMINANT, had been implemented using side effects in both
EM-FUN, but the primitive itself had no externally visible side effects. Whenever that was the case, I changed Siskind's code so both
EM-FUNwould use the same implementation of the primitive. I never used side effects within a procedure unless Siskind's code for that procedure had used side effects. None of the non-primitive procedures in
EM-FUNuse side effects.
The cleaned-up version of the functional benchmark still performs about 40% more calls than the imperative version, but is only 5 to 10 per cent slower. I also benchmarked a version that uses hygienic macros to ensure that the most heavily used functions would be inlined (regardless of implementation), but this did not change the relative performance and had little impact on the absolute performance. The following table shows the relative speeds of the benchmarks in four implementations of Scheme, normalized to 100 for the original imperative version.
original cleaned-up with macros EM-IMP EM-FUN EM-IMP EM-FUN EM-IMP EM-FUN Chez Scheme v5.0a 100 42 101 91 106 99 Larceny v0.32 100 43 95 90 101 94 Gambit v2.7 100 38 99 89 MacScheme v4.2 100 49 101 97 103 98
MacScheme was run on an Apple PowerBook 170, which uses a 33 MHz Motorola 68030 with a 68882 floating point coprocessor. The other implementations were run on a SPARC Ultra 1. CPU time was reproducible to within about 5 per cent. The source code and procedure calling profiles are available online.
These results support Siskind's claim that no Scheme compiler generates as good code for functional programs as for imperative programs. They also show that the difference is substantially smaller than Siskind's original benchmarks appeared to suggest.
William D Clinger
Last updated 23 April 1998.