Haskell vs. Lisp (Tail-Recursion)

Some Haskell fans seem impressed with better performance for a fibonacci function compared with similar implementations in Ruby and Python. They should be. I may be turning into a Haskell fan myself actually. Here’s why …
Read this and this before going on.
Impressive. Yea I thought so too until I implemented the same fib function in Lisp and ran it on SBCL.
Here it is…

(defun print-fib-trec-naive (start end)
  (format t "n=~D => ~D~%" start (fib-trec start))
  (if (< start end)
      (print-fib-trec-naive (+ start 1) end))))
(defun fib-trec-naive (n)
  "Naive tail-recursive Fibonacci number function"
  (if (or (= n 0) (= n 1))
      (+ (fib-trec-naive (- n 1)) (fib-trec-naive (- n 2)))))

and here are comparable results for N=45.

* (time (print-fib-trec-naive 0 45))
n=43 => 433494437
n=44 => 701408733
n=45 => 1134903170
Evaluation took:
  0.001 seconds of real time
  8.66e-4 seconds of user run time
  4.4e-5 seconds of system run time

My machine is a core2 duo macbook. This runs on a single core.
Now it gets even better. Here are the results for N=4500

* (time (print-fib-trec-naive 0 4500))
Evaluation took:
  8.698 seconds of real time
  4.731282 seconds of user run time
  0.361252 seconds of system run time

Yes it only took under 9 seconds running in slime/emacs. Would probably be closer to 8 seconds on the command line.
Don : You are completely right that using multiple cores requires minimal code changes in Haskell.
Very nice !
BTW – welcome to the NW !
The Haskell version is faster than a C version.
It’s also more readable beyond a trivial C implementation like this —>

#include <stdio.h>
unsigned long long fib_trec(unsigned long long n) {
  if (n == 0 || n == 1) return n;
  return fib_trec((n - 1)) + fib_trec((n - 2));
int main() {
  unsigned long long i;    // assuming 64-bit machine
  for (i=0; i <= 45; i++) {
    printf("n=%lld => %lld\n", i, fib_trec(i));
  return 0;
$ time bin/fib-naive
real    0m40.668s
user    0m40.238s
sys     0m0.103s

But why is the Lisp version so much faster than Haskell and even C ?!?
Many Common Lisp compilers implement tail-call optimization(which is not required by the spec) which is useful. It seems SBCL’s compiler which implements tail-call optimization additionally optimizes the algorithm. In this case the original time O(n^2) becomes O(n). Whether it’s implemented naively or with an improved algorithm(shown below) the end result is the same. Personally I’d prefer that this type of optimization be made more explicit or better yet just not implemented.
The equivalent optimized algorithm in Lisp :

(defun print-fib-trec (start end)
  (format t "n=~D => ~D~%" start (fib-trec start))
  (if (< start end)
      (print-fib-trec (+ start 1) end)))
(defun fib-trec (n)
  "Tail-recursive Fibonacci number function"
  (labels ((calc-fib (n a b)
             (if (= n 0)
                 (calc-fib (- n 1) b (+ a b)))))
    (calc-fib n 0 1)))

Anyone care to post a tail-recursive Haskell version ?