There have been various posts to reddit, and various discussions, in the last few days about slides from talks I've given on Supero. It's nice to see that people are interested, but some people have got slightly wrong impressions, either because the slides don't include what I say alongside them, or because the material has moved on since the talk was written. I wasn't going to post my revised paper (currently undergoing refereeing) - but since it addresses both those issues perfectly, I decided to put it on the project web page:
I'm also going to address some of the questions people might ask:
It is possible to write charcount 8 times faster with mmap
Yes. You are correct. It may even be quite a bit more than 8 times. However, the benchmark controls for this - all programs only use getchar. The idea is not to write the world's fastest letter counting program; use hFileSize or Data.ByteString for that. The idea is to have a benchmark with computation, where the non-computation part has been abstracted away so that C and Haskell can be compared as directly as possible.
It's also a useful micro-benchmark. It's not too long and complex. It's a real task that people can wrap their heads around. There is a nice gentle progression between char/line/word counting. There are lots of good reasons for picking these benchmarks, but don't think of them as being too important - think of them as "ah, that's cute".
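To give a flavour of the char/line/word progression, here is a minimal sketch in plain Haskell. It is not the code from the paper: the names countChars, countLines and countWords are made up for illustration, and it reads stdin lazily with getContents rather than going through getchar as the benchmark programs do.

```haskell
-- Illustrative only: these names are hypothetical, not taken from the paper.

-- Count every character in the input.
countChars :: String -> Int
countChars = length

-- Count newline characters, the way a line counter typically does.
countLines :: String -> Int
countLines = length . filter (== '\n')

-- Count whitespace-separated words.
countWords :: String -> Int
countWords = length . words

main :: IO ()
main = do
  input <- getContents  -- lazy read of stdin; an assumption of this sketch
  print (countChars input, countLines input, countWords input)
```

Each counter is a small pipeline of library functions, which is roughly the style of program an optimiser has to turn into a tight loop before it can compete with C.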
What's new with this paper?
If you've read the last paper, or seen a talk I've given, there are major differences in this paper. The largest change is that instead of waving my hands and trying to distract people with shiny things when the topic of non-termination comes up, I now have a good answer. The answer is homeomorphic embedding. I got this technique from Peter Jonsson, who in turn got it off Morten Sørensen and Robert Glück. It turns out my supervisor also used it in his thesis!
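For anyone who hasn't met homeomorphic embedding before, here is a minimal sketch of the textbook relation on a toy first-order term type. The Term type and the names embeds, dive, couple and whistle are assumptions made for this example; Supero works on a richer Core language, and the paper has the version it really uses.

```haskell
-- A deliberately tiny term language (an assumption for this sketch).
data Term
  = Var String         -- a variable
  | App String [Term]  -- a function or constructor applied to arguments
  deriving (Eq, Show)

-- s `embeds` t holds when s can be obtained from t by deleting parts of t.
embeds :: Term -> Term -> Bool
embeds s t = couple s t || dive s t

-- Diving: s is embedded in one of the arguments of t.
dive :: Term -> Term -> Bool
dive s (App _ ts) = any (embeds s) ts
dive _ _          = False

-- Coupling: same root symbol, and the arguments embed pairwise.
-- All variables are treated as embedding in each other.
couple :: Term -> Term -> Bool
couple (Var _)    (Var _)    = True
couple (App f ss) (App g ts) =
  f == g && length ss == length ts && and (zipWith embeds ss ts)
couple _ _ = False

-- A hypothetical "whistle": stop when some earlier term embeds in the new one.
whistle :: [Term] -> Term -> Bool
whistle history t = any (`embeds` t) history

main :: IO ()
main = do
  print (App "f" [Var "x"] `embeds` App "f" [App "g" [Var "y"]])  -- True
  print (App "f" [App "g" [Var "y"]] `embeds` App "f" [Var "x"])  -- False
```

The reason this helps with termination is that, over a finite set of symbols, any infinite sequence of terms must eventually contain a term that embeds an earlier one, so a check like whistle is guaranteed to fire at some point.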
There are other differences all over the place. I've managed to get results for most of the imaginary section of the nofib suite (12 benchmarks). I've also come up with a new generalisation strategy, have a better story of what to do with let expressions, and have tried to make my optimisation better defined.
I want the compilation times to be faster, the results to be better, and to have results on more benchmarks. Compilation times are between 10 seconds and 5 minutes on a normal computer. Unfortunately my home machine isn't normal: it's 8 years old. I am hoping to improve the compilation speed to about the 2 second level, then I am going to do lots more benchmarks and investigate those where the optimisation doesn't do as well as I was hoping.