This page collects a large set of relevant papers on the subject of compiling call-by-value functional programming languages — specifically papers relevant to the topics we'll be covering in this semester's class. Besides bibliographic citations, I've also included short descriptions of the papers, including my personal editorialising. Note that this list is idiosyncratic — the papers that I, personally, have read and which have influenced my understanding of this area. This means that there are clearly many other uncited but relevant works not represented here, due to my bounded exposure to the field.
I've made no effort to keep this list manageable. You may browse among these papers as your interests direct you. For your future reading, I've also provided the entire collection as a single tar ball
These works treat the λ calculus in general.
The Lambda Calculus: Its Syntax and Semantics.
Hendrik Pieter Barendregt.
Studies in Logic and the Foundations of Mathematics, Vol. 103.
1985. North-Holland, Amsterdam.
[The comprehensive, magisterial text on the lambda calculus.]
Types and Programming Languages
Benjamin Pierce
MIT Press, February 1, 2002
[TAPL is the standard text on types in programming languages, suitable for a young graduate student or an aggressive undergraduate. It is friendly and very readable. Alas, the chapter specifically covering the centrally important Hindley-Milner polymorphic type system is not the best part of the book. Pierce's text has a very nice secondary benefit: even if you (perversely) had no interest in types at all, you would have a hard time finding a better basic introduction to using small-step operational semantics to describe and reason about programming languages. You cannot consider yourself a serious student of programming languages without having read this text.]
Term Rewriting and All That.
Franz Baader and Tobias Nipkow.
Cambridge University Press, 1998.
[If you want to do serious technical work on the lambda calculus, the basic mathematics and conceptual vocabulary comes from the field of "term-rewriting systems." This book is a nice, self-contained text on the topic.]
At the risk of overloading you, for a rigorous text on the subject of lambda calculus in the context of programming languages, you would do well to consider Bob Harper's general text on PL, Practical Foundations for Programming, and Matthias Felleisen et al.'s book Semantics Engineering with PLT Redex. Don't think of the latter as an intro to Racket's Redex tool. The first part of the book is essentially Matthias as a grown professor tuning up his dissertation with the benefit of several decades of worth of follow-on context and hindsight: the intellectual framework within which much work on PL happens today. The back half of the text is a series of studies showing how to use this framework to model programming languages.
Competent PL people who work in functional languages know this stuff.
Lambda, the ultimate declarative.
Guy L. Steele Jr.
1976. AI Memo 379. Artificial Intelligence Laboratory,
Massachusetts Institute of Technology, Cambridge, Massachusetts.
[Steele's original paper articulating the CPS-as-IR thesis, which led directly to the Rabbit compiler.]
RABBIT: A Compiler for SCHEME.
Guy L. Steele Jr.
1978. Master’s thesis.
Artificial Intelligence Laboratory, Massachusetts Institute of
Technology, Cambridge, Massachusetts. Technical report AI-TR-474.
[Steele's seminal CPS-based compiler.]
ORBIT: An Optimizing Compiler for Scheme.
David Kranz.
Ph.D. dissertation, Yale University, February 1988. Research Report 632,
Department of Computer Science.
[The first seriously engineered, industrial-strength, highly-optimising CPS-based compiler. Designed and implemented by a team; Kranz's dissertation is the best single description of the compiler internals.]
Compilation by Program Transformation.
Richard Kelsey.
Ph.D. dissertation, Yale University, May 1989. Research Report 702,
Department of Computer Science.
[Kelsey's dissertation is an alternate CPS-based compiler that grew out of the T / Orbit system. Notable for staying with a lambda-based IR essentially all the way to the assembly level. Influenced the original (pre-Flint) SML/NJ compiler.]
Compiling with Continuations
Appel, Andrew
Cambridge University Press, 1991.
[Appel's book gives a comprehensive tour of the SML/NJ compiler developed by David MacQueen's research group at Bell Labs with Appel at nearby Princeton. The compiler was the most significant lambda-calculus compiler in the world during the 90's. It was influenced by Kelsey's "Transformational Compiler," CPS based, famously used no stack, and showed the optimisation opportunities afforded by static type systems when compiling functional programming languages.
The book describes a snapshot of the compiler as it existed in the early 90's. It was subsequently shifted to an explicitly typed intermediate representation, FLINT, which is quite different.]
Representing control: A study of the CPS transformation.
Olivier Danvy and Andrzej Filinski.
1992. Mathematical Structures in Computer Science 2, 4 (1992), 361–391.
https://doi.org/10.1017/S0960129500001535
No-Brainer CPS conversion (Functional Pearl).
Milo Davis, William Meehan, and Olin Shivers.
Proceedings of the ACM on Programming Languages (PACMPL),
1(ICFP):23:1-23:25, September, 2017.
[An ICFP "Functional Pearl" deriving a CPS converter that directly produces simplified results in linear time. Introduces the term "no-brainer redexes" (a new name for a generally understood concept), and develops the basic mathematics of the no-brainer reduction system. Skip the math and it serves as a standalone introduction to CPS conversion.]
Definitional interpreters for higher-order programming languages.
John C. Reynolds.
1972. In Proceedings of 25th National ACM Conference. 717–740.
Reprinted in LISP and Symbolic Computation, 11 363–397 (1998).
[Contains the insight that writing a meta-circular interpreter in CPS fixes the evaluation protocol (call-by-value vs. call-by-name). Contains the original insight driving the entire idea of defunctionalisation. In general, if you have a great idea in this space, it was probably covered by a Reynolds' paper that was written before you were born. Make your peace with this fact, then go read his work and see if there other good ideas you could mine out. ]
The discoveries of continuations.
John C. Reynolds.
LISP and Symbolic Computation 6, 3/4 (Dec. 1993), 233–247.
Special Issue on Continuations (Part I).
[See above, on John Reynolds.]
Compiling with Continuations, Continued
Andrew Kennedy
ICFP’07, October 1–3, 2007, Freiburg, Germany.
[A "modern" discussion of the use of CPS in a compiler, written by someone who understands both CPS and ANF well. Lovely, well-written paper.]
Bottom-up β-reduction: Uplinks and λ-DAGs
Olin Shivers and Mitchell Wand.
Fundamenta Informaticae, 103(1-4) pages 247-287, 2010, IOS Press.
[A simple data structure for terms in the λ-calculus that has a simple, fast algorithm for doing β-reduction without term blowup. The PDF is the freely-available tech-report from the University of Aarhus, not the cited journal article.]
Shrinking lambda expressions in linear time.
Andrew W. Appel and Trevor Jim.
Journal of Functional Programming, 7(5):515–540, 1997.
[Doing simple, local optimisations on lambda terms in a simple way. This is stuff that all functional-language compilers do, pulled out and presented in a nice way.]
Reasoning about programs in continuation-passing style.
Amr Sabry and Matthias Felleisen.
LISP and Symbolic Computation 6, 3/4 (Dec. 1993), 289–360.
Special Issue on Continuations (Part I).
[The title says "CPS," but the paper is really about Sabry & Felleisen's ANF intermediate representation.]
Data Flow Analysis of Computer Programs.
Matthew S. Hecht.
American Elsevier, New York, 1977.
[Out of print. Easy to read, comprehensive text on the foundations and applications of classic data-flow analysis.]
Closure conversion is the task of encoding environment structure in a higher-order language with explicit data structures --- that is, how do we create the "closures" that package up free-variable bindings at run time? How can we do it in the most lightweight way possible? This is the central task of compiling a call-by-value, higher-order functional programming language based on the λ-calculus.
Appel's book on the SML/NJ compiler (cited above) is an excellent front-to-back introduction to the task.
See the references below on defunctionalisation and monomorphisation for other excellent papers on the subject.
These are big hammers for dealing with higher-order languages. Note that you can even do monomorphisation in dynamically typed languages like Scheme; see Siskind's work on Stalin, below.
@inproceedings{DBLP:conf/esop/CejtinJW00,
author = {Henry Cejtin and
Suresh Jagannathan and
Stephen Weeks},
editor = {Gert Smolka},
title = {Flow-Directed Closure Conversion for Typed Languages},
booktitle = {Programming Languages and Systems, 9th European Symposium on Programming,
{ESOP} 2000, Held as Part of the European Joint Conferences on the
Theory and Practice of Software, {ETAPS} 2000, Berlin, Germany, March
25 - April 2, 2000, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {1782},
pages = {56--71},
publisher = {Springer},
year = {2000},
url = {https://doi.org/10.1007/3-540-46425-5\_4},
doi = {10.1007/3-540-46425-5\_4},
timestamp = {Tue, 14 May 2019 10:00:41 +0200},
biburl = {https://dblp.org/rec/conf/esop/CejtinJW00.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Jeffery Siskind, 2000
Siskind developed the same core set of ideas as Mlton for his highly optimising Scheme compiler, Stalin. I do not have a clear understanding of the idea-flow between the two projects -- although the Mlton and Stalin principals were all at NEC's research laboratory in the 90's, so there must have been some.
Pulls all the tricks together required for compiling to C.
@article{tarditi1992no,
title={No assembly required: Compiling Standard ML to C},
author={Tarditi, David and Lee, Peter and Acharya, Anurag},
journal={ACM Letters on Programming Languages and Systems (LOPLAS)},
volume={1},
number={2},
pages={161--177},
year={1992},
publisher={ACM New York, NY, USA}
}