CS 6983 Special Topics: The Technology of Lambda -*- Outline -*- Olin Shivers Homework 4: Closure conversion part I: analysis Due 2025/3/22 midnight by email Closure conversion is the heart of our compiler, where we incarnate higher-order functions (a lovely, language-level fantasy) as closures made of code, memory blocks and registers (a gritty, machine reality). In this assignment, you will do the analysis to compute the information needed for the actual conversion. The results of your analysis will be recorded in annotations on the code; in the next assignment, you will write the actual conversion as a simple tree-walk of the AST that is driven by these annotations. This all has two consequences: - This is the difficult half of closure conversion -- figuring out what we know about the program and making all the important decisions. The actual transform will be fairly straightforward. So be prepared for some work, both conceptual and hacking. - Our program terms are going to get quite annotation-heavy. ------------------------------------------------------------------------------- * 0. Preliminaries ------------------ ** Sets everywhere ================== This assignment does a fair amount of set manipulation -- union, subtraction, membership, subset. You may want to track down or implement a basic set-algebra package to use. SRFI-1 has a set of functions that might be useful; you may find more sophisticated packages if you look around. Clearly, you can get big speedups by representing your sets as bit vectors, but this requires more engineering than is merited in your prototype compiler. I recommend you keep it simple. ** Lambdas, LETs, CONTs and FUNs ================================ When we say "lambda," we mean both FUN and CONT terms in a program. When we say a variable is "LET-bound" to some argument right-hand side (RHS), we mean this as a convenient short-hand way to speak of the corresponding beta redex. 
Bear in mind that there are two such redexes: one where the binding lambda is
an open FUN term, and one where the binding lambda is an open CONT term:

    ((cont (f) ... f ... f ...)     ; (let ((f F-RHS))
     F-RHS)                         ;   ... f ... f ...)

    ((fun (f k)                     ; (let ((f F-RHS)
       ... f ... k ... f ... k)     ;       (k K-RHS))
     F-RHS                          ;   ... f ... k
     K-RHS)                         ;   ... f ... k)

Here F-RHS and K-RHS stand for the right-hand sides bound to F and K.
(Remember that only FUN forms can bind a continuation. CONTs don't take
continuation arguments.)

** Scope and free vars
======================
Before you start this assignment, reflect for a moment on the relationship
between the set of variables in-scope at program point p, InScope_p, and the
set of variables in the free-variable set at program point p, FreeVars_p: the
latter is a subset of the former.

That is, there's a producer/consumer relation between InScope_p and
FreeVars_p: InScope_p is the set of variables the program *makes available* to
point p, while FreeVars_p is the subset of these variables that p *actually
uses*. The *language semantics* say the term at point p can, if it wishes,
access anything in InScope_p. But the *compiler* is only obligated to make the
bindings of FreeVars_p available to p -- which might be quite a savings.

For example, suppose we have a pure combinator, say, the function-doubler

    (fun (f x k) (f x (cont (temp) (f temp k))))

This piece of code might appear in a *context* with a thousand variables in
scope. But it doesn't *need* any of them.

Keep this InScope / FreeVars distinction in mind when you get to the
extended-variables analysis in step 5 of this assignment.

** Annotating forms
===================
You'll be collecting, using, and sprinkling lots of information on various
pieces of your program. That, in fact, is the entire point of this assignment.
If we were programming in OCaml or SML, we'd simply make these annotations
mutable fields in the records that we use to represent nodes in the CPS AST.
But s-expressions are nice, because they can be easily pretty printed.
(If you don't know how to "pretty print" an s-expression, look it up in the
Racket documentation.)

This compiler has been designed so that we only need to annotate binding forms
-- things that bind variables: FUN, CONT and LETREC forms. To accommodate our
wish to have a (reasonably) readable IR, we'll collect all the annotations for
a term together in a sublist marked with an "@". As you proceed through the
analyses in this assignment, you'll keep accumulating additional annotations
to add to the grammar you use in nanopass for annotations. Here are some
examples:

    (fun (x k)
      ;; Initial form after parameters is the annotations set:
      (@ (kind first-order)     ; This FUN is a first-order lambda.
         (label fun38)          ; Its global label is FUN38.
         (free-vars b y))       ; It contains free references to B & Y.
      (if b (k 0)               ; Here's the body
          ($+ x y k)))          ; of the FUN.

    (letrec ((f (fun (n k1) ...))              ; bindings
             (g (fun (a b k2) ... z ...)))
      (@ (label lr81)                          ; annotations
         (free-vars x z ktop))
      (f x ktop))                              ; body

You'll need to engineer up ways to query, delete, add, and update the
annotations on a term.

To convert back to your pre-analysis form, you just drop the (@ ...) subforms
everywhere.

-------------------------------------------------------------------------------
* 1. Static, global names for binding forms
-------------------------------------------
Write a pass that assigns a globally unique name (called its "label") to every
lambda and LETREC term in your program. These names will be useful later when
we want to globally refer to different lambdas. Labels are drawn from a flat,
global namespace that is completely separate from the lexically scoped
namespace used for source variables; this is why a code label must be unique
across the entire program.

The label of a lambda or a LETREC is given by its LABEL annotation, e.g.

    (fun (x k)
      (@ ... (label fun37) ...)   ; This FUN is labelled FUN37.
      ($+ x x k))

In our AST, a label is a symbol.
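The labelling pass described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the required design: it treats terms as plain s-expressions rather than nanopass records, it assumes no source variable is named FUN, CONT or LETREC, and the names LABEL-PROGRAM and ANNOT-SET, along with the prefixes "fun", "c" and "lr", are made up for the example.

```racket
#lang racket
;; Sketch only. Assumed representation: plain s-expressions, where FUN /
;; CONT / LETREC may or may not already carry an (@ ...) sublist right
;; after their parameter or binding list.

;; Add or replace one annotation, e.g. (annot-set '() 'label '(fun1)).
(define (annot-set annots key vals)
  (append (filter (lambda (a) (not (eq? (car a) key))) annots)
          (list (cons key vals))))

;; Return a copy of T in which every FUN, CONT and LETREC carries a
;; fresh (label ...) annotation drawn from one program-wide counter.
(define (label-program t)
  (define n 0)
  (define (fresh prefix)
    (set! n (+ n 1))
    (string->symbol (format "~a~a" prefix n)))
  (define (walk t)
    (match t
      [`(,(and head (or 'fun 'cont)) ,params (@ ,annots ...) ,body)
       (let ([lab (fresh (if (eq? head 'fun) "fun" "c"))])
         `(,head ,params
                 (@ ,@(annot-set annots 'label (list lab)))
                 ,(walk body)))]
      [`(,(and head (or 'fun 'cont)) ,params ,body)
       (walk `(,head ,params (@) ,body))]       ; no annotations yet
      [`(letrec ,bindings (@ ,annots ...) ,body)
       (let ([lab (fresh "lr")])
         `(letrec ,(for/list ([b bindings])
                     (list (first b) (walk (second b))))
            (@ ,@(annot-set annots 'label (list lab)))
            ,(walk body)))]
      [`(letrec ,bindings ,body)
       (walk `(letrec ,bindings (@) ,body))]
      [(? list?) (map walk t)]                  ; calls, IFs, beta redexes
      [_ t]))                                   ; variables and constants
  (walk t))
```

Because the counter is shared by the whole walk, two binding forms can never receive the same label, whatever their lexical nesting.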
------------------------------------------------------------------------------- * 2. Free-variable analysis --------------------------- Write an analysis that annotates each lambda with its set of free variables, e.g. (fun (x k) (@ (free-vars y z)) (if y ($+ x y k) (k z))) You could, in fact, interleave this with the next two passes (kind classification and first-order CFA)... but you might want to keep it a separate pass just to keep the details separate to make them easier to code and debug. And it's a good warmup analysis: a simple tree walk, pretty easy. ------------------------------------------------------------------------------- * 3. Classification of lambdas ------------------------------ Write a pass in your compiler that will analyse a CPS term and classify (by annotation) every lambda (FUN or CONT) as one of the following three kinds: - Open An "open" lambda is a term that will get executed exactly where it appears in the code, so that + Its evaluation environment and its application environment match exactly. + It is called from exactly one, statically known, call site. + That call site calls no other lambda. There are two kinds of open lambdas: 1. A lambda appearing as the function part of a beta redex, e.g. ((cont (f) ... f ... f ...) ; This CONT is an open lambda. (fun (x k) ...)) ; This FUN is not. Such lambdas are basically LETs. Invariants: - Any such LET binds its LHS variable to a lambda RHS. Why? If the variable was bound to a constant or variable RHS, your simplifier would have reduced it away. - The variable bound by any such LET has more than one reference. If it only had 0 or 1 reference, your simplifier would have reduced the redex away. - So these LETs bind a *lambda* to a variable that has *multiple references*. In other words, this kind of open lambda always binds/names a "join point" -- code to which control is transferred from two or more spots. 
These join-point LETs allow us to encode a control-flow graph that is a DAG;
without them we could only code up a tree, using IFs as code splits. Want
cycles in the CFG, too? Then we have to use LETRECs.

Here's another way to look at these "join" LETs. An open lambda doesn't really
do any computation. It just introduces a local name for a local thing. Here we
are exploiting the property of names to give us the ability to refer to
something multiple times: just give it a name, and use the name twice.

  2. A CONT form appearing as the continuation of a primop call, e.g.

         ($+ x y (cont (z) ...))

- First-order
  A "first-order" lambda is one that is LET- or LETREC-bound to a variable
  that is only referenced in the *function position* of call forms. E.g.,

      ((cont (sqr) (sqr 3 (cont (a) (sqr 5 (cont (b) ($+ a b k))))))
       (fun (x k') ($* x x k')))  ; This FUN is first-order & bound to SQR.

  Given this definition, a first-order lambda has these properties:
  + It is only called from statically known call sites.
  + These call sites only call this one lambda.
  + All calls to this lambda occur in a lexical context that sees all
    variables that are in scope for the lambda, with the same bound values.

  A first-order lambda is only bound to *one* variable, ever, and that one
  variable is only bound to that lambda. So we can think of every first-order
  function as having a unique name -- that one variable to which it is bound.
  They are paired together.

  A first-order lambda has the nice property that every call to it is
  statically known and *only* calls that lambda, so we can tailor these calls
  *specifically* for that one lambda. This is pretty liberating, as we'll see.

- Closed
  A "closed" lambda is anything not open or first-order. Essentially, it is a
  lambda that will need to have a closure created each time it is evaluated.
If it's a FUN, the closure will be allocated on the heap; if it's a CONT, the closure will be allocated in the stack frame that is the current, top frame at the time the closure is created. What's good about a closed lambda is that evaluating such a lambda makes a real *value* that we can pass around the program, put onto lists, throw into hash tables, etc. (For comparison, look at the rules for first-order lambdas -- they cannot be used / are never used as actual values.) What's bad is the cost of this extra utility: (1) we have to allocate storage for the closure at run time, every time the program evaluates the lambda, and (2) we will have to call these functions using a general-purpose, globally fixed, heavyweight calling protocol. (Could we do better? If we had, say, higher-order flow analysis and built lambda/call webs? Stay tuned...) Why are "first-order" lambdas called first-order? Because they correspond to the simple skeleton of code that we could write in a first-order programming language like Pascal or Algol. You can name functions, but only use those names as the function part of a function call -- which provides lots of useful constraints we can exploit when compiling the code. First-order and open lambdas, taken together, give us the control structure that C, Pascal, and Fortran compilers call the "control-flow graph" (CFG) of a given function. That is, in CPS, loops, conditional splits, joins and straight-line basic blocks are all provided by open and first-order lambdas: - Basic blocks are sequences of primops chained together by open continuations. - Loops are made with LETREC-bound first-order functions. - Join points are made with LET-bound first-order continuations. - OK, we cheated on conditional splits in our IR, with a core IF syntax form, to keep things simple, but it can be easily managed with multi-continuation conditional primops. 
Take a moment and reflect on all the different things we are saying when we say a lambda is first-order: - Control: It's only called in tightly controlled ways, from compiler-known places. - Environment: There's a very strong relationship between the environment where it is evaluated and the one where it is invoked. - Data: No closure needed. Its procedures are *not run-time data*. We've tightly constrained *all three* fundamental program structures involved with this lambda. Which just goes to show you how powerful the general mechanism of lambda is -- because the lambda that the programmer gets is a lot more than just first-order lambdas. You record your classification by adding a KIND annotation to the lambda. For example, (fun (x k) ; Evaluating this lambda will allocate a (@ (kind closed)) ; closure on the heap, for COEFF & the code. ($* x coeff k)) (cont (a) ; A return point for a function call. (@ (kind closed)) ; Free vars M-1 & K will be kept on the stack ($* m-1 a k)) ; frame of the FUN to which we belong. (cont (x) ; We jump here from two or more (@ (kind first-order)) ; known places: a CFG join point. ($+ 1 x k)) Additionally, when you find a first-order lambda L that is LET- or LETREC-bound to some variable V -- so that V is either the parameter of an open lambda or LETREC binder B -- your analysis should mark V as being bound to a first-order lambda by putting it in a FIRST-ORDER-VARS annotation on B. For example: (fun (j k) ; Slow even test for natural numbers. (@ (kind closed)) (letrec ((even? (fun (n k1) ; Only called => first-order ($zero? n (cont (b) (if b (k1 #t) ($- n 1 (cont (m) (odd? m k1)))))))) (odd? (fun (p k2) ; Only called => first-order ($zero? p (cont (b) (if b (k2 #f) ($- p 1 (cont (q) (even? q k2))))))))) (@ (first-order-vars even? odd?)) ; Both of this LETREC's (even? j k)) ; vars are 1st order. 
This section is described separately from the following control-flow analysis in the next section, but you should combine the two into a single unified pass. They are really a single algorithm that produces multiple distinct (but related) annotations. ------------------------------------------------------------------------------- * 4. First-order control-flow analysis -------------------------------------- For every first-order lambda, find all the places it could be called. If we had assigned labels to all the call sites in the program, we could simply annotate each first-order lambda with the labels of those calls. Instead, we'll work in terms of *binder form* labels, as follows. If we start with any call expression in the program and start moving up through the AST, we will ascend through a sequence of IF terms until we eventually arrive at a FUN, CONT or LETREC -- our three binding forms. These binding forms *are* labelled. So we'll use these. Let's call the innermost binding form that contains a call C its "parent binder." That's the view looking up the AST. Another way to put this is the downward view: the CPS grammar says that the body of a binding form is a binary tree of IF's whose leaves are either (1) LETREC forms or (2) calls. (That tree could, of course, be a simple leaf, no IF.) (Note that if we'd gone ahead and represented conditionals using calls to primops that take multiple continuations, we could say the above more simply: the *immediate* parent of every call expression is either a lambda or a LETREC. We wouldn't have to deal with the tree-of-ifs structure.) First, write a pass that will add a (@ ... (parent-binder ) ...) annotation to every binding form except the root of the AST. This annotation gives, for binding form B, the label of the innermost binding form that encloses B. We could use these links to hop up the AST from binding form to binding form, all the way up to the root of the tree. 
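A minimal sketch of the PARENT-BINDER pass just described, under the same kind of assumptions as before: terms are plain s-expressions, every binding form already has an (@ ...) sublist containing a LABEL annotation, and no source variable is named FUN, CONT or LETREC. The names ADD-PARENTS and ANNOT-REF are hypothetical. The walk simply carries the label of the innermost enclosing binder down the tree.

```racket
#lang racket
;; Sketch only. Assumes every FUN / CONT / LETREC already has the shape
;; (head params-or-bindings (@ ... (label L) ...) body).

;; Look up an annotation's values, or #f if the annotation is absent.
(define (annot-ref annots key)
  (cond [(assq key annots) => cdr]
        [else #f]))

;; PARENT is the label of the innermost enclosing binder, or #f at the
;; root of the AST, which therefore gets no PARENT-BINDER annotation.
(define (add-parents t [parent #f])
  (define (extend annots)
    (if parent
        (append annots `((parent-binder ,parent)))
        annots))
  (match t
    [`(,(and head (or 'fun 'cont)) ,params (@ ,annots ...) ,body)
     (let ([me (first (annot-ref annots 'label))])
       `(,head ,params (@ ,@(extend annots)) ,(add-parents body me)))]
    [`(letrec ,bindings (@ ,annots ...) ,body)
     (let ([me (first (annot-ref annots 'label))])
       ;; RHS lambdas and the body all see the LETREC as their parent.
       `(letrec ,(for/list ([b bindings])
                   (list (first b) (add-parents (second b) me)))
          (@ ,@(extend annots))
          ,(add-parents body me)))]
    [(? list?) (map (lambda (s) (add-parents s parent)) t)]
    [_ t]))
```

Both halves of a beta redex sitting in the body of binder B get B as their parent, since B is the innermost binding form enclosing either of them.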
The parent binder of any lambda that is a RHS in a set of LETREC bindings is that LETREC. Now we can implement a pass to link first-order lambdas to their callers: If we want to record that lambda L is called from some call C, we add the label of C's *parent binder* to L's CALLERS list. As we'll see, that will be enough for our future analytic needs. For example, suppose we want to record that the lambda with label FUN92 is called from the (F 18 K) call in this code: (cont (x) (@ (kind closed) (label c9) (parent-binder lr22)) (f 18 k)) ; This call jumps to the lambda with label FUN92. Then we mark lambda FUN92 this way: (fun (y k2) (@ (kind first-order) (label fun92) ; The "name" of this FUN (callers c9 fun31 lr18)) ; Fun92 is called from 3 places. ($- 0 y k2)) ; The body of the lambda The C9 on FUN92's CALLERS annotation says that FUN92 is called from the body of CONT C9. (Note: the CALLERS list for every first-order lambda should have more than one entry. Otherwise, your simplifier would have reduced the lambda away.) We only annotate first-order lambdas with their CALLERS set. This is because: - That's all we're going to need for the compiler as it's currently structured. - It's easy to compute the CALLERS set of first-order lambdas -- a simple, linear-time, one-pass walk of the AST. For closure lambdas, we'd need higher-order flow analysis. (Stay tuned.) So sticking to first-order lambdas is the sweet spot. You can combine your open/first-order/closed KIND analysis and your CALLERS analysis into one unified pass, which works as follows. First, note that a first-order lambda L is always paired with some source variable V; V is bound to L by either a LET or a LETREC form. In the discussion below, we'll refer both to V and L as "first-order," interchangeably: a "first-order variable" is one that is bound to a first-order lambda. 
The analysis walks the AST recursively, carrying a symbol table (that is, a dictionary) with an entry for every in-scope source variable that *might be* bound to a first-order lambda. Again, the symbol table doesn't have entries for *all* the variables in scope, just the ones that are still currently first-order candidates -- the ones that haven't yet been discovered *not* to be first-order. Once we find some reason a variable can't be a first-order variable, we remove it from the "candidates" table and keep walking the tree. This means the "current symbol table" is passed down *and* up in the tree as the analysis walks the AST: it's threaded through the tree walk. - When the analysis finds a LET (that is, a beta redex) or a LETREC, it adds the bound variables to the current table as new candidates, with empty caller lists, and recurs down into the body of the binding form. In the case of a LETREC, we also walk all the RHS lambdas that it binds. - If the analysis chances upon a reference to a candidate variable V (that is, one with an entry in the current symbol table), and that V reference is in the *argument* position of a call... then we have just discovered that V is not a first-order variable. So remove it from the candidates symbol table, and keep walking the tree. - But if a candidate variable V appears in the *function* position of a call... then we have just discovered another call site for V (a direct jump to its bound lambda, that is). So add the call to V's entry in the symbol table (that is, add the label of the call's "parent binder" in the AST to the symbol table), and keep walking the tree. - When the analysis is done walking a binding form (FUN, CONT, or LETREC), it removes the variables bound by that form from the symbol table before returning upward in its tree walk. - When the analysis has walked a LET or LETREC, before returning to the parent of the LET / LETREC, check the symbol table. 
Any variables bound by this LET or LETREC that still have entries in the
symbol table are first-order! Grab the callers list from the symbol-table
entry, and annotate the associated lambda as (a) first-order with (b) the
given callers set. Mark the variable as first-order by adding it to the
FIRST-ORDER-VARS annotation of its binder (which is an open lambda in the
beta-redex representing a LET, or a LETREC).

- Handling IF should be straightforward.

- Think through how you walk the entirety of a LETREC -- you must examine not
  just its body, but all of its RHS lambdas.

- There is a minor special case for continuations. Suppose a CONT form C is
  let-bound to a variable V. Again, the general rule is that
  + directly calling the continuation, as in (V ...), doesn't rule out the
    V / C pair as 1st-order, but
  + using V as an *argument* of a call *does* rule it out -- C has to be
    marked as a "closure" form.
  + However... using V as the argument of a *primop* call is considered a
    direct call: ($+ x 1 V). Such a use of V does *not* force C to be a
    "closure" form.
  + No primops take user-function arguments, so we don't have to worry about
    this case. It only arises for continuation arguments.

Note that you can use a single mutable dictionary (e.g., a hash table) instead
of threading a persistent dictionary (e.g., an alist or a red-black tree)
through the recursion. Either way works fine.

-------------------------------------------------------------------------------
* 5. Environment analysis
-------------------------
The point of closure conversion is deciding how environment structure will be
implemented. This pass is the heart of that work. In this analysis, we
determine the key issue of what the total variable "needs" of a lambda are, an
extension of the idea of "free variables."
The big idea is that when we closure-convert our program, we will transform
the code so that all first-order lambdas are passed their free variables from
their call sites as extra parameters -- which means that, after closure
conversion, no first-order lambda will have any free variables. Closure
lambdas (after conversion) won't have free variables, either -- all their
free-variable references will be turned into loads from their closure record,
which will be passed to these lambdas by the caller. Only open lambdas will
have free variables after conversion, and they don't really count, being
inlined where they appear.

When we're done with the (forthcoming) transform, every variable is just a
register. (Well, a virtual register -- we'll pack these down to physical
registers later.)

(Wait, won't passing all these extra parameters around... and around... from
call to call to call involve a lot of work? No. Again,
post-closure-conversion, all these variables are just registers. So "passing
them as parameters" in first-order function calls is just a matter of
register/register copies. These "copies" are really just coalescing requests
to the register allocator.)

But if we unpack this idea a little bit, we discover that a given lambda can
require access to more of its in-scope variables than just the ones in its
free-variable set. This is due to the way we're going to implement first-order
lambdas when we convert them.

To motivate the idea of "total needs," let's consider an example, where we
call a first-order lambda that itself calls another first-order lambda.
Suppose call-site C1 calls first-order lambda L1, and inside L1's body, there
is a call C2 to *another* first-order lambda L2:

    C1: (f x y k1)          ; Call C1 jumps to F, which is let-bound to
                            ; 1st-order L1.

    L1: (fun (a b k2)       ; First-order lambda L1 has free vars G, C, Z
          (if z             ; Here's a free ref to Z.
           C2:(g b c k2)    ; Call C2 jumps to G (which was let-bound to L2)
              ...))

    L2: (fun (p q r k4)     ; Here's first-order lambda L2, which
          ... s ... w ...)  ; contains references to free vars S & W.

Since L1 is a first-order function, we are going to transform it and its call
sites (when we do closure conversion), so that it is passed its free variables
(G, C, Z) as extra parameters.

Except we don't have to pass G, right? G is a first-order function, so we
won't need to have G passed around as a *value*. We'll just compile its use in
call C2:(G B C K2) as a *direct* jump to a *known* code location.

So, we just need for C1 to pass L1 the extra parameters C & Z, right?

Not so fast. L1 needs more -- because L1 has to pass to L2 (from call C2,
inside L1) the free variables that L2 needs. That's the contract -- L2 is a
first-order lambda, so it is going to be passed its free variables S & W from
its call site, C2. But that means that L1 needs access to S & W itself, so it
can, in turn, pass them along to L2.

Bottom line: C1 needs to pass to L1 the variables C & Z (as L1 needs these)
*and* the variables S & W (so that L1 will have them to pass on to L2).

Put another way: we don't want to pass *every* in-scope variable visible at C1
to L1. That is... we *could* do it, but why bother? L1 probably doesn't need
all of them. We just want to pass to it the variable bindings that it needs:
its free variables, which are a subset of its in-scope variables. That's
easier and less work.

But this subset of the in-scope variables that C1 needs to supply to L1
*transitively* includes the needs of all the first-order functions L1 calls,
even though L1 doesn't use them directly. So we need the transitive closure of
these "what we need" sets. We want some way of saying that the call site
requires / uses / needs these extra variables.

That's the EXTENDED set for a lambda -- it's a subset of its in-scope
variables... but possibly a super-set of its free variables.

Every lambda and LETREC L is annotated with a set of variables called its
"extended free variables."
These are (a) its free variables, plus (b) the variables that are *also* needed by L because they are needed by some first-order lambda L' that is *called from* L. Because LETREC lets us have cycles in our environment structure, these sets are mutually dependent; we want to find the smallest sets that satisfy these dependencies. That is to say, we are looking for the *least fixed point* of some generating function. To summarise: Binder L (a lambda or LETREC) needs - all the variables in its free-variable set -- except those bound to first-order lambdas -- and - all the variables needed by all the first-order lambdas called by L. (See how "needs" occurs recursively in this definition?) When we do simple free-var analysis, the rules of lexical scope say that sets propagate *upwards* in the AST. For example, the free-var sets of the test, consequent and alternate sub-terms of an IF form propagate up into the free-var set of the IF itself. Likewise, the free-var set of some argument in a call form propagates upward to the free-var set of the call itself. But when we consider the extra needs of a first-order lambda L, however, growth in this set of variables propagates from callee L *back* to each call C that calls L. When variable needs propagate from some first-order lambda L back to a call C that is the body of binder B, that increases B's "needs." (That is, it increases B's EXTENDED set.) - If B is a LETREC, these extra needs propagate up into B's parent LETREC or lambda in the AST. (How convenient that B has a label annotation allowing you to move up in the AST: this is why we have the PARENT-BINDER annotation.) - If B is a closure or open lambda, again, these extra needs likewise propagate *up* into B's parent-binder in the AST. - But if B is a first-order lambda... then these extra needs propagate *back* to each call site C that calls B, and from there up into C's containing binder. 
(Again, you have just the right annotation to help you navigate along this
backwards control-flow path, right? These are just the binders in B's CALLERS
annotation.)

Notice that in the case of an open lambda, propagating to its lexical parent
and its caller is exactly the same thing, so the two different propagations
collapse together: the "back" link (control) is the same as the "up" link
(environment).

If you think about it, you'll realise that this "extended" free variable
analysis is the same backwards flow analysis and the same problem as
determining live-variable sets in the CFG of, say, a single function in a C
compiler. This is where we address that issue, in our lambda-based compiler.

Your job here is to write an analysis that will take a program and determine
the EXTENDED annotation for every binding form (lambda or LETREC) in the
program. This analysis will hop around the AST in non-tree-walking ways,
following callee/caller control links to propagate information as EXTENDED
sets grow.

So, your analysis works like this:

- Make a dictionary mapping labels to the AST nodes they name. This will allow
  you to hop around the program following various links. (In a more highly
  engineered compiler, you'd have real pointers in the AST nodes for these
  links.)

- Make a set of all the first-order variables in the program (that is, all the
  variables that are bound to first-order lambdas).

- Make another dictionary mapping the label of a binding form (lambda or
  LETREC) to the extended free-variable set for that form. The table should
  have an entry for every binding form in the program; initialise each entry
  to the simple free variable set for that form, minus the variables that are
  first-order.

- Propagate information around the program until you reach a fixed point. This
  is *not* a simple tree-walk traversal of the AST. You can do this with a
  worklist algorithm, or with an equivalent recursive one. Every binder in
  your table should go on the initial worklist.
(It's called a "work list," but it's really a *set* -- in this case, a set of labels for binders whose sets have grown and therefore for whom we need to propagate the possible consequences of that growth.) - Walk the AST installing the final extended free-variable sets for every binding form as an annotation. ------------------------------------------------------------------------------- * 6. Frame layout ----------------- The last thing we have to do before transforming our code is to lay out our stack frames. Our implementation will use a stack in more or less the same way that C code does. Depending on how you look at it, a stack frame is either - Where we close CONT lambdas (that is, put the values of their free variables), or - where we save variables that are live across function calls. (It's the same thing.) In this analysis, you will lay out the stack frames. These decisions will be recorded in some new annotations. Let's begin by defining our stack management policy. We say a CONT form "belongs" to the innermost first-order or closure user-lambda FUN that contains it. (Open FUNs don't count, as we don't allocate a new stack frame on entry to one.) Our model is that - Whenever control enters a user lambda (a FUN), we allocate (push) a fresh stack frame: sp -= . - Whenever a CONT closure is created, it is allocated on / inside that user-lambda's stack frame -- which is the one at the top of the stack. That is, we pick some unused slots on the current stack frame and save the CONT's free variables there; the CONT code is then compiled so that its free-variable refs are rendered as loads from the stack. It's the job of the CONT's caller (the function that returns to this continuation, that is) to pop the stack back to this frame before jumping to the CONT's code. Just as a heap-closure FUN loads its free variables from its closure record, the stack-closed CONT loads its free variables from the stack frame. 
- When a user lambda returns (whatever that means, precisely -- see below),
  its frame is popped: sp += .

What precisely do we mean by "when a user lambda returns"? We mean one of
three possibilities. Suppose we're talking about user lambda
    L = (fun (x k) ...)
and L is a frame-allocating lambda (that is, 1st-order or closure, not open).
Then we pop L's frame when

1. The code does a simple return, by which we mean, when it calls K:
       (K 36)         ; Return 36 to L's caller.

2. The code does a tail call, of which there are two kinds:

   2a. A user-function call with continuation K:
           (F 7 K)    ; Let F do the work -- and return answer to L's caller.

       This gets at the idea that a tail-call is a pop-then-call -- We no
       longer need L's frame, and F should just return to L's caller, K.
       (This exploits a subtle invariant: K's frame is always the one *just
       before* the current L frame on the stack, so popping the current
       frame will get us back to K's frame.)

   2b. A primop call with continuation K:
           ($* Y 3 K) ; Return Y*3 to L's caller.

       This is simply: do some primitive piece of work (multiply, in this
       example), and *then* return to K. So it's a return.

The sign of a return is L's continuation parameter. When we talk about buried
treasure, X marks the spot. But when it comes to returning and popping a
frame, it's not X: K marks the spot.

Again, the big idea is that when control enters a (closure or 1st-order) FUN,
its prelude code bumps the stack *once* and allocates a frame large enough to
serve the closure needs of *all* the closure continuations that "belong" to
this user lambda.

Note that the CONT forms belonging to user lambda L make a tree rooted at L,
linked by PARENT-BINDER up-link annotations. To find the FUN that "owns" CONT
C, just follow these up-links through intermediate containing LETREC and CONT
forms until we get to a FUN form.

How big is this frame? Your frame analysis will leave a FRAME-SIZE annotation
on the FUN term to say:
    (fun (x y k)
      (@ ... (frame-size 5) ...)   ; Need 5 slots (slots, not bytes)
      )

We also have a new annotation for closure CONT forms, that says what its
incremental needs are, the FRAME+ annotation, which looks like this:
    (cont (x y)
      (@ ... (frame+ () ...) ...)
      )

The FRAME+ annotation specifies the variables that should be added to the
current stack frame for this continuation's needs when it runs. Put another
way: it is the set of (extended) free variables this continuation needs in
its closure (i.e., on the current stack frame) that haven't *already* been
put on the frame by some previous CONT form appearing in-between this one and
the user-lambda FUN to which this one belongs. Let's call that chain of
nested CONT forms our "predecessor" CONTs: it is the case that if control has
gotten *here*, where we are evaluating this CONT form, then we've previously
evaluated all of our predecessors, so all of the variables *they* put on the
stack are *still there*. We don't need to save these variables a second time.
We just need to save our *additional* needs.

Let FP (for "frame plus") be the set of variables in a CONT's FRAME+
annotation. Let XF be the set of variables in a CONT's EXTENDED free-variable
set. Our CONT is going to need all the variables in set XF saved on the stack
-- but all the variables in set XF - FP are *already there*. We only need to
save the variables in FP.

We can find the FRAME+ vars for a continuation with a tree walk. As we walk
the AST top-down, we'll keep track of CF, the "current frame" layout, which
says what variables are saved on the stack... and where. So if
    CF = {x |-> 2, z |-> 5},
then the variable X is at slot #2, and Z is at slot #5 on the stack frame --
and slots 0, 1, 3, & 4 are currently unused.

- A stack slot "var |-> num" dies and is removed from the stack map CF when
  we descend into a subtree of the AST that has no free reference to var. We
  no longer need the variable's binding, so this reclaims the slot for
  further use.

- When we recur down into a 1st-order or closure FUN, we reset the frame map
  to the empty map {} -- we're starting a fresh frame, which will be laid out
  as the compiler walks the body of the FUN. (Open FUNs, by contrast, have no
  effect on the state of the stack.)

- When we descend into a closure continuation C = (CONT ...), with frame map
  CF, we strip out any entries in CF not in C's extended free-var set -- as
  described above, those entries are no longer needed, so we can reuse their
  stack slots.

  Then we figure out which of C's free variables need to be added to the
  stack:
      XF - Domain(CF)
  That is, the FRAME+ set is all the variables C will need on the stack when
  execution eventually returns to it (that is, XF)... except for the
  variables *already* on the stack at closure time (that is, Domain(CF)).
  That's what the program will need to add to the stack when C is evaluated.
  These are the variables that you'll put in C's FRAME+ annotation.

  Of course, every variable in the FRAME+ set will also need to be assigned a
  free slot in the current stack frame. We'll just use the lowest slots (the
  smallest slot numbers) that are unused in CF.

  Put another way, for a given CONT form C, that occurs at a program point
  with current frame map CF, to get the frame map for C's body,
      Remove: Domain(CF) - XF    <-- No longer needed
      Add:    XF - Domain(CF)    <-- This is C's FRAME+ set.

- Then we recur into C's body, using the updated frame map.

- As the analysis returns up through its tree walk of the AST, it passes back
  the maximum frame size needed while laying out the stack frame in a child
  of the current syntax node. The max of these high-water marks is the
  high-water mark for the current node.

  This is how we determine the FRAME-SIZE annotation for a given user lambda
  L: walk its body starting with a fresh, empty frame map CF, and use the
  high water mark returned by the analysis for L's body.

- For purposes of eliminating "fence-post" errors, let's be clear about how
  we count slots, frame sizes and high-water marks. There are different
  conventions possible, but here's a consistent one:

  - Number stack slots starting with slot 0.

  - FRAME-SIZE is a *size*. A frame size of 0 means no frame needed. A frame
    size of 2 means two slots: slots #0 and #1.

  - So use the same convention for the high-water mark: size, not slot index.
    Thus stack map {x |-> 1, z |-> 4} should produce a high-water value of 5,
    not 4: you need a five-slot frame for this layout
        [- | x | - | - | z]
         0   1   2   3   4
    where - marks the three free/unused/dead slots.

A small additional complexity: multiple variables bound by the same LETREC to
closure lambdas (as opposed to first-order lambdas, which don't actually turn
into values at run-time) only need *one* stack slot. (Because we only save
one copy of their shared closure record on the stack. When we actually have a
*reference* to any of these variables, we'll add the right offset to the
record at the reference point.) You handle this by assigning these variables
the *same* slot in the FRAME+ stack map. For example, in this map
    (frame+ (x 2) (f 4) (y 5) (g 4))
F and G are siblings bound in the same LETREC; what we'll store at slot #4 in
the stack at run time will be a pointer to the shared closure record that was
created for F & G. Whenever we need the actual value for F or G (to pass as
an argument to another function, or call), we'll load the address of this
record off the stack into a register and then immediately add the constant
offset to it that's needed to make the address point to the slot in the
closure record where F or G's code is stored.

To handle this sharing, you'll have to adjust the dictionaries your analysis
uses for the CF current-frame maps accordingly.

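To make the CONT step of the walk concrete, here is a minimal Scheme sketch of
the frame-map update and the high-water computation. It rests on assumptions
that are mine, not the assignment's: frame maps are association lists of
(var slot) entries, variable sets are plain lists of symbols, and the helper
names (FRAME-RESTRICT, UNSAVED-VARS, SLOT-USED?, ASSIGN-SLOTS, ENTER-CONT,
HIGH-WATER) are invented for illustration. It also ignores the LETREC
slot-sharing wrinkle just described.

```scheme
;; Hypothetical helpers -- a sketch, not the required interface.
;; A frame map CF is an alist of (var slot) entries, e.g. ((x 2) (z 5)).
;; A variable set XF is a list of symbols, e.g. (x q w).

;; Remove step: drop the entries of CF whose variable is not in XF.
(define (frame-restrict cf xf)
  (cond ((null? cf) '())
        ((memq (car (car cf)) xf)
         (cons (car cf) (frame-restrict (cdr cf) xf)))
        (else (frame-restrict (cdr cf) xf))))

;; XF - Domain(CF): the variables of XF that have no entry in CF.
(define (unsaved-vars xf cf)
  (cond ((null? xf) '())
        ((assq (car xf) cf) (unsaved-vars (cdr xf) cf))
        (else (cons (car xf) (unsaved-vars (cdr xf) cf)))))

;; Does some entry of CF occupy SLOT?
(define (slot-used? slot cf)
  (cond ((null? cf) #f)
        ((= slot (cadr (car cf))) #t)
        (else (slot-used? slot (cdr cf)))))

;; Give each variable in VARS the lowest slot number unused in CF.
;; Returns the new (var slot) entries, in the order of VARS.
(define (assign-slots vars cf)
  (let loop ((vars vars) (slot 0) (new '()))
    (cond ((null? vars) (reverse new))
          ((slot-used? slot cf) (loop vars (+ slot 1) new))
          (else (loop (cdr vars)
                      (+ slot 1)
                      (cons (list (car vars) slot) new))))))

;; Frame-map update on descending into a closure CONT whose extended
;; free-variable set is XF. Returns (frame+ . new-cf): the FRAME+ entries
;; paired with the frame map to use for the CONT's body.
(define (enter-cont cf xf)
  (let* ((kept   (frame-restrict cf xf))
         (frame+ (assign-slots (unsaved-vars xf kept) kept)))
    (cons frame+ (append kept frame+))))

;; High-water mark of a frame map, as a *size* (not a slot index).
(define (high-water cf)
  (if (null? cf)
      0
      (max (+ 1 (cadr (car cf)))
           (high-water (cdr cf)))))

;; Example: CF = {x |-> 2, z |-> 5}, XF = {x, q, w}.
;; Z dies; Q and W take the two lowest free slots.
;; (car (enter-cont '((x 2) (z 5)) '(x q w)))  ; => ((q 0) (w 1))
;; (cdr (enter-cont '((x 2) (z 5)) '(x q w)))  ; => ((x 2) (q 0) (w 1))
;; (high-water '((x 1) (z 4)))                 ; => 5
```

Alists keep the sketch short and easy to print; any dictionary with lookup by
variable and a scan over occupied slots would serve as well. The full walk
would call something like ENTER-CONT at each closure CONT, restart with the
empty map at each first-order or closure FUN, and pass the max of the
children's high-water marks back up.
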
Here's a marked-up example, with comments:
    (cont (x)
      (@ (kind closure)
         (label c39)
         (parent-binder fun49)  ; C is a first-order var,
         (free-vars a c q)      ;   so it's not in the EXTENDED set.
         (extended a q v w)     ; V, W not used locally; for passing on to C.
         (frame+ (q 2)          ; A & V already on the stack; add Q & W
                 (w 5)))        ;   to have all of EXTENDED set on frame.
      )                         ; This code runs w/access to all of EXTENDED.

Note that we can always figure out which of a term's free variables are first
order: they are not included in the term's extended free variables set. So
the first-order free variables are
    FV - XF
where FV is the set of free variables for some binder, and XF is the set of
its extended free variables. This is how you can tell that C is a first-order
variable in the above example -- so we know that C is only *called* in the
body, never used as a value.

-------------------------------------------------------------------------------
* 7. Lambda chart
-----------------
Remember that for every kind of lambda, there are three control points or
times that matter:
- the point where we *evaluate* the lambda (which is determined by the code
  where the lambda appears),
- the point where we *jump to / call* the lambda (which is determined very
  differently for open, first-order and closure lambdas), and
- the point when we *execute* the lambda (which is the code *of* the lambda
  -- its parameters & body).

What we do at these points depends on the kind of lambda:
    {open,first-order,closed} x {FUN, CONT}.
And much of this is, in turn, driven by what we know about

1. The relationship between the environment at the point of evaluation, and
   the point of call.

   E.g., in an open lambda, these two environments are identical; in a closed
   lambda, we assume no relationship at all.

2. The relationship between the state of the stack at the point of
   evaluation, and the point of the call.

   E.g., in a closed FUN, there's no useful connection; in a closed CONT, we
   ensure the stack is popped back to the same frame.

Here's a chart. If you can understand the "why" of every entry in all six
boxes, and see what the implementation implications are, you have attained
CPS satori and there is reason to hope that you understand enough to write
the analyses of this assignment correctly. Maybe.

           FUN                                CONT
           ---                                ----
Open       Not used as a value                Not used as a value
           & free vars available @ call       & free vars available @ call
           => no closure on eval              => no closure on eval
           => frame layout inside FUN         => frame layout inside CONT
              is same as outside                 is same as outside
           => No FRAME+ annotation            => No FRAME+ annotation
           Only action on entry               Only action on entry
             to bind "reg" params               to bind "reg" params
           Called immediately                 Called immediately
           => free vars all available         => free vars all available
           Basically, the var & body          Basically, a LET binding
           of a LET binding.                  or primop-call continuation.
           Invariant: FUN's cont param is
           closed on / uses *current*
           frame! (Do not pop on call...)

1st-order  Not used as a value                Not used as a value
           & free vars passed from caller     & free vars passed from caller
           => no closure on eval              => no closure on eval
           => eval adds nothing               => eval adds nothing
              to the stack                       to the stack
           Entry starts fresh frame!          => frame layout inside CONT
           => frame layout in FUN empty!         is same as outside
           => No FRAME+ annotation            => No FRAME+ annotation
           FUN's cont argument is closed
           on / uses immediately prior
           stack frame

Closure    Used as value, called from         Used as value, called from
           arbitrary lexical context          arbitrary lexical context
           => eval makes closure              => eval makes closure
              on heap                            using stack frame
           Entry starts fresh frame!          Entry resumes eval-time frame!
           => No FRAME+ annotation            FRAME+ describes eval-time
                                              additions to frame for
           Fun's cont argument is closed      closure.
           on / uses immediately              Evaluating CONT, making closure
           prior frame.                       is just saving FRAME+ vars on
                                              current frame.
                                              Stack frame when inside CONT is
                                              stack frame outside CONT, *plus*
                                              new FRAME+ slots.

-------------------------------------------------------------------------------
* 8. Annotations list
---------------------
Here is the full set of annotations that are the results of your analyses.

    (KIND )                     ; lambda - open, first-order or closed
    (LABEL )                    ; binder - unique name
    (FIRST-ORDER-VARS var ...)  ; binder - first-order vars bound by term
    (FREE-VARS var ...)         ; binder - free vars referenced by term
    (EXTENDED var ...)          ; binder - extended free vars
    (PARENT-BINDER label)       ; binder - innermost containing binder
    (CALLERS label ...)         ; lambda - parent binder of every call site
    (FRAME-SIZE )               ; First-order and closed FUN
    (FRAME+ ( ) ...)            ; CONT - stack saves needed to make closure

-------------------------------------------------------------------------------
* 9. Reflect
------------
Take a look at a fully annotated program produced by your analysis. Consider
*just how much* information now decorates your code -- you now have
everything marked that you'll need to translate the program down to a
machine-level model where there are no lexically scoped, higher-order
functions, only blocks of memory holding data and other blocks holding code.
*All those decisions have been made.*