This project aims to illustrate two ways of transforming expressions into ANF.
One way is to define a function to check if an expression is in ANF and write
a function from expr to expr. The other way is to define a type aexpr
that constraints expressions, so that by definition they are in ANF form.
For this purpose, we consider that expressions are in ANF if they are of the following form:
<aexpr> := <cexpr>
| (let (<id> <cexpr>) <aexpr>)
<cexpr> := <immexpr>
| (<op1> <cexpr>)
| (<op2> <immexpr> <immexpr>)
<immexpr> := <id>
| <num>
| <bool>You will find the implementations in:
lib/anf.mlusing the existingexprtypelib/anf_type.mldefining a customaexprtype
You may notice that the function anf_imm takes a parameter
k : expr -> expr. The letter k stands for continuation, which you may
remember from the Programming Languages course. However, for this specific use,
there is a more intuitive way to think about this. A function of type
expr -> expr can be thought of as an expression with a hole, and applying
a function of this type can be thought of plugging the hole with the given
expression.
For example, consider the expression (+ (+ 1 2) 3). For this expression to
be in ANF, both operands have to be immediate expressions. To convert the
expression into ANF, the anf function calls the anf_imm function with
the following arguments:
expr = (+ 1 2)k = fun hole -> (+ hole 3)
as output, we get:
(let (tmp1 (+ 1 2))
(+ tmp1 3))There are multiple valid ways of defining the transformation of an expression into ANF. Therefore, some implementations may create more or less temporary variables.
For example, consider the following program:
(let (n (+ 2 (add1 3)))
(sub1 n))Both of the following transformations of that program are in ANF:
(let (tmp1 (add1 3))
(let (n (+ 2 tmp1)))
(sub1 n))(let (tmp1 2)
(let (tmp2 (add1 3))
(let (tmp3 (+ tmp1 tmp2)
(let (n tmp3)
(sub1 n))))))For each of the two ways of implementing the transformation, you will find a
Simple and a Refined module. The first attempts to be as simple as
possible, while the second attempts to avoid using unnecessary temporary
variables.
The refined version achieves this by splitting the anf_imm function into
two versions:
-
anf_immis used when we need to turn the expression into an immediate expression. -
anf_cis used when the expression only has to be a cexpr.
For a more detailed explanation of how to come up with the translations defined here, you can check this tutorial by Joe Politz.
You can also read this great and concise blog post by Matt Might.