Arthur Nunes-Harwitt's Cool Compiler Technique

"Eager Evaluation Isn’t Eager Enough, A Transformation Based Approach to Semantics-Directed Code Generation" by Arthur Nunes-Harwitt, Rochester Institute of Technology

Let's look at this technique for changing a function into a compiler for that function. We'll use a version of the power function:

def power(n, b):
    'Return b raised to the nth power.'
    p = 1
    while n:
        if 1 & n:
            p *= b
        n >>= 1
        b *= b
    return p

By decomposing the number n into its binary representation and using it as a list of Boolean values to determine when to accumulate the running value of b into our eventual result p we get the correct number of multiplications of b by itself. (Each iteration squares b which doubles the coefficient: b¹, b², b⁴, b⁸, b¹⁶, ...)

def power(n, b):
    return power_(n, 1, b)

def power_(n, p, b):
    if n == 0:
        return p
    if 1 & n:
        pp = p * b
    else:
        pp = p
    return power_(n >> 1, pp, b * b)

(The power() function just sets the initial value of p for a recursive power_() function.)

Curry

The first step is to curry the function. The p and b parameters are "dynamic" and n is "static"

def power(n, b):
    return power_(n)(1)(b)

def power_(n):
    def pw_p(p):
        def pw_b(b):
            if n == 0:
                return p
            if 1 & n:
                pp = p * b
            else:
                pp = p
            return power_(n >> 1)(pp)(b * b)
        return pw_b
    return pw_p

Lambda lowering

Next we pull out the conditional based on n (we also switch from def statements to lambda expressions, but this is orthogonal to the main point, it merely makes for less inelegant code. The power() function does not change.):

def power_(n):
    if n == 0:
        fn = lambda p: lambda b: p
    elif 1 & n:
        fn = lambda p: lambda b: power_(n >> 1)(p * b)(b * b)
    else:
        fn = lambda p: lambda b: power_(n >> 1)(p)(b * b)
    return fn

Expression lifting

Next the expression based on n is lifted out of the lambdas. It is a recursive call to the power_() function (which returns a function.)

def power_(n):
    if n == 0:
        return lambda p: lambda b: p

    f = power_(n >> 1)

    if 1 & n:
        fn = lambda p: lambda b: f(p * b)(b * b)
    else:
        fn = lambda p: lambda b: f(p)(b * b)
    return fn

Code via Quoting

Now all that remains is to change the return values to quoted forms of the output functions, and dequote the output of recursive calls into those quoted forms. (Note the change of function name. This is no longer the power_() function, it is now a tiny compiler of power() functions, as it emits the Python source code of such functions when called.)

def compile_power(n):
    if n == 0:
        return 'lambda p: lambda b: p'

    f = compile_power(n >> 1)

    if 1 & n:
        fn = f'lambda p: lambda b: ({f})(p * b)(b * b)'
    else:
        fn = f'lambda p: lambda b: ({f})(p)(b * b)'

    return fn

source = compile_power(5)

lambda p: lambda b: (lambda p: lambda b: (lambda p: lambda b: (lambda p: lambda b: p)(p * b)(b * b))(p)(b * b))(p * b)(b * b)

I find it wild that the Python interpreter is sophisticated enough to correctly track the various uses of 'p' and 'b'. Let's add some integers to the parameter names to help our poor soggy brains keep track as well (we will use n, why not?):

def compile_power(n):
    if n == 0:
        return 'lambda p: lambda b: p'

    f = compile_power(n >> 1)

    if 1 & n:
        fn = f'lambda p{n}: lambda b{n}: ({f})(p{n} * b{n})(b{n} * b{n})'
    else:
        fn = f'lambda p{n}: lambda b{n}: ({f})(p{n})(b{n} * b{n})'

    return fn

lambda p5: lambda b5: (lambda p2: lambda b2: (lambda p1: lambda b1: (lambda p: lambda b: p)(p1 * b1)(b1 * b1))(p2)(b2 * b2))(p5 * b5)(b5 * b5)

source = compile_power(5)
pow5 = eval(source)(1)

Note the initialization of p to 1. That only has to be done once when the function is first used, and it's a separate stage than compiling the source code.

You get the source code for a power-to-the-nth function, then "compile" that source with the Python eval() function to get a function of the form: (lambda pN: lambda bN: ...), and then you call that function with p = 1 to initialize it and get a function of the form: (lambda bN: ...) which is our actual "to-the-nth" function.

Let's reformat that a little to maybe make it a little more clear what's going on:

lambda p5: lambda b5: (
lambda p2: lambda b2: (
lambda p1: lambda b1: (
lambda p:  lambda b:  p
)(p1 * b1)(b1 * b1)
)(p2)     (b2 * b2)
)(p5 * b5)(b5 * b5)

It becomes clear that b is squared on each iteration but p is multiplied by b only on the first and third iterations (as expected by the binary form of 5: 101) (the final, fourth iteration just returns p.) So we get b¹ * b⁴ = b⁵.

(I don't like this function though because as nice as it is it always does one extra multiplication (of b * b) and then throws it away at the end.)