14

To Grok a Mockingbird

 5 years ago
source link: https://www.tuicool.com/articles/hit/eiINneA
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Using recursive combinators to enhance functional composition

In this essay we’re going to look at the mockingbird , also called the M combinator.

The mockingbird is one of the recursive combinators , a combinator that takes a function that is not recursive, and returns a function that is recursive. We’ll see how it works, and one unusual but interesting application for it. And when we’re done, we’ll have an appreciation for how combinators can be used to make functions more composeable.

6nIZnan.jpg!web

a recursive function

As the number of people discussing recursion in an online forum increases, the probability that someone will quote the definition for recursion as Recursion: see ‘recursion’ , approaches one.

Recursive functions are easy to grasp, especially if they are simple. We’re going to construct one that computes the exponent of a number. If we want to compute something like 2^8 (two to the power of eight), we can compute it like this: 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 . That requires seven multiplications. So, any time we want to raise some number x to the exponent n , the naïve method requires n-1 multiplications.

2 * 2 * 2 * 2 * 2 * 2 * 2 * 2
  //=> 256

function naiveExponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else {
    return x * naiveExponent(x, n - 1);
  }
}

naiveExponent(2, 8)
  //=> 256

Obviously, we can implement this more efficiently with iteration. It’s so easy to convert this by hand that we won’t show it here.

Now let’s make an observation: Given a list of numbers to multiply, instead of performing each multiplication independently, let’s Divide and Conquer . Let’s take the easy case: Can we agree that 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 is equal to (2 * 2 * 2 * 2) * (2 * 2 * 2 * 2) ? That seems like the same number of operations (there are still seven * s), but if we write it like this, we save three operations:

const square = x => x * x;

square(2 * 2 * 2 * 2)
  //=> 256

Now we perform three multiplications to compute 2 * 2 * 2 * 2 , and one more to square it, producing 256 . We can use the same reasoning again: 2 * 2 * 2 * 2 is equivalent to (2 * 2) * (2 * 2) , or square(2 * 2) . Which leads us to:

const square = x => x * x;

square(square(2 * 2))
  //=> 256

square(square(square(2)))
  //=> 256

Now we’re only performing three multiplications, not seven. We can write a version of this that works with any exponent that is a power of two:

function exponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else {
    return exponent(x * x, n / 2);
  }
}

exponent(2, 8)
  //=> 256

Handling exponents that aren’t neat powers of two involves checking whether the supplied exponent is even or odd:

function exponent (x, n) {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
}

exponent(2, 7)
  //=> 128

So far, so good!

iauqmeN.jpg!web

recursion and binding

Question: How does our exponent function actually perform recursion? The immediate answer is, “It calls itself when the work to be performed is not the base case” (the base case for exponentiation is an exponent of 0 or 1 ).

How does it call itself? Well, when we have a function declaration (like above), or a named function expression, the function is bound to its own name within the body of the function automatically.

So within the body of the exponent function, the function itself is bound to the name exponent , and that’s what it calls. This is obvious to most programmers, and it’s how we nearly always implement recursion.

But it’s not always exactly what we want. Our exponent function is an improvement over naiveExponent , but of we want even more performance, we might consider memoizing the function.

Here’s a memoization decorator, snarfed from Time, Space, and Life As We Know It :

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = new Map();

  return function (...args) {
    const key = keymaker.call(this, args);

    return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
  }
};

We can make a memoized version of our exponent function:

const mExponent = memoized(exponent);

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 8)
  //=> 256, returns the memoized result without further multiplications

There is a hitch with this solution: Although we are invoking mExponent , internally exponent is invoking itself directly, without memoization. So if we write:

const mExponent = memoized(exponent);

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs four multiplications

When we invoke exponent(2, 8) , we also end up invoking exponent(4, 4) , exponent(16, 2) , and exponent(256, 1) . We want those memoized. That way, when we invoke exponent(2, 9) , and it invoked exponent(4, 4) , the result is memoized and it need do no further computation.

Our problem here is that exponent is “hard-wired” to call exponent , not mExponent . So it never invoked the memoized version of the function.

We can work around that like this:

const mExponent = memoized((x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * mExponent(x * x, Math.floor(n / 2));
  } else {
    return mExponent(x * x, n / 2);
  }
});

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

In many cases this is fine. But conceptually, writing it this way means that our exponent function needs to know whether it is memoized or not. This runs counter to our “Allongé” style of writing things that can be composed without them needing to know anything about each other.

For example, if we wanted a non-memoized exponentiation function, we’d have to duplicate all of the code, with a minor variation:

const exponent = (x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
};

That is not composing things, at all. What we want is to have one exponentiation function, and find a way to use it with or without decoration (such as with or without memoization). And we can do this.

vUZnEjv.jpg!web

composeable recursive functions

The sticking point is that to have full memoization, our exponentiation function needs to have a hard-coded reference to the memoized version of itself, which means it can’t be used without memoization. This is a specific case of a more general problem where things that have hard-coded references to each other become tightly coupled, and are thus difficult to compose in different ways. Only in this case, we’ve made the thing tightly coupled to itself!

So let’s attack the hard-coded reference problem, decoupling our recursive function from itself. Since it doesn’t have to be a named function, we can make it a “fat arrow” expression. If we want a function to have a reference to another function in JavaScript, we can pass it in as a parameter. So the ‘signature’ for our new function expression will look like this:

(myself, x, n) => // ...

In this case, our function assumes that myself is going to be bound to the function itself. Now what about the body of the function? We can change exponent to myself :

(myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * myself(x * x, Math.floor(n / 2));
  } else {
    return myself(x * x, n / 2);
  }
};

One little hitch: Our function signature is (myself, x, n) , but when we invoke myself , we’re only passing in x and n . So we can pass myself in as well:

(myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

Now this seems very contrived, and it doesn’t even work yet. How can we make it work?

y6Nbaiv.jpg!web

the mockingbird

Behold, the JavaScript Mockingbird:

const M = fn => (...args) => fn(fn, ...args);

The Mockingbird is a function that takes another function, and returns a function. That function takes a bunch or arguments, and invoked the original function with itself and the arguments.So now we can write:

M((myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
})(2, 8)
  //=> 256

That is all very well and good, but we’ve added some extra bookkeeping. Do we have any wins? Let’s try composing it with the memoization function. Although we didn’t use it above, our memoize function does allow us to customize the function used to create a key. Here’s a key making function that deliberately ignores the first argument:

const ignoreFirst = ([_, ...values]) => JSON.stringify(values);

And now we can create a memoized version of our anonymous function. First, here it is step-by-step:

const exp = (myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

M(memoized(exp, ignoreFirst))(2, 8)
  //=> 256

But now for the big question: Does it memoize everything? Let’s test it:

const mExponent = M(memoized(exp, ignoreFirst));

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

We’re back where we were when we wrote:

const mExponent = memoized((x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * mExponent(x * x, Math.floor(n / 2));
  } else {
    return mExponent(x * x, n / 2);
  }
});

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

But now, our function need have absolutely NO reference to the name of our memoized function.It doesn’t know whether it’s memoized or not. We can make that crystal clear by getting rid of almost every constant and representing the entire thing as an expression. First, given:

const M = fn => (...args) => fn(fn, ...args);

const memoized = (fn, keymaker = JSON.stringify) => {
  const lookupTable = new Map();

  return function (...args) {
    const key = keymaker.call(this, args);

    return lookupTable[key] || (lookupTable[key] = fn.apply(this, args));
  }
};

const ignoreFirst = ([first, ...butFirst]) => JSON.stringify(butFirst);

We can write:

const mExponent =
  M(
    memoized(
      (myself, x, n) => {
        if (n === 0) {
          return 1;
        } else if (n === 1) {
          return x;
        } else if (n % 2 === 1) {
          return x * myself(myself, x * x, Math.floor(n / 2));
        } else {
          return myself(myself, x * x, n / 2);
        }
      },
      ignoreFirst
    )
  );

mExponent(2, 8)
  //=> 256, performs three multiplications
mExponent(2, 9)
  //=> 512, performs only one multiplication

Nothing within our expression refers to mExponent , and we’ve separated three different concerns. Self-invocation is handled by M , memoization is handled by memoized + ignoreFirst , and exponentiation is handled by an anonymous function.

Because we’ve separated them like this, we can compose our function with memoization or not as we see fit. As we saw above, the name binding way was that if we wanted one version memoized and one not, we’d have to write two nearly identical versions of the same code:

const mExponent = memoized((x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * mExponent(x * x, Math.floor(n / 2));
  } else {
    return mExponent(x * x, n / 2);
  }
});

const exponent = (x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * exponent(x * x, Math.floor(n / 2));
  } else {
    return exponent(x * x, n / 2);
  }
};

But with the Mockingbird separating how a function calls itself from the function, we can now write:

const exp = (myself, x, n) => {
  if (n === 0) {
    return 1;
  } else if (n === 1) {
    return x;
  } else if (n % 2 === 1) {
    return x * myself(myself, x * x, Math.floor(n / 2));
  } else {
    return myself(myself, x * x, n / 2);
  }
};

const mExponent = M(memoized(exp, ignoreFirst));
const exponent = M(exp);

We have our composeability and reuse! We could equally insert decorators that log each time our function is called and its arguments, or even swap M out for atrampoline implementation to be used with tail-recursive functions.

UvE3I3U.jpg!web

summary

In summary, the Mockingbird is a recursive combinator : It takes a function that is not directly recursive, and makes it recursive by passing the subject function to itself as a parameter. This has the effect of removing a hard-coded dependency between the subject function and itself, which allows us to decorate it with functionality like memoization.

The Mockingbird is another tool in our “composeable functions” toolbox, increasing reuse by decoupling recursive functions from themselves.

Notes


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK