Memoize All The Things!

· 1511 words · 8 minute read

The Prompt 🔗

From Bartosz Maliszewski’s “Category Theory For Programmers”:

2.7 Challenges:

Define a higher-order function (or a function object) memoize in your favorite language. This function takes a pure function f as an argument and returns a function that behaves almost exactly like f, except that it only calls the original function once for every argument, stores the result internally, and subsequently returns this stored result every time it’s called withh the same argument. You can tell the memoized function from the original by watching its performance. For instance, try to memoize a function that takes a long time to evaluate. You’ll have to wait for the result the first time you call it, but on subsequent calls, with the same argument, you should get the result immediately.

As we can see, it basically wants us to save the result of each computation and then return the result of that computation (if we already have it saved).

The Implementation 🔗

Well we can already decipher some of the basic parts this function should have.

  • the name should be memoize
  • it should take 1 parameter
    • that parameter being f (the function we want to memoize)
  • it’s return value is going to be a function so we will write auto for now.
    • more specifically, the return type will be a lambda.
auto memoize(auto f)
{
    return [](){ };
}

In this case we want to pair up a value that f is called with, with its output. This is a perfect case for a std::map! Side note: I hate how the C++ standard decided to name std::map for the “ordered” version of a hashmap and std::unordered_map for the “unordered” version of a hashmap. Rarely do you want an “ordered” hashmap, and in this case, we don’t care if the elements are ordered or not. All that to say, we will be using a std::unordered_map instead of a std::map.

This would also be a good time to explain how we will manage the types of all of these functions and memoizing things. We know that the function f will have a “domain” and a “codomain”, in reality, those are the only two types that matter to us. So, in this case, we will simply ask for those two types in the template for this function.

The lambda takes in an x of type Domain and returns something of type Codomain after evaluating f(x). With all of those changes in mind, this is how our function looks like now:

template <typename Domain, typename Codomain>
auto memoize(auto f)
{
    auto memo = std::unordered_map<Domain, Codomain>();
    return [=](Domain x) -> Codomain { return f(x); };
}

Now, inside the lambda, we have to check if our memo object has the result of f(x) saved if not, then we calculate f(x) and save it before returning it.

template <typename Domain, typename Codomain>
auto memoize(auto f)
{
    auto memo = std::unordered_map<Domain, Codomain>();
    return [=](Domain x) -> Codomain {
        if (memo.contains(x)) {
            return memo.at(x);
        } else {
            memo[x] = f(x);
            return memo.at(x);
        }
    };
}

To avoid a bunch of unnecessary copies, we can simply take each value by const& or by moving it. Additionally, we only need 1 memo object per memoize function call. Which means that we need to mark the memo object as static.

template <typename Domain, typename Codomain>
auto memoize(auto&& f)
{
    static auto memo = std::unordered_map<Domain, Codomain>();
    return [=](Domain const& x) -> Codomain {
        if (memo.contains(x)) {
            return memo.at(x);
        } else {
            memo[x] = f(x);
            return memo.at(x);
        }
    };
}

Congratulations!!! You have succesfully made a function that can memoize any function of a single argument!!!

Two or More Arguments 🔗

This begs the question; what if we want to memoize any function of any number of arguments… We know that variadic functions exist in C++, so this should technically be possible.

If we were to convert the argument pack into a tuple, maybe we could shove that tuple into a std::unordered_map and associate any combination of argumenst with a single result. This brings a problem though. std::unordered_map cannot use std::tuple or std::pair as a key or a value. Which means that if we want to use std::unordered_map for this, we will hav to work a little more for it. This implementation would not have been possible if it weren’t for this StackOverflow forum post. Basically, we just need to create a hashing function and pass it as a template parameter to std::unordered_map.

We will have to tackle this series of problems:

  1. What is a parameter pack?
    • and how do we turn it into a tuple?
  2. How do we make a hash like the one in the StackOverflow forum post, using a variadic tuple?
  3. How do we get the types of each entry in the tuple?
    • and would it be possible to remove the return type from the template parameters? It seems like it can be deduced.

Let’s tackle these in order

What is a Parameter Pack 🔗

A parameter pack is quite simply a pack of parameters that were passed to a function. They look like any regular variable inside a function, you can give it any name and pass it to other functions. The only difference is the use of before or after to tell the C++ compiler how to treat the entire pack, usually when passing it to other functions etc.

A parameter pack is usually indicated by something like this:

template <class... Types>

right before your function or class.

How Do We Turn a Parameter Pack into a Tuple? 🔗

First of all, we need to create a hashing function out of this tuple. And we know that we need to pass a parameter pack to it so let’s get those two things out of the way.

template <class... Types>
struct n_tuple_key_hash {};

This hashing function is a function object (more commonly known as a “functor” in C++ (not to be confused with functional programming functors)). For this we have to overload the operator () to take a tuple (with the types of the parameter pack that were passed in) and return the hash for that tuple.

template <class... Types>
struct n_tuple_key_hash {
    auto operator () (std::tuple<Types...> args) const -> std::size_t
    {
        return 0;
    }
};

How Do We Make a Hash like the One in the StackOverflow Forum Post, Using a Variadic Tuple? 🔗

For the actual hashing function we can just return the XOR of all the types that were passed. This is done by using the STL’s std::apply which will apply a variadic function to a tuple. In this case, the variadic function is a lambda that XOR’s the first argument and all the other arguments (denoted by ).

template <class... Types>
struct n_tuple_key_hash {
    auto operator () (std::tuple<Types...> args) const -> std::size_t
    {
        return std::apply([](auto... t) { return (t ^ ...); }, args);
    }
};

How Do We Get the Types of Each Entry in the Tuple? 🔗

This can be done, again, using a parameter pack! Additionally, using our new n_tuple_key_hash function object, this is how our implementation currently looks.

template<class... Types>
auto memoize(auto&& f)
{
    static auto memo = std::unordered_map<std::tuple<Types...>, TODO_RETURN_TYPE, n_tuple_key_hash<Types...>>();

    return [&](Types... types) { };
}

And for the inside of the memoized function we just do:

template<class... Types>
auto memoize(auto&& f)
{
    static auto memo = std::unordered_map<std::tuple<Types...>, TODO_RETURN_TYPE, n_tuple_key_hash<Types...>>();

    return [&](Types... types) {
        auto const& t = std::make_tuple(types...);
        return memo.contains(t) ? memo.at(t) : memo[t] = f(types...);
    };
}

Would it Be Possible to Remove the return Type from the Template Parameters? 🔗

More recently, the ISO has put into the standard the concepts and type_traits library, which lets us do things like finding the return type of a templated function depending on the parameters that get passed to it. For example, the return type of of f depending on our Types… would be: std::invoke_result_t<decltype(f), Types…>. Emphasis on the word type, this template returns a type. Ignoring how cool that in itself is! We can now use this inside the memo object declaration.

template<class... Types>
auto memoize(auto&& f)
{
    static auto memo = std::unordered_map<std::tuple<Types...>, std::invoke_result_t<decltype(f), Types...>, n_tuple_key_hash<Types...>>();

    return [&](Types... types) {
        auto const& t = std::make_tuple(types...);
        return memo.contains(t) ? memo.at(t) : memo[t] = f(types...);
    };
}

Conclusion 🔗

If you want to memoize any function, this is how you do it:

template <class... Types>
struct n_tuple_key_hash {
    auto operator () (std::tuple<Types...> args) const -> std::size_t
    {
        return std::apply([](auto... t) { return (t ^ ...); }, args);
    }
};

template<class... Types>
auto memoize(auto&& f)
{
    static auto memo = std::unordered_map<std::tuple<Types...>, std::invoke_result_t<decltype(f), Types...>, n_tuple_key_hash<Types...>>();

    return [&](Types... types) {
        auto const& t = std::make_tuple(types...);
        return memo.contains(t) ? memo.at(t) : memo[t] = f(types...);
    };
}

My personal preference; adding [[nodiscard]] and noexcept to things that can have them.

template <class... Types>
struct n_tuple_key_hash {
    auto operator () (std::tuple<Types...> const& args) const noexcept -> std::size_t
    {
        return std::apply([](auto... t) { return (t ^ ...); }, args);
    }
};

template<class... Types>
[[nodiscard, gnu::const]] auto memoize(auto const&& f) noexcept
{
    static auto memo = std::unordered_map<std::tuple<Types...>, std::invoke_result_t<decltype(f), Types...>, n_tuple_key_hash<Types...>>();

    return [&](Types... types) noexcept {
        auto const& t = std::make_tuple(types...);
        return memo.contains(t) ? memo.at(t) : memo[t] = f(types...);
    };
}