ICS 45C Spring 2022, Notes and Examples: Functions and Lambdas

Background

When we first learn a new programming language, before we've seen anything resembling the complete picture of how it works, we develop a set of principles that guide how we think about that language. Some of those principles are explicitly stated to us; others we discover on our own (sometimes subconsciously, based on the subset of the language we've seen). Of course, the principles we discover on our own sometimes turn out to be simplifications of reality. As we learn more, we encounter details that invalidate some of our previous assumptions, and we refine our understanding of the language to take them into account.

Depending on what other programming languages you've seen previously, you may or may not have developed an idea that functions are completely distinct from the objects that they operate on — that objects are one kind of thing and functions are something else entirely, and that you can't substitute one for the other. After all, syntactically, they look different:

// Declaring and defining a variable to store an integer object
// (initialized, so that reading it later is well-defined)
int i = 3;

// Declaring and defining a function
std::string duplicate(const std::string& s)
{
    return s + s;
}

and they're used differently:

// We access an object stored in a variable by using its name
std::cout << i << std::endl;

// We call a function by following its name with parentheses that
// enclose its arguments (if it takes any)
std::cout << duplicate("Boo") << std::endl;

As we've seen, though, C++ uses a looser definition of the word object than you might recognize from Python or Java. As far as C++ is concerned, a value of any type is an object, whether that type is a simple one like int or bool or a complex one like std::map<std::string, unsigned int>.

So where do functions fit into that mix? Are they objects? The answer is murkier than you might realize until you stop and think about it a bit. This example is a brief exploration of the topic.

Function objects

In C++, an object is called a function object if it can be called like a function (i.e., its name can be followed by parentheses surrounding a list of arguments separated by commas). The functions we write, of course, can be treated this way, so we can treat functions as function objects.

Suppose we have this simple C++ function:

int square(int n)
{
    return n * n;
}

If that function has been declared previously, we can, of course, call it by writing the expression square(3). Since we passed the function an int argument and its signature says it returns an int, the type of the expression square(3) is int.

But what happens if we write the expression square, without writing parentheses and arguments after it? This turns out to be legal in C++, and the result is a function object. Technically, the expression converts to a pointer to the function (which you can think of as the address where that function's compiled code begins), though this detail matters less in C++11 and later than it did in older versions of C++.
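To make the "pointer to the function" idea concrete, here is a minimal sketch. The variable names p and q are chosen only for illustration; they don't appear elsewhere in these notes.

```cpp
#include <type_traits>

int square(int n)
{
    return n * n;
}

// Writing "square" with no parentheses names the function itself, which
// converts to a pointer to the function.  "auto" deduces that pointer type.
auto p = square;

// The same pointer type, written out explicitly in the syntax inherited
// from C.
int (*q)(int) = square;

// Check at compile time that p really is a pointer to a function that
// takes an int and returns an int.
static_assert(std::is_same<decltype(p), int (*)(int)>::value,
              "p should be a pointer to a function");
```

Both p and q can be called exactly as square can: p(3) and q(3) each call square with the argument 3.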

Of course, if there are function objects, there must also be a way to do the same kinds of things to them that we expect to be able to do to other objects, such as these:

Store them in variables
Pass them as arguments to other functions
Return them as the result of other functions
Store them in member variables of a class and initialize them in a constructor

In order to be able to do those things, though, we need a way to describe their type. For that purpose, C++ includes a type in its standard library called std::function, which is declared in the standard header <functional>. (There are also ways in C++, inherited from its C lineage, to declare function pointer types, though they're less useful now that C++ includes things like std::function, which are not only simpler, but also cover a fair number of scenarios that function pointers don't.)

The std::function type

std::function comprises the set of types that describe function objects in C++. Unlike most of the types we've seen to date, though, the name std::function isn't enough; if you want to use a function, you need to know not only that it's a function, but also what kinds of parameters it accepts and what kind of result it returns. We wouldn't reasonably expect a single variable to be able to store either the square function or the duplicate function we wrote above, because these functions are fundamentally incompatible with each other — they accept different kinds of parameters and return a different kind of result.

For this reason, std::function is what is called a template type, meaning that the type itself takes parameters that refine its meaning. When different parameters are passed to the std::function template type, the types described are considered different. (We'll see this same technique show up elsewhere in the C++ Standard Library, and we'll even learn how to write our own templates later this quarter.)

The type parameter passed to std::function is a description of the function's desired signature: a return type and the types of its parameters. Syntactically, it looks very much like a function signature, except with the name of the function and the names of the parameters left out. For example, a std::function type suitable for storing our square function above would look like this: std::function<int(int)>. Given a variable of type std::function<int(int)>, we could store any function in it, as long as that function had a compatible signature (i.e., it could be called with an int argument and return an int result).

int square(int n)
{
    return n * n;
}

// This is how you assign a function into a std::function variable.  Note
// that there are no parentheses and parameters after "square".  This is
// intentional, because the goal here is to assign the function itself into
// "f", not the result of *calling* the function into "f".
std::function<int(int)> f = square;

// This is how we would call the function object held by a std::function
// variable.  Note that the syntax is identical to how you would call any
// other function!  A function object can be treated like a function.
std::cout << f(3) << std::endl;
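The idea of a "compatible" signature is a little broader than an exact match. The sketch below assumes two hypothetical helpers, half and callWithSeven, that are not part of these notes. half takes and returns a double, yet it can still be stored in a std::function<int(int)>, because an int argument converts to double and the double result converts back to int.

```cpp
#include <functional>

int square(int n)
{
    return n * n;
}

// Hypothetical function whose signature differs from int(int), but which
// can still be called with an int and whose result converts to an int.
double half(double d)
{
    return d / 2.0;
}

// Hypothetical helper that accepts any function object compatible with
// the signature int(int) and calls it with the argument 7.
int callWithSeven(std::function<int(int)> f)
{
    return f(7);
}
```

Calling callWithSeven(square) calls square(7). Calling callWithSeven(half) calls half with 7.0 and converts the result, 3.5, to an int, which truncates it to 3.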

So what can you do with a std::function<int(int)> variable? Lots of things! For example:

Copy its value into another std::function variable of the same type (i.e., with the same signature)
Call the function it stores (by following its name with parentheses and arguments) and get back a result
Pass it as an argument to some other function
Return it as a result from another function
Store it in a member variable of an object of a class
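The last two items in that list can be sketched roughly as follows. The names chooseOperation and Operation are hypothetical, introduced only for this illustration.

```cpp
#include <functional>
#include <utility>

int square(int n)
{
    return n * n;
}

int cube(int n)
{
    return n * n * n;
}

// Returns one of two functions, chosen at run time.
std::function<int(int)> chooseOperation(bool wantSquare)
{
    if (wantSquare)
    {
        return square;
    }

    return cube;
}

// Hypothetical class that stores a function in a member variable and
// initializes it in a constructor.
class Operation
{
public:
    Operation(std::function<int(int)> operation)
        : operation(std::move(operation))
    {
    }

    // Calls the stored function with the given argument.
    int apply(int n) const
    {
        return operation(n);
    }

private:
    std::function<int(int)> operation;
};
```

The result of chooseOperation can be called directly, as in chooseOperation(true)(3), or stored in a std::function<int(int)> variable and called later.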

For example, consider the following function:

void transform(int* a, unsigned int size, std::function<int(int)> f)
{
    for (unsigned int i = 0; i < size; i++)
    {
        a[i] = f(a[i]);
    }
}

This function takes an array of integers (and its size), along with a function that takes an integer and returns an integer. It then iterates through the array and replaces each value with the result of calling the function with that value. So, for example, given our square function above, we could do something like this:

int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

// Square all the elements of a
transform(a, 10, square);

And, of course, if we had other functions that take an integer argument and return an integer result, we could use them with our transform function, as well.

int cube(int n)
{
    return n * n * n;
}

// Cube all the elements of a
transform(a, 10, cube);

This makes transform an amazingly powerful function. Instead of being a function that does only a particular thing to every element, it's instead a more abstract function that can do any particular thing to every element, and you get to tell it what that particular thing is each time you call it!
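The same technique applies to signatures other than int(int). As a sketch, assume a hypothetical function countIf that counts the elements for which a caller-supplied function returns true. The names countIf, isEven, and isNegative are illustrative only.

```cpp
#include <functional>

// Counts how many elements of the array satisfy the given predicate, where
// the predicate is any function object compatible with bool(int).
unsigned int countIf(
    const int* a, unsigned int size, std::function<bool(int)> predicate)
{
    unsigned int count = 0;

    for (unsigned int i = 0; i < size; i++)
    {
        if (predicate(a[i]))
        {
            count++;
        }
    }

    return count;
}

bool isEven(int n)
{
    return n % 2 == 0;
}

bool isNegative(int n)
{
    return n < 0;
}
```

As with transform, the loop is written once, and each call decides what question is asked of every element.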

Lambda expressions

Languages that allow you to treat functions this way — that consider functions to be data, in the sense that you can pass them as arguments, store them in variables, return them as results, etc. — are said to (at least partially) support functional programming. (This is only part of what is technically called "functional programming," but a useful part, even in a language that doesn't support the rest.)

In languages that support functional programming, it's quite common to be able to create new functions without having to give them names — i.e., to be able to write function literals the way we can write integer literals like 3 or string literals like "Boo". Expressions that are used to build functions without naming them are quite often called lambda expressions (so named because they originally come from a branch of mathematics called lambda calculus, which studies the behavior of functions).

C++ supports lambda expressions, though the syntax is a bit cumbersome. It helps to start with an example:

transform(a, 10, [](int i) { return i + i; });

A pair of brackets at the beginning of an expression indicates that the following will be a lambda expression (i.e., a function literal). This is followed by parameters listed in parentheses, which is, in turn, followed by a body for the function. This particular lambda expression builds a function that takes an integer and returns the result of adding that integer to itself (i.e., doubling it).
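A lambda expression doesn't have to be passed directly as an argument; its result can also be stored in a variable and called later. The following is a minimal sketch, with the hypothetical names doubleIt and tripleIt.

```cpp
#include <functional>

// Every lambda expression has its own unnamed type, so "auto" is the
// usual way to declare a variable that holds one.
auto doubleIt = [](int i) { return i + i; };

// A lambda with a compatible signature can also be stored in a
// std::function variable.
std::function<int(int)> tripleIt = [](int i) { return i * 3; };
```

Either variable can then be called like any other function, such as doubleIt(4), or passed wherever a std::function<int(int)> is expected.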

There are a couple of things to note here:

The return type does not need to be specified, since C++ is able to determine its type based on the body of the function. In this case, for example, since the body of the function adds i to itself and then returns the result, C++ can determine definitively that the return type is int. (There are some more complex cases where the return type can't be determined, and C++ provides a syntax for specifying it in those cases where you need to.)
The brackets actually serve as more than just a way for the compiler to detect the beginning of a lambda expression. Inside them, you can specify how variables from the lambda's surrounding scope should be treated. In the example below, the = says "Make a copy of any variables from outside the function that are used in the function." Not surprisingly, you can also use an &, which means "Variables from outside the function that are used in the function should be treated as references to the variables outside the function."
```
int x = 3;
transform(a, 10, [=](int i) { return i + x; });
```
You might think, at first, that this distinction isn't important, but there are at least a couple of reasons you might need to have control over it:
- The act of copying something can be expensive. In the case above, it's not something we'd be concerned about, since ints are cheap to copy, anyway; but there are plenty of cases where the cost of copying an object isn't one you're going to want to bear.
- Accessing variables from the surrounding scope by reference is potentially dangerous, particularly if you won't be calling the function until much later (most notably, after the surrounding scope has already been destroyed), because those variables might already be dead by the time the function is called.
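The difference between the two capture modes, along with the syntax for specifying a return type explicitly, can be sketched as follows. All of the names here (makeAdder, captureByValueResult, captureByReferenceResult, and halve) are hypothetical and exist only for this illustration.

```cpp
#include <functional>

// Returns a function that adds "amount" to its argument.  The lambda
// captures amount by value, so the returned function carries its own copy
// and is safe to call after makeAdder has returned.  Capturing with [&]
// here would leave the lambda referring to a variable that no longer
// exists.
std::function<int(int)> makeAdder(int amount)
{
    return [=](int i) { return i + amount; };
}

int captureByValueResult()
{
    int x = 3;
    auto addX = [=](int i) { return i + x; };
    x = 100;          // not seen by addX, which copied x when it was created
    return addX(1);   // 1 + 3
}

int captureByReferenceResult()
{
    int x = 3;
    auto addX = [&](int i) { return i + x; };
    x = 100;          // seen by addX, which refers to the original x
    return addX(1);   // 1 + 100
}

// A return type can be written explicitly after the parameter list,
// following an arrow.
auto halve = [](int i) -> double { return i / 2.0; };
```

In captureByReferenceResult, the reference capture is safe only because addX is called while x is still alive.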

How member functions are different from other functions

We've talked before about how member functions in classes are different from other functions. For example, consider this class:

class Person
{
public:
    ...
    void setFirstName(const std::string& newFirstName);
    ...
};

It's important to note that the type of setFirstName is not std::function<void(const std::string&)>, because setFirstName actually takes two parameters:

The implicit this parameter that is the first parameter of every member function. (In this case, the type of this would be Person*, a pointer to the Person object on which the function is being called.)
The string parameter newFirstName.

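However, setFirstName can be stored in a std::function whose signature makes that this parameter explicit: std::function<void(Person*, const std::string&)>. (Strictly speaking, a member function is named by the expression &Person::setFirstName, whose own type is the pointer-to-member-function type void (Person::*)(const std::string&); std::function knows how to store and call such a pointer.)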

For a member function that's const, such as this one:

class Person
{
public:
    ...
    std::string getFirstName() const;
    ...
};

the const would affect the type of that this parameter, so getFirstName could be stored in a std::function<std::string(const Person*)>.
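Here is a minimal sketch of storing and calling member functions this way. It assumes a simplified stand-in for the Person class, with the elided parts filled in only as far as the example needs. The private member firstName and the variables setter and getter are assumptions, not part of the original class.

```cpp
#include <functional>
#include <string>

// Simplified stand-in for the Person class above.
class Person
{
public:
    void setFirstName(const std::string& newFirstName)
    {
        firstName = newFirstName;
    }

    std::string getFirstName() const
    {
        return firstName;
    }

private:
    std::string firstName;
};

// A member function is named with &ClassName::functionName.  The first
// parameter in each std::function signature plays the role of "this".
std::function<void(Person*, const std::string&)> setter =
    &Person::setFirstName;

std::function<std::string(const Person*)> getter =
    &Person::getFirstName;
```

Given a Person object p, the call setter(&p, "Alex") has the same effect as p.setFirstName("Alex"), and getter(&p) has the same effect as p.getFirstName().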