Functional Programming

Functional programming is a style of programming (some use the words
"programming paradigm", contrasting it with the more standard procedural and
object-oriented styles, which are part of the imperative paradigm: functional
programs fundamentally evaluate expressions; imperative programs fundamentally
execute statements) that uses the simplicity and power of functions to
accomplish programming tasks. In a purely functional solution to a problem,
there will be no mutation of data structures, and recursion (not looping) will
be the primary control structure. A certain class of functions, called
tail-recursive, can be translated into non-recursive functions that run
faster; most real functional languages implement tail-recursive functions this
way (but Python does not).

Such languages easily support and frequently use the passing of functions as
arguments to other functions (these other functions are called "higher-order"
functions or functionals) and functions returning other functions as their
values; we have seen both of these features used in Python.

There are programming languages that are purely functional (Haskell), others
that are mostly functional (ML, whose major implementations are SML and OCaml;
the Scheme dialect of Lisp; and Erlang), and still others that at least
support a functional style of programming (some better, some worse) when it is
useful. Python is in this latter category, although features like
comprehensions emphasize its functional programming aspects (generators and
even lambdas fall into this category too).

Generally, functional programming is characterized by immutable objects and no
state changes (not even rebinding of names). Strings, tuples, and frozensets
are all immutable in Python (which means we can use their values as keys in
dicts and values in sets), but lists, sets, and dicts are not. In functional
programming, we don't change data structures but produce new ones that are
variants of the old ones.
For example, if we want to "change" the first value of a tuple t to 1, we
instead create a new tuple whose first value is 1 and whose other values are
the remaining values in the original tuple, using the concatenation operator.
The new tuple is specified as (1,)+t[1:]; note that we need the comma in (1,)
to distinguish it from (1): the former is a tuple containing the value 1, the
latter is just the int 1.

Functional programming creates lots of objects and must do so in a time- and
space-efficient way; for the most part, functional languages achieve parity in
time/space efficiency with non-functional languages, although mixed languages
like Python tend not to do as well when used functionally as true functional
languages do. Emerging languages like Scala and Clojure are closing the gap.
Also, because of the simplicity of the semantics of functional programming, it
is easier to automatically transform functional programs to run efficiently on
parallel, cluster, or multi-core computers.

Many functional programming languages (e.g., Haskell and the ML family) are
also statically type-safe: before programs are executed, the system executing
them can guarantee that all operators and functions are applied to the correct
number and type of arguments, or report problems where it detects errors.

Functional programming languages are still not as widely used as imperative
languages, but they continue to find many uses in industry, and in some
industries (telecommunications) they have achieved dominance (at least with
some companies within those industries). Programmers who are trained to use
functional languages think about and solve problems differently. All CS
students should be exposed to functional programming as part of their
education (and I mean an exposure longer than one day).

To learn more about Python's use of functional programming, read section 10
(Functional Programming Modules) in Python's Library documentation, discussing
the itertools, functools, and operator modules.
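As a small illustration of the tuple-update idiom from earlier ((1,)+t[1:]),
here is a sketch that wraps it in a helper function; the name replace_first is
my own, not something defined in these notes:

```python
def replace_first(t, value):
    # Build a brand-new tuple whose first item is value and whose remaining
    # items are copied (unchanged) from t; t itself is never mutated.
    return (value,) + t[1:]

t = (0, 2, 3)
print(replace_first(t, 1))  # -> (1, 2, 3)
print(t)                    # -> (0, 2, 3): the original is untouched
```

This is the essence of the functional style: instead of assigning into an
existing structure, we construct a new variant and leave the old one intact.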
------------------------------------------------------------------------------
In this lecture we will look at just three important higher-order functions
used in functional programming: map (transform), filter, and reduce
(accumulate). Each operates on a function and an iterable, which means they
operate on lists and tuples easily, but also on iterables that don't store all
their values and just produce values as necessary (e.g., the ints and primes
generators). We will write recursive and generator versions of each, with the
recursive versions having list parameters and returning lists, because many
functional programming languages use only lists, not tuples; of course, lists
are immutable in those languages.

(1) map/transform: this function takes a unary function and some list/iterable
of values and produces a list/iterable of mapped/transformed values, based on
substituting each value with the result of calling the parameter function on
it. For example, calling

  map_l(lambda x : x**2, [i for i in irange(0,5)])

produces a list of the squares of the numbers 0 to 5: [0,1,4,9,16,25].
Calling

  map_i(lambda x : x**2, irange(0,5))

produces an iterable of the squares of the numbers 0 to 5. If we wrote the
list comprehension

  [i for i in map_i(lambda x : x**2, irange(0,5))]

Python would return the same result as for map_l: [0,1,4,9,16,25]. Note that
lambdas are frequently (but not exclusively) used in calls to the map
function. Here are simple implementations of the list/iterator versions of
map.

  def map_l(f,alist):
      if alist == []:
          return []
      else:
          return [f(alist[0])] + map_l(f,alist[1:])

  def map_i(f,iterable):
      for i in iterable:
          yield f(i)

Note that Python defines a map function that really produces a generator (so
it is closer to map_i than map_l):

  y = map(lambda x : x**2, [i for i in irange(0,5)])
  print(y)

prints something like <map object at 0x...>. We can use a comprehension to
turn such an object into an actual list.
  print([i for i in y])

prints [0, 1, 4, 9, 16, 25].

In fact, Python generalizes map to work on any number of lists/iterables: if
there are n iterables, then f must have n parameters. So, calling the real map
function in Python (which, as we've seen, produces an iterable) as

  print( [i for i in map(lambda x,y: x+y, irange(0,5), irange(100,105))] )

prints [100, 102, 104, 106, 108, 110].

(2) filter: this function takes a predicate (a unary function returning a
bool, although in Python most values have a bool interpretation: see __bool__)
and some list/iterable of values and produces a list/iterable with only those
values for which the predicate returns True (or a value that is interpreted as
True). For example, calling

  filter_l(predicate.is_prime, [i for i in irange(2,50)])

produces a list of the values between 2 and 50 that are prime:
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47].

Here are simple implementations of the list/iterator versions of this
function.

  def filter_l(p,alist):
      if alist == []:
          return []
      else:
          if p(alist[0]):
              return [alist[0]] + filter_l(p,alist[1:])
          else:
              return filter_l(p,alist[1:])

We can simplify this code a bit by using a conditional expression and noting
that [] + alist == alist.

  def filter_l(p,alist):
      if alist == []:
          return []
      else:
          return ([alist[0]] if p(alist[0]) else []) + filter_l(p,alist[1:])

  def filter_i(p,iterable):
      for i in iterable:
          if p(i):
              yield i

Note that Python defines a filter function that produces a generator when
called; it prints like <filter object at 0x...>. We can use a comprehension to
turn such an object into an actual list.

(3) reduce/accumulate: this function is different from the previous two: it
takes a binary function and some list/iterable of values and typically
produces a single value: it reduces or accumulates these values into a single
result. Unlike map and filter (which are defined in and automatically imported
from the builtins module), we must import reduce from the functools module
explicitly.
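Here is a quick sketch of that import in action; the product computation is my
own illustration (operator.mul plays the role of a hand-written
lambda x,y: x*y, and range stands in for the irange generator used elsewhere
in these notes):

```python
from functools import reduce   # reduce must be imported explicitly
import operator

# Multiply together the values 1..5: (((1*2)*3)*4)*5 = 120.
product = reduce(operator.mul, range(1, 6))
print(product)  # -> 120
```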
For example, calling

  reduce(lambda x,y : x+y, irange(1,100))

returns the sum of all the values in the irange iterable. Here is a more
interesting call, because it uses a non-commutative, non-associative operator
(subtraction):

  reduce(lambda x,y : x-y, [1,2,3])

which returns -4: 1 - 2 - 3, i.e., (1-2)-3. Technically, this is called LEFT
reduction/accumulation because the operators are applied left to right. If
they had been applied right to left (RIGHT reduction), the result would have
been 1-(2-3) = 1 - (-1) = 2. For associative operators (and +, unlike -, is
associative), the grouping doesn't make a difference: (1+2)+3 is the same as
1+(2+3). So for 5 values, the left reduce is equivalent to (((1+2)+3)+4)+5.

Note that the operator module defines a variety of functions like add (which
has the same meaning as lambda x,y: x+y), so we could also call this function
as

  reduce(operator.add, irange(1,100))

if we had imported the operator module. Here is another interesting example:

  reduce(max, [4,2,-3,8,6])

which is equivalent to max(max(max(max(4,2),-3),8),6), which evaluates as
follows to compute the maximum of the entire list of values:

  max(max(max(max(4,2),-3),8),6)
  -> max(max(max(4,-3),8),6)
  -> max(max(4,8),6)
  -> max(8,6)
  -> 8

Here is the simplest implementation of reduce that I can think of. The unit is
returned if the iterable has no values; the first value in the iterator is
returned if the iterator has only one value; otherwise, f is applied to all
the operands as shown above to compute the reduced/accumulated value.

  def reduce(f,iterable,unit=None):
      i = iter(iterable)
      try:
          a = next(i)
          while True:
              try:
                  a = f(a,next(i))
              except StopIteration:
                  return a
      except StopIteration:
          return unit

There is only one version of this function, because it produces a single
answer, so we don't need separate list/iterable versions. There is a simpler
recursive definition of this function, but it uses RIGHT
reduction/accumulation, which for associative operators produces the same
result.
  def reduce(f,alist,unit=None):
      if alist == []:
          return unit
      elif len(alist) == 1:   # mirror the iterator version: a single
          return alist[0]     # value is returned unchanged (not combined
      else:                   # with unit, which defaults to None)
          return f(alist[0],reduce(f,alist[1:],unit))

Here is a concrete example of a functional style of programming. This
expression computes the length of the longest line in a file:

  reduce(max, map(lambda l : len(l.rstrip()), open('file')))

To return the longest line itself, not the length of the longest line, we
could compute as follows. Here the lambda for reduce (whose arguments will be
two lines from the file) returns the longer of the two lines; when reduced
over all lines in the file, the final result is the longest line in the file.
The lambda for map now strips whitespace off the right end but doesn't map
lines to their lengths.

  reduce(lambda x,y: x if len(x) >= len(y) else y,
         map(lambda l : l.rstrip(), open('file')))

Functional programmers spend a lot of time using these tools to build up their
idioms of expression. We are just peeking at this topic now. It is possible
for reduce to return all types of results, not just simple ones: there are,
for example, ways to reduce lists of lists to produce just lists.

------------------------------------------------------------------------------
MapReduce, commutative functions, and parallel processing

MapReduce is a special implementation of the map/reduce functions, designed to
run on parallel, cluster, or multi-core computers. If we can write our code
within the MapReduce framework, we can guarantee that it runs quickly on the
kinds of computers Google uses for its servers. Generally, what it does is run
similar operations on huge amounts of data, combining the results until we get
a single answer. Apache Hadoop is an open-source version of MapReduce (but to
really see its power, we need a cluster of computers to run our code on).

How does MapReduce work? The story is long, but imagine we have an associative
and commutative operator and want to compute: 1 + 2 + 3 + ...
+ n. We can evaluate this expression as shown above, which would require n-1
additions one right after the other (each addition must finish before the next
one starts). Even if we had multiple cores, doing the operations in this order
would require n-1 sequential additions, so only one core at a time would be
active.

  1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16
  |   |   |   |
  +-+-+   |   |
    |     |   |
    3     |   |
    +--+--+   |
       |      |
       6      |
       +------+
          |
          10  ....        note that one more operand is used at each level

Here each level uses 1 core and there are 15 levels. In general, with N
numbers to add it takes N-1 time steps.

Now, how can MapReduce handle this problem? Because addition is associative
(and commutative, which allows even more reordering), we can evaluate this
expression in a different way: add the 1st and 2nd values at the same time as
we add the 3rd and 4th, at the same time as the 5th and 6th, ... In all, we
can add n/2 pairs simultaneously (if we have n/2 cores). We can use this same
trick for the remaining n/2 sums, simultaneously adding them in pairs; then
for the n/4 sums, ..., down to the final sum (for which only one core is
necessary). Here is a pictorial representation of this process for 16 values.

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16
|   |   |   |   |   |   |   |   |    |    |    |    |    |    |    |
+-+-+   +-+-+   +-+-+   +-+-+   +----+    +----+    +----+    +----+   8 cores
  |       |       |       |       |         |         |         |
  3       7      11      15      19        23        27        31
  |       |       |       |       |         |         |         |
  +---+---+       +---+---+       +----+----+         +----+----+      4 cores
      |               |                |                   |
     10              26               42                  58
      |               |                |                   |
      +-------+-------+                +---------+---------+           2 cores
              |                                  |
             36                                 100
              |                                  |
              +-----------------+----------------+                     1 core
                                |
                               136

Here each level uses as many cores as possible, and there are 4 levels. In
general, with N numbers to add it takes Log2 N time steps. Recall that Log2
1,000 is about 10, Log2 1,000,000 is about 20, and Log2 1,000,000,000 is about
30. To perform 10**9 additions in 30 time steps, we'd need a half-billion
cores: it is not likely this is coming in your lifetime.
But if we had tens or hundreds of cores, we could keep them all busy except
during the last few (bottom) levels, so we could get our code to run tens or
hundreds of times faster.
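The pairwise scheme above can be simulated sequentially in ordinary Python;
this sketch (my own illustration, not MapReduce itself, and the name
tree_reduce is made up) combines adjacent pairs level by level, so the number
of passes through the while loop is the Log2-N number of levels in the
diagram:

```python
import operator

def tree_reduce(f, values):
    # Repeatedly combine adjacent pairs; each pass of the while loop
    # corresponds to one "level" in the diagram above (on a real parallel
    # machine, all pairs in one pass could run simultaneously on
    # separate cores).
    values = list(values)
    while len(values) > 1:
        pairs = [f(values[i], values[i+1])
                 for i in range(0, len(values) - 1, 2)]
        if len(values) % 2 == 1:      # odd value left over: carry it up
            pairs.append(values[-1])  # to the next level unchanged
        values = pairs
    return values[0]

print(tree_reduce(operator.add, range(1, 17)))  # -> 136, as in the diagram
print(tree_reduce(max, [4, 2, -3, 8, 6]))       # -> 8, same as left reduce
```

Because addition is associative and commutative, tree_reduce and a plain
left-to-right reduce produce the same answer; the difference is only in how
much of the work could, in principle, be done at the same time.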