Tuples

In this lecture we will talk about regular and named tuples. The former is an immutable version of lists; the latter will be used briefly but superseded by classes, which we will learn how to define by the end of the quarter (and which you will use extensively in CSI-32/33). We will also learn some odds and ends along the way: the .split/.join methods, comprehensions, the meaning of *args as a parameter, and tuple assignments.

------------------------------------------------------------------------------

First let's look at a topic that is strictly about lists and strings: the .split/.join methods, which interconvert strings and lists in an interesting way. First let's look at the headers of these methods, which are both declared in the str class.

  str.split(long : str, splitter : str) -> [str]
  str.join (sep : str, items : [str]) -> str

Note that we will use the notation [str] to mean a list where every element is a string. (Since str is a reference to an object representing the str class, [str] really is a list we can write, containing this one value.)

Because these are methods, we can call them two ways. First .split

  str.split('1 2 3 4',' ')   but more likely   '1 2 3 4'.split(' ')

both produce the list ['1', '2', '3', '4']. Of course, if the values in the first string were separated by commas (or anything else) we could use commas to split the values: '1,2,3,4'.split(',') still produces ['1', '2', '3', '4']. In the comprehension section we will learn a simple way to create the list [1, 2, 3, 4] from this one: list([int(i) for i in '1 2 3 4'.split(' ')])

Now .join, which we can also call two ways

  str.join(';',['1', '2', '3', '4'])   but more likely   ';'.join(['1', '2', '3', '4'])

both produce the string '1;2;3;4'. In the comprehension section we will learn a simple way to create this string from [1, 2, 3, 4]: ';'.join([str(i) for i in [1, 2, 3, 4]]).

Practice using these methods.
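As one way to practice, here is a small sketch that round-trips between a string and a list using .split and .join. The variable names (text, pieces, numbers, joined, same_pieces, same_joined) are made up for illustration and do not come from the lecture.

```python
# Round trip between a string and a list with .split and .join.
text = '1 2 3 4'
pieces = text.split(' ')                       # ['1', '2', '3', '4']
numbers = [int(p) for p in pieces]             # [1, 2, 3, 4]
joined = ';'.join([str(n) for n in numbers])   # '1;2;3;4'

# Calling through the str class is equivalent to calling on the object.
same_pieces = str.split(text, ' ')
same_joined = str.join(';', pieces)

print(pieces, numbers, joined)
```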
They are incredibly versatile, especially once we know about comprehensions, which make it easy to interconvert between a list-of-object and a list-of-str (and produce many other interesting lists).

------------------------------------------------------------------------------

Tuples:

A tuple is an immutable list within () instead of []: that pretty much says everything you need to know about tuples, except why we need an immutable version of a list (which we will see next week when we discuss sets and dictionaries, which require immutable components). So, all the standard things like len, indexing, slicing (both operations still use []), checking containment, catenation, multiplication, and iterability (items 1-7 in the list lecture) work as you expect with tuples.

I will mention a few other needed facts: the first shows how to interconvert between lists and tuples with the same values.

  tuple([1, 2, 3]) is (1, 2, 3)
  list((1, 2, 3))  is [1, 2, 3]

The second is that (1) is not a tuple: it is an expression with 1 in parentheses. To write a singleton tuple (one with a single value: its len is 1) we write (1,) and of course when Python prints a singleton tuple it prints the single value in parentheses, followed by a comma.

The third is that besides creating lists of lists and tuples of tuples, we can create lists of tuples or tuples of lists:

  (1, [2, 3], 4)
  [1, (2, 3), 5]

------------------------------------------------------------------------------

namedtuple

To use the namedtuple function we must first import it, typically as follows

  from collections import namedtuple

Here is an example of how we use named tuples

  Point = namedtuple('Point', 'x y')
  origin = Point(0.,0.)
  unit = Point(1.,1.)
  print(origin,unit)
  print(origin.x,unit.y)
  print(origin[0],unit[0])
  print(origin.z)

which prints:

  Point(x=0.0, y=0.0) Point(x=1.0, y=1.0)
  0.0 1.0
  0.0 1.0

and then raises an exception: AttributeError: 'Point' object has no attribute 'z', meaning that when trying to print origin.z, this object (constructed from the 'Point' class) has no such attribute. origin[2] would raise a similar exception: IndexError: tuple index out of range

Also, if we try to write origin.x = 1.0 Python will raise an exception: AttributeError: can't set attribute (the exact wording varies with the Python version); and if we try to write origin[0] = 1.0 Python will raise TypeError: 'Point' object does not support item assignment. Both fail because we are dealing with a tuple, which is immutable.

But, we can call a function that returns a new namedtuple with certain values substituted for others.

  p1 = origin._replace(x=2)
  p2 = unit._replace(x=2,y=5)
  print(p1,p2,origin,unit)

prints:

  Point(x=2, y=0.0) Point(x=2, y=5) Point(x=0.0, y=0.0) Point(x=1.0, y=1.0)

Note that the substituted values print as 2 and 5 (not 2.0 and 5.0) because we supplied ints, and that origin and unit are unchanged.

So, the namedtuple function takes two arguments (the name for the entire tuple and the names for the fields/indexes in the tuple) and returns a "factory": some object that we can use to construct namedtuples, by using the first name (here Point) and supplying values for all the fields (here two: x and y). The namedtuples created will print as shown, and we can index them by ints or names. With ._replace we can create new namedtuples that are variants of existing ones.

As another example, we could write

  Student = namedtuple('Student', 'name id year gpa')
  a_student = Student('Anteater, Peter', 123456789, 2, 3.5)
  print(a_student)

Prints as: Student(name='Anteater, Peter', id=123456789, year=2, gpa=3.5)

Even more interesting, we can create a list of students to represent a class (or a huge list of students to represent a school). Given such a list, we can compute the average gpa for all the students in it as follows (using a simple composition of ideas about lists and named tuples).
If we had many lists for many courses/schools, we could use this function to see which courses/schools exhibited grade inflation.

  def gpa_average(students : [Student]) -> float:
      gpa_sum = 0
      for s in students:
          gpa_sum += s.gpa  # or gpa_sum += s[3], which is much more cryptic
      return gpa_sum/len(students)

Notice that if we change the structure of the Student namedtuple to

  Student = namedtuple('Student', 'last_name first_name id year gpa')

the code above (using s.gpa) would still work; but the code using s[3] would now be computing the average year in school, which is now the information at index 3.

Finally, we can layer lists and namedtuples to build quite complicated data structures.

  Student = namedtuple('Student', 'name id year gpa tests')
  Class   = namedtuple('Class', 'name number meeting faculty students')
  School  = namedtuple('School', 'name address year_started classes')

With these (and lists) we can create a list of Schools (for the US), such that each School would have a list of classes, and each class would have a list of students, and each student would have a list of tests.

------------------------------------------------------------------------------

Comprehensions

We can build lists/tuples via comprehensions as

  l = list([comprehension])   or   t = tuple((comprehension))

The form of a comprehension is as follows (bool-expression-i is a boolean expression that can refer to i). Note that the [] here denote an EBNF option.

  expression-i for i in iterable [if bool-expression-i]

So, to create a list of the squares from 0 to 10 we could write

  l = list([i**2 for i in irange(0,10)])

If we wanted only the even squares, we could include the option and write:

  l = list([i**2 for i in irange(0,10) if i%2 == 0])

Generally, we can translate a list comprehension as follows.
  comprehension = []
  for i in iterable:
      if bool-expression-i:
          comprehension.append(expression-i)

This code won't work for a tuple comprehension, because calling .append mutates comprehension (which we cannot do to a tuple; this operation is not available). So, in the case of tuples, Python produces the values one at a time and the tuple function collects them into a new tuple, without ever mutating one.

------------------------------------------------------------------------------

Final tuple information

Most of the list functions we wrote work identically for tuples (but not the ones that mutate the list/tuple). We can return a tuple instead of a list (as we did in list_min_max).

If we specify *args as the name of a parameter, it means to put all the remaining non-named arguments into a tuple that is bound to args (yes, the * prefixes the name and means something special). So, as we saw we can write

  def min(*args):                def max(*args):
      answer = args[0]               answer = args[0]
      for x in args[1:]:             for x in args[1:]:
          if x < answer:                 if x > answer:
              answer = x                     answer = x
      return answer                  return answer

If we write min(1, 3, 5) then args is bound to (1, 3, 5); both functions iterate over args to find an extremal value.

Finally we can write x, y = 0, 1 to assign a tuple of values to a tuple of names. We can write

  min,max = list_min_max([6, 3, 6, 2, 8, 7, 1])

Look at this function's definition: it returns a list of values (which for this purpose is the same as a tuple of values: technically it returns something Python can iterate over to retrieve values), the first of which is stored in min and the second of which is stored in max. This is called "unpacking" the values on the right to bind to the names on the left.

Note that x = list_min_max([6, 3, 6, 2, 8, 7, 1]) does no unpacking at all: it binds x to the entire returned list. But

  x, = list_min_max([6, 3, 6, 2, 8, 7, 1])   and   x,y,z = list_min_max([6, 3, 6, 2, 8, 7, 1])

both raise an exception when executed because the number of values on the right is not the same as the number of names on the left to bind to these values. In recent versions of Python the exceptions print like:

  ValueError: too many values to unpack (expected 1)
  ValueError: not enough values to unpack (expected 3, got 2)

(still need to expand this section a bit)
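The *args and unpacking ideas from this section can be tried together in a short sketch like the one below. The names smallest and min_max are hypothetical stand-ins (chosen so the built-in min and max are not shadowed); min_max is only assumed to behave like the list_min_max function mentioned above, whose actual definition is not shown here.

```python
def smallest(*args):
    # args is a tuple holding every positional argument passed in.
    answer = args[0]
    for x in args[1:]:
        if x < answer:
            answer = x
    return answer

def min_max(values):
    # Hypothetical stand-in for list_min_max: returns a 2-tuple.
    return (min(values), max(values))

# Unpacking: two values on the right bind to two names on the left.
lo, hi = min_max([6, 3, 6, 2, 8, 7, 1])

# A mismatch between the number of names and values raises ValueError.
try:
    a, b, c = min_max([6, 3, 6, 2, 8, 7, 1])
    unpack_error = None
except ValueError as e:
    unpack_error = str(e)

print(smallest(1, 3, 5), lo, hi, unpack_error)
```

Catching the ValueError here is just a way to see the message without stopping the program; in ordinary code a mismatch like this is a bug to fix, not an error to handle.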