Inheritance II

This short lecture extends our discussion of inheritance from single-inheritance to multiple-inheritance: we will learn how to visualize multiple-inheritance relationships (not as an N-ary tree, but as a more complicated network) and how to generalize the rules Python uses for locating attributes.

We can use the inheritancetool.py module, along with any classes forming an inheritance hierarchy, to see how Python locates such attributes. We can use this tool with the module defining the Counter/Modular_Counter classes, and with the inheritancesample.py module, which defines a more complicated hierarchy discussed below.

We will also discuss the truth about the isinstance function and how Python matches exceptions in the except clause of the try/except statement.

------------------------------------------------------------------------------

Visualizing Multiple-Inheritance Hierarchies

Examine the class structure below, which we have simplified by showing just the class statement with a body of pass.

  class B1a     : pass  # object is the base class of the derived class B1a
  class B1(B1a) : pass  # B1a is the base class of the derived class B1
  class B2      : pass  # object is the base class of the derived class B2
  class C(B1,B2): pass  # B1 and B2 (in that order) are the base classes of C

We can visualize the relationship between the derived and base classes using the same building blocks we used for single-inheritance: a derived class refers upward to its direct base class. The three pictures below are all logically equivalent (in terms of derived classes and their arrows to base classes): although the "location" of some of the classes is different in the pictures, in all cases B1 is the first/left base class from which C is derived; B2 is the second/right base class from which C is derived; B1 is derived solely from B1a, and B1a and B2 themselves are each derived from the base class object.

      object                     object
      ^    ^                     ^    ^
     /      \                   /      \
   B1a       B2               B1a       |            B1a --> object
    ^        ^                 ^        |             ^         ^
    |        |                 |        |             |         |
    B1      /                  B1       B2            B1        B2
    ^      /                   ^        ^             ^         ^
     \    /                     \      /               \       /
       C                           C                       C

Because of the extra complexity of multiple-inheritance, the relationships can form a complex network that cannot be captured by a simple N-ary tree: for example, above C has two paths to the root, object, which is disallowed in trees representing only single inheritance. These more complicated structures mean Python's rules are more complicated for determining in which order classes are searched for attributes.

To draw these inheritance networks, we start at the most derived class, and then draw all its base classes (in the order they appear in the class definition, mirroring the left to right ordering), directly above the class(es) derived from them, with arrows leading from the derived class to the base class. For the example above, because C is derived from B1 and B2 (in that order), they appear (in that left to right order) above C; because B1 is derived from B1a, and B1a is derived from object, B1a appears above B1 and object appears above B1a; because B2 is derived from object, object appears above B2 also.

These networks can get messy to visualize, with more complicated relationships among the classes, but the picture layout is not what is important: what is important is the individual logical relationships (which derived classes refer to which base classes, and in which order), which are directly used to locate attributes. In complicated inheritance networks, drawing the relationships might require crossing lines.

------------------------------------------------------------------------------

Locating Attributes in Multiple-Inheritance Hierarchies

I will now state the principle, in English, that Python uses for locating an attribute in an inheritance network. In the next section I'll show actual Python code that computes the equivalent information.
Once this "equivalent information" is computed, it is stored with the class, so locating attributes becomes trivial because the classes are searched in the stored order.

The search must take into account two fundamental principles:

(1) Before searching a base class, all classes derived from it must be searched.

(2) Base classes must be searched in the order they appear in each derived class definition. The derived class specifies them in a left to right order, with a left base class searched before a right base class.

So,

(a) Python first tries to find the attribute in the instance object.

(b) If Python fails, it searches the class that the object was constructed from.

(c) If Python fails, it searches (left before right) upward from the class the object was constructed from (which appears at the bottom of the network), towards the root/object class.

(d) If Python reaches any base class and has not already searched all of its derived classes, that base class is not searched now; instead the derived class searches its remaining (to the right) base classes; and if it has searched all its base classes, its derived class searches its remaining (to the right) base classes, etc.

So, a base class can be searched only if all its derived classes have already been searched.

In the inheritance network above, if c is an object constructed from class C, and the attribute isn't found in c, Python searches the following classes in the following order, getting the attribute from the first class that defines it: C, B1, B1a, B2, "object".

Here is why: It starts at C, then searches C's leftmost base class B1, then searches B1's leftmost base class B1a. But when it sees that B1a's leftmost base class, "object", has another derived class (B2) that has not been searched, Python doesn't search "object" yet. Instead it tries to search B1a's other base classes; there are none, so it tries to search B1's other base classes; there are none, so it tries to search C's other base classes, and there is one, B2. Then it searches B2's base class, "object", which is now actually searched, because all/both of its derived classes have already been searched. Finally, "object" has no base class to search.

How does the search know how to go from a derived class to its base class(es)? Every class object has a __bases__ attribute that is a tuple of its base classes; the order that it uses for the base classes in the tuple is the same order the base classes appear in the class definition. So, the arrows shown in the pictures above are really stored in Python class objects, in the tuple named __bases__. Recall that the arrows go only from derived classes to base classes, not base classes to derived classes. So while every derived class knows the base classes it is derived from, a base class does not know what derived classes are derived from it.

Finally, when applied to a single-inheritance hierarchy, this algorithm degenerates (each __bases__ tuple has only one value in it) into looking from a derived class to its single base class, to the single base class of that base class, etc. until reaching the object class. We are already familiar with that ordering/process from the first inheritance lecture.

In fact, Python has a function getattr defined in the builtins module (note, no underscores; it is not the dunder method __getattr__ defined in a class) that performs exactly this task. It takes up to 3 arguments: an object (required), the name of an attribute for that object (as a string; required), and the value to return if it cannot find that attribute for the object (optional). If the third argument is not supplied, Python will raise AttributeError if it cannot find that attribute for the object.
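The ideas above can be tried in a minimal sketch. It reuses the lecture's B1a/B1/B2/C classes but gives them a few class attributes (who and only_b2 are hypothetical names chosen just for this illustration), then inspects __bases__ and calls the builtin getattr with and without its optional third argument.

```python
# The lecture's network, with hypothetical class attributes added for illustration
class B1a:
    who = 'B1a'

class B1(B1a):
    pass

class B2:
    who = 'B2'
    only_b2 = 'found in B2'

class C(B1, B2):
    pass

c = C()

# __bases__ stores the arrows from a derived class to its base classes, in order
print([b.__name__ for b in C.__bases__])    # ['B1', 'B2']
print([b.__name__ for b in B1.__bases__])   # ['B1a']
print([b.__name__ for b in B2.__bases__])   # ['object']

# Search order is C, B1, B1a, B2, object: B1a defines who before B2 is reached
print(getattr(c, 'who'))                    # B1a
print(getattr(c, 'only_b2'))                # found in B2

# The optional third argument is returned when the attribute cannot be found
print(getattr(c, 'missing', None))          # None

# Without a third argument, a failed search raises AttributeError
try:
    getattr(c, 'missing')
except AttributeError as e:
    print('AttributeError:', e)

# The instance object itself is searched before any class
c.who = 'instance'
print(getattr(c, 'who'))                    # instance
```

Both B1a and B2 define who, yet the value comes from B1a: B2 is a right base class of C, so it is reached only after B1 and everything derived-from or left-of it has been searched.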

How Python Actually Searches for Attributes:

The order of classes that Python uses to search for an attribute name is actually computed when a class is defined (so the order doesn't need to be recomputed for each attribute access); it is stored in the class attribute __mro__; it is also retrievable by the parameterless method mro. So if C refers to a class, then C.__mro__/C.mro() is a tuple/list (they return different types of results) of classes, in the order they are searched for the attribute, starting with C itself. The term "mro" stands for "METHOD resolution order", but it applies to all ATTRIBUTES, not just methods (so a better name would be aro).

The code directly below shows how the mro is used when searching for an attribute. It is followed by code that illustrates how to compute the mro for each newly defined derived class (based on having already computed the mros for its base classes).

The pgetattr function (pseudo getattr) defined below shows how Python locates the value associated with any attribute of an_object. It locates the attribute in an_object itself, or searches its inheritance hierarchy starting with the class from which an_object is constructed. The inheritancetool.py module includes this code (and the algorithm below used in computing the __mro__ tuple).

Note here how using *default will bind default to (), the empty tuple, if no third argument is supplied. This is how Python can tell whether or not a third argument was supplied to this function.

  def pgetattr(an_object, attr, *default):
      # Try to locate attr in the object itself.
      # Otherwise try to locate it in the classes in the __mro__ tuple
      #   based on the type of an_object (in order), which starts with
      #   type(an_object).
      # Finally return default[0] (if a third argument, and no more, was
      #   specified) or raise AttributeError if it was not.
      # Note: type_as_str is a helper that is not shown in these notes.
      if attr in an_object.__dict__:
          return an_object.__dict__[attr]
      else:
          for c in type(an_object).__mro__:  # or call .mro(): uses order from mro
              if attr in c.__dict__:
                  return c.__dict__[attr]

      if len(default) == 1:                  # 3rd argument passed (no others)?
          return default[0]
      else:
          raise AttributeError("'"+type_as_str(an_object)+"' object has no attribute '"+attr+"'")

Python also defines a hasattr function in the builtins module, returning a boolean value telling whether or not Python can access the specified attribute in the specified object: whether getattr will find the attribute and return its value. We could define a pseudo version of this function similarly to pgetattr above.

  def phasattr(an_object, attr):
      # Try to locate attr in the object itself.
      # Otherwise try to locate it in the classes in the __mro__ tuple
      #   based on the type of an_object (in order), which starts with
      #   type(an_object).
      if attr in an_object.__dict__:
          return True
      else:
          for c in type(an_object).__mro__:
              if attr in c.__dict__:
                  return True

      return False

or we can write it more compactly (and just as efficiently) as

  def phasattr(an_object, attr):
      return (attr in an_object.__dict__) or \
             any(attr in c.__dict__ for c in type(an_object).__mro__)

The existence of the precomputed __mro__ class attribute simplifies these functions. We will discuss how to compute this class attribute below.

Finally, the actual getattr/hasattr functions are a bit more complicated than pgetattr/phasattr because they work with some Python features that we have not discussed. But the general outline is the same.

------------------------------------------------------------------------------

Computing the Method Resolution Order

Now, let us see the algorithm to compute the __mro__ attribute of a derived class, based on the __mro__ attributes of the base classes it is derived from.
The final order requires two properties (also stated above).

(1) Before searching a base class, all classes derived from it must be searched.

(2) Base classes must be searched in the order they appear in each derived class definition. The derived class specifies them in a left to right order, with a left base class searched before a right base class; this left-to-right search order applies to base classes of base classes, etc.; which means all non-common classes in the leftmost mro list will be searched before any non-common classes in the next (to the right) mro.

This algorithm is known as the "C3 linearization algorithm". It linearizes (puts into a tuple) the search order for a network of classes in an inheritance hierarchy. The name C3 refers to three consistency properties that the resulting order satisfies. Python did not always use this algorithm: the order in which classes in an inheritance network are searched has changed, because as programmers wrote code in the language, the notion of the "correct" order evolved.

If there is no order that satisfies all these requirements (see below for an example) then the algorithm will detect this fact and Python will raise a TypeError exception when the class itself is defined: the __mro__ attribute for the class is computed when the class is defined. The result of raising the TypeError is that the class will not be defined and cannot be used.

The following function computes the mro: I have included print statements (controlled by the debugging parameter) to help illustrate its computation.

  # bases is all the bases the new class is derived from
  def compute_mro(*bases, debugging=False):
      # constraints is a list of lists; each inner list specifies the constraints
      #   for a base class or the new class (last, specified by *bases)
      # mro is the final order for searching all the base classes
      # Note: type_as_str is a helper that is not shown in these notes
      constraints = [list(c.__mro__) for c in bases] + [list(bases)]  # order important
      mro         = []

      # While there are constraints that remain to be satisfied
      while constraints:
          if debugging: print('\nConstraints =',constraints)

          # Find the first candidate (the first class in some inner
          #   constraint-list) that does not appear in any inner constraint-list
          #   except as its first element
          # If none is found, raise TypeError (note use of else: in for loop)
          for const in constraints:
              candidate = const[0]
              if debugging: print('Trying candidate:',candidate)
              if not any([candidate in later[1:] for later in constraints]):
                  if debugging: print('Selected candidate:',candidate)
                  break
          else: # for finished without breaking; no candidate is possible!
              raise TypeError('Cannot create a consistent method resolution order (MRO) for bases ' +
                              ', '.join(type_as_str(b) for b in bases))

          # That candidate is next in the mro for the current class
          mro.append(candidate)

          # Remove candidate from being the first in any inner constraint-list
          for const in constraints:
              if const[0] == candidate:
                  if debugging: print('Removing candidate from:', const)
                  del const[0]

          # If any inner constraint-list has been reduced to [], remove it
          constraints = [c for c in constraints if c != []]

      return tuple(mro)

Given the following classes, used at the start of this lecture note

  class B1a     : pass  # __mro__ is (B1a,object)
  class B1(B1a) : pass  # __mro__ is (B1,B1a,object)
  class B2      : pass  # __mro__ is (B2,object)

We can compute the mro for

  class C(B1,B2): pass

by calling

  print('\nmro =',compute_mro(B1,B2,debugging=True))

which produces the following results; match them with the English description of the search order given earlier in this lecture note.

  Constraints = [[<class '__main__.B1'>, <class '__main__.B1a'>, <class 'object'>], [<class '__main__.B2'>, <class 'object'>], [<class '__main__.B1'>, <class '__main__.B2'>]]
  Trying candidate: <class '__main__.B1'>
  Selected candidate: <class '__main__.B1'>
  Removing candidate from: [<class '__main__.B1'>, <class '__main__.B1a'>, <class 'object'>]
  Removing candidate from: [<class '__main__.B1'>, <class '__main__.B2'>]

  Constraints = [[<class '__main__.B1a'>, <class 'object'>], [<class '__main__.B2'>, <class 'object'>], [<class '__main__.B2'>]]
  Trying candidate: <class '__main__.B1a'>
  Selected candidate: <class '__main__.B1a'>
  Removing candidate from: [<class '__main__.B1a'>, <class 'object'>]

  Constraints = [[<class 'object'>], [<class '__main__.B2'>, <class 'object'>], [<class '__main__.B2'>]]
  Trying candidate: <class 'object'>
  Trying candidate: <class '__main__.B2'>
  Selected candidate: <class '__main__.B2'>
  Removing candidate from: [<class '__main__.B2'>, <class 'object'>]
  Removing candidate from: [<class '__main__.B2'>]

  Constraints = [[<class 'object'>], [<class 'object'>]]
  Trying candidate: <class 'object'>
  Selected candidate: <class 'object'>
  Removing candidate from: [<class 'object'>]
  Removing candidate from: [<class 'object'>]

  mro = (<class '__main__.B1'>, <class '__main__.B1a'>, <class '__main__.B2'>, <class 'object'>)

Note that we must prepend the C class itself to the result computed: The actual mro for C is

  mro = (<class '__main__.C'>, <class '__main__.B1'>, <class '__main__.B1a'>, <class '__main__.B2'>, <class 'object'>)

As a final comment, B1a must be searched before B2 because it is a non-common class in the mro of B1, whose mro will be examined before the mro of B2. The original order of the big constraints list captures this information, with the mro of B1 appearing in the list before the mro of B2.

Now, look at the following example of a class that CANNOT meet the requirements of having a C3 linearizable mro:

  class A     : pass   # __mro__ is (A, object)
  class B     : pass   # __mro__ is (B, object)
  class C(A,B): pass   # __mro__ is (C, A, B, object) :note A before B
  class D(B,A): pass   # __mro__ is (D, B, A, object) :note B before A
  class E(C,D): pass   # __mro__ is not possible

Note here that for derived class E, the rules for its base class C require searching A before B, but the rules for its base class D require searching B before A, which are incompatible.

So, given the definitions of A, B, C, and D, we can try to compute the mro for

  class E(C,D): pass   # __mro__ is not possible

by calling

  print('\nmro =',compute_mro(C,D,debugging=True))

which produces the following results.

  Constraints = [[<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>], [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.A'>, <class 'object'>], [<class '__main__.C'>, <class '__main__.D'>]]
  Trying candidate: <class '__main__.C'>
  Selected candidate: <class '__main__.C'>
  Removing candidate from: [<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>]
  Removing candidate from: [<class '__main__.C'>, <class '__main__.D'>]

  Constraints = [[<class '__main__.A'>, <class '__main__.B'>, <class 'object'>], [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.A'>, <class 'object'>], [<class '__main__.D'>]]
  Trying candidate: <class '__main__.A'>
  Trying candidate: <class '__main__.D'>
  Selected candidate: <class '__main__.D'>
  Removing candidate from: [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.A'>, <class 'object'>]
  Removing candidate from: [<class '__main__.D'>]

  Constraints = [[<class '__main__.A'>, <class '__main__.B'>, <class 'object'>], [<class '__main__.B'>, <class '__main__.A'>, <class 'object'>]]
  Trying candidate: <class '__main__.A'>
  Trying candidate: <class '__main__.B'>
  Traceback (most recent call last):
    File "C:\Users\Pattis\workspace\inheritance\inheritancetool.py", line 85, in <module>
      print('\nmro =',compute_mro(C,D,debugging=True))
    File "C:\Users\Pattis\workspace\inheritance\inheritancetool.py", line 68, in compute_mro
      raise TypeError('Cannot create a consistent method resolution order (MRO) for bases ' + ', '.join(type_as_str(b) for b in bases))
  TypeError: Cannot create a consistent method resolution order (MRO) for bases __main__.C, __main__.D

In such a case, the class raising the exception will not be defined: it cannot be defined because it can have no legal search order for its attributes, according to the constraints of the C3 linearization algorithm.

Here is a recursive version of the compute_mro function (with only one debugging print statement). It defines and calls the recursive helper function merge. Note the use of a for loop with an else; the else is executed if no good candidate is found (if no break is executed). Various recognizable parts of the iterative algorithm appear here too.

  def compute_mro_r(*bases, debugging=False):
      def merge(constraints):
          if debugging: print(constraints)
          if constraints == []:
              return ()    # empty tuple, so it can be concatenated below
          else:
              # Find the first candidate (the first class in some inner
              #   constraint-list) that does not appear in any inner
              #   constraint-list except as its first element
              for const in constraints:
                  candidate = const[0]
                  if not any([candidate in later[1:] for later in constraints]):
                      break
              else: # for finished without breaking; no candidate is possible!
                  raise TypeError('Cannot create a consistent method resolution order (MRO) for bases ' +
                                  ', '.join(type_as_str(b) for b in bases))

              # If found, concatenate it at the front of the mro that solves
              #   merging the remaining constraints (with it removed; a smaller
              #   problem)
              return (candidate,) + \
                     merge([const[1:] if const[0] == candidate else const
                            for const in constraints if const != [candidate]])

      return merge([list(c.__mro__) for c in bases] + [list(bases)])

Going back to the definitions

  class B1a     : pass  # __mro__ is (B1a,object)
  class B1(B1a) : pass  # __mro__ is (B1,B1a,object)
  class B2      : pass  # __mro__ is (B2,object)

We can compute the mro for

  class C(B1,B2): pass

by computing compute_mro_r(B1,B2), which we can visualize as (dropping the '__main__.' from each class name)

    merge([[B1,B1a,object], [B2,object], [B1,B2]])
  = (B1,) + merge([[B1a,object], [B2,object], [B2]])
  = (B1,) + (B1a,) + merge([[object], [B2,object], [B2]])
  = (B1,) + (B1a,) + (B2,) + merge([[object], [object]])
  = (B1,) + (B1a,) + (B2,) + (object,) + ()
  = (B1,B1a,B2,object)

The actual mro for C is (C, B1, B1a, B2, object), including the class C itself.

Finally, __mro__ is a read-only attribute; after Python computes it, its value cannot be changed (recall all the __setattr__ variants we wrote that restricted updating attributes).

In the next lecture we will see many examples of single-inheritance and multiple-inheritance. As of the end of this lecture, you should be able to understand how Python treats all inheritance hierarchies in terms of locating attributes.

------------------------------------------------------------------------------

The true meaning of the isinstance function (given inheritance)

Python's boolean function isinstance has two parameters: the first should be bound to an instance object; the second to a class object.
Before inheritance, we understood the call isinstance(o,C) to return True if instance object o was constructed from class object C: we could restate this computation using the is operator as equivalent to: type(o) is C.

But now that we know about inheritance we can clarify the meaning of the isinstance function: it is a bit more general. The isinstance(o,C) function call still returns True if type(o) is C, but it also returns True if type(o) is any class derived from C (meaning type(o) is derived from class C in one or more steps).

Stated another way, if isinstance(o,C) is True, it means that when searching for attributes in o, eventually class C will be searched (if the attribute is not found in an earlier class searched, one that overrides it), by the Fundamental Equation of Object-Oriented Programming generalized in this lecture.

So, while type(o) is C truly asks whether o is an instance object constructed from class C, isinstance(o,C) is asking whether C can eventually be searched when trying to find attributes in o. In the following picture isinstance(o,C) is True.

                    C
                    ^
                    |
  .....other Base Classes of type(o).....
                    ^
                    |
        type(o): actual class of o

Given what we learned above, we can easily implement the operation of calling isinstance(o,C) in Python as

  C in type(o).__mro__

since type(o).__mro__ contains o's actual class and all the base classes its class is derived from.

Note that isinstance(o,object) always returns True, because object is always searched in an inheritance hierarchy; it is always the last class in __mro__.

A very interesting fact is that if isinstance(o,C) is True and if a is an attribute defined in class C, then writing o.a will always find a binding for the attribute a (and never raise AttributeError): the binding will either come from C itself (if the search goes all the way up to class C) or from some class derived from C that redefines/overrides this attribute. So, o.a will always be legal if o is constructed from a class derived from C.
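The equivalence described above can be checked with a small sketch. The name pisinstance is hypothetical (chosen in the spirit of pgetattr), and the sketch covers only the single-class case discussed here; the real isinstance handles further cases that this ignores.

```python
# The lecture's network of classes
class B1a: pass
class B1(B1a): pass
class B2: pass
class C(B1, B2): pass

def pisinstance(o, a_class):
    # True if a_class will eventually be searched when looking up attributes in o
    return a_class in type(o).__mro__

c = C()

# pisinstance agrees with the builtin isinstance for each of these classes
for a_class in (C, B1, B1a, B2, object, int):
    assert pisinstance(c, a_class) == isinstance(c, a_class)
    print(a_class.__name__, pisinstance(c, a_class))

# type(c) is B1 asks about the exact class; isinstance also accepts base classes
print(type(c) is B1, isinstance(c, B1))   # False True

# object is always the last class in __mro__
print(C.__mro__[-1] is object)            # True
```

In each case the answer depends only on whether the class appears in type(o).__mro__, so any attribute defined in such a class is reachable from o.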
Thus, any method callable on an object from class C will be callable on o. So o is from a class that is C-like. For example, if mc is a Modular_Counter, then we can call mc.m(...) for any m defined in Counter. We said that a derived class represents a special kind of base class, so it should make sense to call any methods specified in the base class on objects constructed from the derived class. The derived class can override an inherited method, but one or the other method will be called: for inc it calls the overriding method in Modular_Counter; for reset it calls the method inherited from Counter.

------------------------------------------------------------------------------

Exceptions and Inheritance

All exceptions are arranged in an inheritance hierarchy. If we write "except Foo" in a try/except, it handles a raised exception that is from the Foo class or any class derived from Foo. Technically, if x refers to a raised exception object, "except Foo" handles x whenever isinstance(x,Foo) is True (what this means is x is an object constructed from the Foo class or any class derived from Foo; remember derived means beneath it in the inheritance hierarchy).

So we can create a hierarchy of exception classes, some more general (higher up in the hierarchy, matching more general -many possible- exception classes) and some more specific (lower in the hierarchy).

In a try/except statement with multiple except parts, the excepts are checked sequentially: the first one matching (by isinstance) executes its block and no other ones are tried. So, when we write a try/except with many except parts, we must order them in such a way that they are handled correctly (often meaning specific exceptions before general ones).

For example, there might be a FileError exception for general file errors. We might define

  class FileError  (Exception) : ...
  class InputError (FileError) : ...
  class OutputError(FileError) : ...
  class EOFError   (InputError): ...

which would have the following hierarchy (Exception is derived from BaseException which is derived from object).

               object
                 ^
                 |
           BaseException
                 ^
                 |
             Exception
                 ^
                 |
             FileError
              ^     ^
             /       \
     InputError   OutputError
         ^
         |
      EOFError

Other exceptions that we have seen, like AssertionError and TypeError, are derived from the base class Exception. The FileError hierarchy above is NOT the actual way it is done in Python, but serves as a good/simple illustrative example for the discussion below. You can write something like print(EOFError.__mro__) to see the actual derivation of EOFError.

Python checks whether the exception object raised is an instance of the exception named in the except clause. So, for except EOFError, Python checks whether isinstance(raised_exception_object, EOFError) is True.

So, if we wrote

  except EOFError  :...

it handles only EOFError because EOFError is not the base class of any derived class shown above

or

  except InputError:...

it handles both InputError and EOFError because InputError is the base class of EOFError. So if eofo is an object from the EOFError class, and it is raised, isinstance(eofo,EOFError) is True and isinstance(eofo,InputError) is also True, so a raised exception of either class is handled.

or

  except FileError :...

it handles FileError, InputError, OutputError, and EOFError because, for example, if eofo is an object from the EOFError class, and it is raised, isinstance(eofo,EOFError), isinstance(eofo,InputError), and isinstance(eofo,FileError) all return True; likewise, if ieo is an object from the InputError class, and it is raised, isinstance(ieo,InputError) and isinstance(ieo,FileError) both return True.

The except clauses are checked SEQUENTIALLY FROM TOP TO BOTTOM. The first one for which the isinstance function returns True specifies how that exception is handled (later ones are not tried). So, if we wrote

  except EOFError  :... handle an EOFError one way
  except FileError :... handle a FileError another way

it would handle EOFError in one way, and all other FileErrors in another way.
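The sequential matching described above can be observed in a short sketch. It defines the lecture's illustrative FileError hierarchy (which, as noted, is not how Python's own exceptions are organized) plus a hypothetical handle function, introduced only for this illustration, that raises a given exception object and reports which except clause caught it.

```python
# The lecture's illustrative hierarchy (not Python's real exception hierarchy)
class FileError(Exception): pass
class InputError(FileError): pass
class OutputError(FileError): pass
class EOFError(InputError): pass    # shadows the builtin EOFError in this module

def handle(exception_object):
    # Raise the object, then report which except clause matched it first
    try:
        raise exception_object
    except EOFError:                # specific clause first
        return 'EOFError clause'
    except FileError:               # general clause second
        return 'FileError clause'

for exc in (EOFError(), InputError(), OutputError(), FileError()):
    print(type(exc).__name__, '->', handle(exc))
# EOFError -> EOFError clause
# InputError -> FileError clause
# OutputError -> FileError clause
# FileError -> FileError clause

# The mro of the illustrative EOFError lists every class whose except clause matches it
print([c.__name__ for c in EOFError.__mro__])
# ['EOFError', 'InputError', 'FileError', 'Exception', 'BaseException', 'object']
```

Only the EOFError object reaches the first clause; every other exception in the hierarchy fails that test and is caught by the more general FileError clause.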

In fact, we can write a single class or a tuple of classes after the except keyword. For a single class Python checks whether or not the exception is an instance of that class; for a tuple it checks whether the exception is an instance of any class in the tuple. So

  except (InputError,EOFError): ...handle either InputError or EOFError one way
  except OutputError          : ...handle OutputError another way

would handle InputError and EOFError in one way, and OutputError another way. In fact, we could simplify this to just

  except InputError : ...handle either InputError or EOFError one way
  except OutputError: ...handle OutputError another way

because any EOFError is also an instance of InputError (see the hierarchy).

WARNING: The one thing that programmers need to watch out for is writing the following (which just reverses the two except clauses used in an example above).

  except FileError :...
  except EOFError  :...

Here the except EOFError clause will NEVER be tried, because EOFError is derived from FileError, and therefore will be processed by the code in the FileError except clause.

Generally, to process exceptions correctly, they should be listed from most SPECIFIC to most GENERAL (and EOFError is more specific than FileError: it is derived from FileError).

The programming language Java uses the same general mechanisms for inheritance and exception handling, but in addition it would issue an error if a more general exception appeared before a more specific one. Java, like Python, knows the inheritance hierarchy of exceptions, and it actually checks for correct usage before running a Java program. This is another form of "static checking" that we will discuss further during the last week of class.

------------------------------------------------------------------------------

1) Given the following class definitions, draw the inheritance network and indicate in what order the class objects are searched for attributes.

  class A          : pass
  class B          : pass
  class C(A,object): pass
  class D          : pass
  class E(D,B)     : pass
  class F(B,C)     : pass
  class G(E,F)     : pass

2) Rewrite the pgetattr function to return a list (empty, one value, or multiple values) that contains all the values found in the inheritance hierarchy of the given attribute name.

3) For what second argument will isinstance always return True, no matter what the first argument?

4) We saw that writing

  except EOFError  :...way1
  except FileError :...way2

handles EOFError in way1 and FileError in way2. Explain why reversing the order

  except FileError :...way2
  except EOFError  :...way1

would cause both FileError and EOFError to be handled in way2.