Introspection (Function Dispatch and Stack Crawling)

Introspection is the ability of running code to examine the objects that it defines (e.g., functions, classes, and data) and its behavior (e.g., which function calls have led Python to the currently executing function). Many introspection features appear in the "inspect" module, which was briefly introduced in the Generator Functions lecture; there we discussed only the isfunction, isgeneratorfunction, and isgenerator predicates. In this lecture we will examine the inspect module in more detail and learn how to use some of its features in real/interesting applications.

Read this lecture not to memorize everything, but instead to become aware of the kinds of information introspection can produce, focusing on how this information is used in principle and in the sample code shown below.

------------------------------------------------------------------------------

The following are more "is" predicates that are callable on objects to determine what they are. These are the most useful functions, but not a complete list.

  ismodule : is object argument a module?
  isclass  : is object argument a class?
  ismethod : is object argument a bound method?
  iscode   : is object argument code (see attributes like __code__ below)?
  isbuiltin: is object argument a builtin function (e.g., len)?
  isroutine: is object either a function or method (from the word "subROUTINE")?

The following "get" functions are also available, to retrieve information about the source code for objects: getdoc, getcomments, getfile, getmodule, getsourcefile, getsourcelines, and getsource.

All these functions are described in detail in

  https://docs.python.org/3/library/inspect.html

Below, we will discuss one large section in this document: "Introspecting Callables with the Signature object". With a Signature object, we can examine parameter information (names, "kinds", annotations, etc.)
in a function object and also examine information about the bindings to these parameters in a function call.

------------------------------------------------------------------------------

Simple and Useful Attributes (some dunder, some not)

Here are some of the attributes for various types of Python objects. I'll list those attributes that are easiest to understand and/or most useful (in some of the code in this lecture). For the complete list of attributes, see:

  https://docs.python.org/3/library/inspect.html

It is fairly easy to write small Python programs to experiment with these attributes. See the experiment_functions.py module in the project folder accompanying this lecture. It contains some functions that I wrote to report attributes easily; it also defines some small functions and reports their attributes (some of these examples are shown below).

Let's start with function objects. Suppose that we define the simple function is_even

  def is_even(x : int = 0) -> bool:
      """x is even if its remainder is 0, when divided by two"""
      return x%2 == 0

Here are some of its attributes (sorted alphabetically) and their values.

function is_even:
  __annotations__ : {'x': <class 'int'>, 'return': <class 'bool'>}
  __closure__     : None
  __code__        : <code object is_even at 0x..., file "...", line ...>
  __defaults__    : (0,)
  __doc__         : x is even if its remainder is 0, when divided by two
  __globals__     : {...a large dictionary...}
  __kwdefaults__  : None
  __module__      : __main__
  __name__        : is_even
  __qualname__    : is_even

Here is the commentary on these attributes (in the same order).

  __annotations__ : a dict whose iteration order is the order of the parameters: keys are parameter names and associated values are their annotations; the return-type annotation appears last, using the name 'return' in this dict (we cannot use this name for a parameter in Python, because it is a reserved word). Also see the return_annotation attribute in the discussion of signatures below.

  __closure__ : It is not None only for functions that are defined inside other functions and refer to names in the enclosing function: see this attribute for the not_f function below.
  __code__ : Code written in Python is translated into code objects that comprise sequences of instructions defined for the Python Virtual Machine (PVM). We'll talk more about code objects later in this lecture; during week 10 I will also lecture on the form of PVM instructions and how the PVM executes code objects. It will be a 1-day version of ICS-51 so the coverage is very wide but very shallow. The text in this attribute shows the path to the file of the module defining the function, and on what line in the module the function is defined.

  __defaults__ : It is either None (if no positional parameters have default values) or a tuple of length N, indicating the defaults of the LAST N positional parameters. Once one positional parameter specifies a default value, all the subsequent positional parameters must too.

  __doc__ : If a function has a docstring, it is bound to this attribute; otherwise this attribute is None.

  __globals__ : Global names defined in the module, usable by the function. This dictionary is typically large, so I elided it above.

  __kwdefaults__ : If the function specifies keyword-only parameters (those appearing after * or *name) with default values, this is a dict mapping each such parameter name to its default value; it is None if no keyword-only parameter specifies a default.

  __module__ : The module a function is defined in: typically __main__ (if defined in the module executed) or the name foo (if the function was defined in foo.py).

  __name__ : Simple name of the function (as it is defined).

  __qualname__ : Qualified name of the function: different from __name__ if the function was defined inside another scope; see this attribute for the not_f function defined in the scope of another function.

Instead of showing a function with a complicated parameter structure (you are welcome to experiment in the experiment_functions.py module), let's instead define another simple function, but one that defines and returns another function (so, not so simple).
  def opposite_of(f : callable) -> callable:
      def not_f(x):
          return not f(x)
      return not_f

Here are some of its attributes (sorted alphabetically) and their values.

function opposite_of:
  __annotations__ : {'f': <built-in function callable>, 'return': <built-in function callable>}
  __closure__     : None
  __code__        : <code object opposite_of at 0x..., file "...", line ...>
  __defaults__    : None
  __doc__         : None
  __globals__     : {...a large dictionary...}
  __kwdefaults__  : None
  __module__      : __main__
  __name__        : opposite_of
  __qualname__    : opposite_of

Note that by writing

  odd = opposite_of(is_even)

we have the name odd bound to the not_f function defined in (and returned by) this call of opposite_of. The attributes of this inner function (named by odd) are

function not_f:
  __annotations__ : {}
  __closure__     : (<cell at 0x...: function object at 0x...>,)
  __code__        : <code object not_f at 0x..., file "...", line ...>
  __defaults__    : None
  __doc__         : None
  __globals__     : {...a large dictionary...}
  __kwdefaults__  : None
  __module__      : __main__
  __name__        : not_f
  __qualname__    : opposite_of.<locals>.not_f

Note particularly for odd (bound to the returned not_f function)

  (a) the __closure__ (don't worry too much about the details) indicates a function object (f, the argument of opposite_of in the scope enclosing the definition of not_f). We previously described closures as dicts, but in fact they are tuples of something called a "cell"; since we won't study cells, thinking about closures as a dict is still a useful idea.

  (b) the __qualname__ is basically (don't worry too much about the details) opposite_of.not_f: a not_f function that is defined inside the opposite_of function; Python defines/creates a new function object, bound to not_f, every time that the opposite_of function is called.

------------------------------------------------------------------------------

Signatures:

Now, let's switch to a discussion of how to inspect the details of a signature (its parameters and return type) for a function object. The inspect module has a function named "signature" that returns an object from the inspect.Signature class.
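As a quick preview of the Signature features described in the rest of this section, here is a minimal sketch that uses only the standard inspect module. It reuses the is_even function from above and the function f(a,b,*c,d=10) that is used later in this section; the loop variable names are just illustrative choices.

```python
import inspect

def is_even(x : int = 0) -> bool:
    """x is even if its remainder is 0, when divided by two"""
    return x%2 == 0

sig = inspect.signature(is_even)
print(sig)                     # (x: int = 0) -> bool
print(sig.return_annotation)   # <class 'bool'>

# Each value in sig.parameters is a Parameter object
for name, p in sig.parameters.items():
    print(name, p.kind.name, p.default, p.annotation)

def f(a,b,*c,d=10): pass       # annotations omitted for simplicity

# bind matches arguments to parameters without calling f
ba = inspect.signature(f).bind(1,2,3,4,5,6)
print(dict(ba.arguments))      # {'a': 1, 'b': 2, 'c': (3, 4, 5, 6)}

ba.apply_defaults()            # fills in d, which had no matching argument
print(dict(ba.arguments))      # {'a': 1, 'b': 2, 'c': (3, 4, 5, 6), 'd': 10}
```

The loop prints one line for is_even's only parameter, showing its name, kind, default value, and annotation.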
Suppose that we write

  sig = inspect.signature(f) # f should be bound to some function/method object

What can we do with the resulting sig object (which is of class Signature)?

  (a) str(sig) : the str version of the signature

  (b) inspect._empty : a special class used as the annotation value for any parameter that is not annotated (and as the default value for any parameter with no default); it can also be specified as inspect.Signature.empty or inspect.Parameter.empty

  (c) sig.return_annotation : annotation for the return type (or inspect._empty, if there is no annotation for the return type)

  (d) sig.parameters : an ordered, read-only mapping whose keys are parameter names (it is iterable in the same order that they appear in the function definition), each associated with a Parameter object.

Assuming ps = inspect.signature(f).parameters, what can we do with ps, assuming the function f has a parameter named x?

  (1) ps['x'].name        'x'
  (2) ps['x'].default     default value or inspect._empty if no default value
  (3) ps['x'].annotation  annotation or inspect._empty if no annotation
  (4) ps['x'].kind        one of
        POSITIONAL_ONLY        if it appears in signature BEFORE special /
        POSITIONAL_OR_KEYWORD  standard Python parameter
        VAR_POSITIONAL         if it appears in signature as *name
        KEYWORD_ONLY           if it appears in signature AFTER * or *name
        VAR_KEYWORD            if it appears in signature as **name

With this information about a function, Python can determine the bindings of arguments to parameters, given any function call (we discussed these powerful rules in the first week's lecture).

  (e) sig.bind(*args,**kargs): returns an object from the inspect.BoundArguments class; it raises TypeError if the arguments do not match the signature. Here is what we can do with this object.
  def f(a,b,*c,d=10): pass  # for simplicity, I have omitted annotations in f

  ba = inspect.signature(f).bind(1,2,3,4,5,6)
    Binds the arguments to the parameters (a is bound to 1; b is bound to 2; and c is bound to (3,4,5,6))

  ba.arguments
    Returns OrderedDict([('a', 1), ('b', 2), ('c', (3, 4, 5, 6))]) (a plain dict with the same contents in Python 3.9 and later)
    The parameters with no matching arguments, but default values, don't appear here yet: see ba.apply_defaults() below.

  ba.args
    Returns (1, 2, 3, 4, 5, 6) - tuple of all positional arguments

  ba.kwargs
    Returns {} - since no argument is bound to a keyword-only or ** parameter

  ba.signature
    Returns <Signature (a, b, *c, d=10)> - Signature object (has no annotations)

  ba.apply_defaults()
  ba.arguments
    The function call returns None but updates the ba object, so ba.arguments now returns
    OrderedDict([('a', 1), ('b', 2), ('c', (3, 4, 5, 6)), ('d', 10)])

Finally, try running the experiment_functions.py module that I supplied. Experiment by writing your own functions and printing their attributes using the functions that I supplied.

------------------------------------------------------------------------------

Dispatching Function Calls via their Signature: A Big Example

Suppose that we want to define a function whose parameters are either 2 integers or 2 strings: for integers it computes their sum; for strings it computes the sum of their lengths; for any other types it raises TypeError (e.g., it is an error to pass one int and one str). We could write this as a single function in Python that "decodes" the types of its arguments.

  def sum(a,b) -> int:
      if type(a) is int and type(b) is int:
          return a+b
      elif type(a) is str and type(b) is str:
          return len(a) + len(b)
      else:
          raise TypeError(f'sum arguments wrong types: {type(a)} and {type(b)}')

A "statically-typed" programming language (e.g., C++ and Java) requires that each function definition must specify the type of argument allowed to match each parameter. The sequence of types is called the "signature" of the function.
The same function name can be defined multiple times, but each definition must specify a different signature. In such languages, we are said to "overload" the meaning of the function name based on the different types of arguments it can be called on. For each function call, the language checks whether the argument types match the signature of exactly one function with that name; if so, that function is called with the arguments; otherwise a mismatch is detected before the program is run (when it is compiled/translated).

In a statically-typed language, we could overload the sum function for two int and two str parameters by writing something like* the following in Python.

  def sum(a : int, b : int) -> int:
      return a+b

  def sum(a : str, b : str) -> int:
      return len(a) + len(b)

*(Of course, if we actually wrote this code in Python, the second definition would rebind the name sum, so no name would be bound to the first function object; we can only bind sum to one function object at a time; we will see how to overcome this problem below, with decorators. Languages like C++ and Java allow overloaded function declarations like the ones above, keeping them separate. So, we are exploring how to program Python to mimic these languages.)

In a statically-typed language, if we wrote sum(1,1) it would know to call the first version of sum (returning 2), and if we wrote sum('abc','xyz') the second (returning 6). If we wrote sum(1,'xyz') it would also know that no signatures for sum match.

The process of determining which function to execute, based on the function name and the types of its arguments, is called "dispatching". In actual statically-typed languages, the dispatching decision of "which sum function" to call is made once, at compile/translation time, not each time the function is called, at run time. We will explore this topic more during the 10th week of the quarter, showing advantages and disadvantages of statically-typed programming languages.
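To see the rebinding problem concretely, here is a minimal sketch in plain Python. The name add is a hypothetical stand-in for sum (chosen only so that the sketch does not hide Python's builtin sum); nothing here dispatches, which is exactly the point.

```python
def add(a : int, b : int) -> int:
    return a+b

def add(a : str, b : str) -> int:   # rebinds add: the first function object is now unreachable
    return len(a) + len(b)

print(add('abc','xyz'))             # 6

# Annotations are not checked, so the str version runs even for int arguments
try:
    add(1,1)
except TypeError as err:
    print('TypeError:', err)        # raised by len(1) inside the str version
```

Python never consults the annotations when add is called: the one remaining definition is executed for every call, whatever the argument types are.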
In this section of the notes, we will examine the classes DispatchableFunction and DispatchBySignature. By using the DispatchBySignature.register decorator, we will be able to define these two overloaded versions of the sum function (remembering both and calling each when appropriate) in Python as follows.

  @DispatchBySignature.register
  def sum(a : int, b : int) -> int:
      return a+b

  @DispatchBySignature.register
  def sum(a : str, b : str) -> int:
      return len(a) + len(b)

Now, if we write sum(1,1) Python will call the first function and return 2; if we write sum('abc','xyz') Python calls the second function and returns 6. In each case, Python determines which of the overloaded sum functions should actually be called (based on the types of the supplied arguments). Finally, if we write sum(1,'xyz') Python would recognize that these arguments match none of the overloaded functions, so it would raise a TypeError exception.

Basically, the DispatchBySignature class below defines a static decorator function named register. Calling register takes each function and uses its name and signature (via a 2-tuple) as a key in a dictionary: each key is associated with the function object for its name and signature. sum is rebound (twice) to a DispatchableFunction object. Later, when this object is called (via sum(...)), the __call__ in DispatchableFunction uses the function name and argument types to find the matching name and signature in the dictionary built in DispatchBySignature, and then calls the function in the dictionary associated with that name and signature.

Examine the DispatchSignatureClass.py module in the code downloadable for this lecture. It does its job using the .__name__ attribute of a function, inspect.signature, and the .parameters attribute of the resulting signature. The two classes it defines (DispatchableFunction and DispatchBySignature) work together to dispatch function calls by their argument types.
The actual module is extensively commented, but in the discussion below I show only the raw code, with almost no comments, but with English explanations. It would be useful to view both code sources simultaneously.

The entire DispatchableFunction class (a decorator for functions) appears below. We will soon see how the register method in the DispatchBySignature class both:

  (a) constructs a DispatchableFunction object, supplying its __init__ with both a function object (raw_function) and the class to use for dispatching its calls (dispatch_class), based on the signature of the call's arguments

  (b) remembers the function by its name and signature (parameter types specified by its annotations), so it can dispatch function calls with this name, when their argument types match one of its signatures

The __call__ function in DispatchableFunction uses the dispatch_class (supplied to __init__) to find and then call the appropriate function object: the one with the correct signature.

  class DispatchableFunction:
      def __init__(self, raw_function, dispatch_class):
          self.raw_function   = raw_function   #decoration information for function
          self.dispatch_class = dispatch_class

      def __call__(self, *args, **kargs):
          df = self.dispatch_class.dispatch(self, *args, **kargs) # find it
          return df.raw_function(*args, **kargs)                  # call it

Now we will examine the more complicated DispatchBySignature class, in parts.

  (a) Why no objects of DispatchBySignature are created
  (b) How signatures are stored as keys in a dictionary
  (c) How functions are registered in the class by the key of their signatures
  (d) How this class finds the correct function to call/dispatch by signatures

(a) Why no objects of DispatchBySignature are created

The DispatchBySignature class defines the attribute function_map, which is used for storing all the overloaded functions (one of which is dispatched for each function call, based on the function name and its signature: argument types). It also defines 3 static methods.
These static methods can access the function_map directly, so no instances of this class need to be constructed. In fact, if an object is constructed, the __init__ method immediately raises an exception. So, this class stores its data and 3 static methods, which are accessed by prefacing them with the class name, like DispatchBySignature.function_map (the code below has more examples).

  class DispatchBySignature:
      function_map = {}  # Updated by register method (used as a decorator)

      def __init__(self):
          raise Exception('DispatchBySignature.__init__: Attempted to construct an object from this class')

This class mechanism works beautifully in Python because attributes can be bound to classes: a class is just a kind of object. In other languages (e.g., C++ and Java), this approach will not work; in such languages, we must use a Singleton Class pattern. You can see how to implement these two classes similarly in Python, using a Singleton Class pattern, in the module DispatchSignatureWithSingletonClass.py.

(b) How signatures are stored as keys in a dictionary

The main purpose of the DispatchBySignature class is to store the function_map dict, which will contain all the overloaded functions that are registered. Their dispatch is also controlled by this class and its dict.

The helper "key" function determines the keys for the function_map dict by using signatures of the functions it stores. There are 2 different cases for how the key of a raw_function is computed:

  (1) If the function is passed WITHOUT ARGUMENTS, it is being registered. We use its SIGNATURE'S ANNOTATIONS to compute the types of the parameters in the key (using the type "object" for any un-annotated parameter). See (c) below for details.

  (2) If the function is passed WITH ARGUMENTS, it is being called. We use the TYPES OF ALL THE ARGUMENTS to compute the key. See (d) below for details.

Actually, the full key is a tuple of the function's name, followed by a tuple of its signature/arguments types.
For example, the key for the int version of sum would appear as ('sum', (<class 'int'>, <class 'int'>)) and the key for the str version would appear as ('sum', (<class 'str'>, <class 'str'>)).

      @staticmethod
      def key(raw_function, args=None):
          if args is not None:
              types = tuple([type(a) for a in args])
          else:
              signature = inspect.signature(raw_function).parameters
              types = tuple([(spec.annotation if spec.annotation != inspect._empty else object) for _,spec in signature.items()])
          return tuple([raw_function.__name__, types])

(c) How functions are registered in the class by the types in their signatures

Recall that we use DispatchBySignature.register as a decorator for any function definition that we want to dispatch by its signature. This method constructs a new DispatchableFunction object, then computes its key using its signature (see b); it ensures that this same signature has not already been registered: we cannot have two functions with the same name and signature. Then, in function_map it associates the key with the DispatchableFunction object; and finally it returns the DispatchableFunction (which the decorator rebinds to the name of the function being defined). So in the future, any call of a function with that name will execute the __call__ in the DispatchableFunction, which will call dispatch in part (d) below.

      @staticmethod
      def register(raw_function):
          df = DispatchableFunction(raw_function, DispatchBySignature)
          sig_key = DispatchBySignature.key(raw_function, args=None)
          if sig_key in DispatchBySignature.function_map:
              raise TypeError('DispatchBySignature.register: attempt to re-register a signature =',sig_key)
          DispatchBySignature.function_map[sig_key] = df
          return df

Let's examine this process in detail, when registering two overloaded functions (overloading the same name).
After writing

  @DispatchBySignature.register
  def sum(a : int, b : int) -> int:
      return a+b

the name sum is now bound to a DispatchableFunction object, which refers to the DispatchBySignature class, whose function_map attribute now stores the key

  ('sum', (<class 'int'>, <class 'int'>))  associated to this DispatchableFunction.

Next, after writing

  @DispatchBySignature.register
  def sum(a : str, b : str) -> int:
      return len(a) + len(b)

the name sum is now bound to a new DispatchableFunction object, which refers to the DispatchBySignature class, whose function_map attribute now stores two keys

  ('sum', (<class 'int'>, <class 'int'>))  associated to the 1st DispatchableFunction
  ('sum', (<class 'str'>, <class 'str'>))  associated to the 2nd DispatchableFunction

Of course, we can decorate more definitions of sum, as long as they each specify a different signature.

Note that regardless of which DispatchableFunction object sum is ultimately bound to, when we call sum(...) the __call__ for sum will look at the function_map attribute in the DispatchBySignature class to determine which of the registered functions with that name to call: the one with the matching signature.

(d) How this class finds the correct function to call/dispatch by signatures

When the DispatchableFunction object bound to sum is called (see __call__) it calls the dispatch method in self.dispatch_class (in this example, this attribute is bound to DispatchBySignature). Here is how that method works.

The dispatch function finds the underlying raw_function object, and computes its key using the actual arguments in the function call (only positional arguments: **kargs is ignored). It calls "get" on function_map with this key (specifying to return None if the key is not found). If the key is found, its associated DispatchableFunction object is returned to __call__ (where its function object is actually called and its result returned as __call__'s result); if there is no matching key, unique_f_to_call is None, so this function raises the TypeError exception.
      @staticmethod
      def dispatch(df, *args, **kargs):
          raw_function = df.raw_function
          arg_types = DispatchBySignature.key(raw_function, args=args)
          unique_f_to_call = DispatchBySignature.function_map.get(arg_types, None)
          if unique_f_to_call is not None:
              return unique_f_to_call
          else:
              raise TypeError("DispatchBySignature.dispatch: No matching functions located for",arg_types,'in',[a for a in DispatchBySignature.function_map])

IMPORTANT: The actual dispatch method in DispatchBySignature includes some code that is executed if no exact match is found for the argument types. It uses inheritance (covered in Week 8) to attempt to find a unique function to call whose signature is "good-enough" using calls to isinstance. I will try to return to look at this code briefly after we have finished our lectures on inheritance.

Finally, try running this code on the test-bed that I supplied. Experiment by writing/registering different named functions with different signatures and examine how function calls are processed. It might be useful to use the debugger to understand how all these classes/methods cooperate to select the correct function to execute.

------------------------------------------------------------------------------

Function Calls and the Stack: Looking at Stack Frame attributes

In the first lecture on recursion, we informally introduced the use of pictures of call frames to help us hand simulate recursive function calls. Every time a function was called, we added a call frame for it to the picture. Each call frame stored/showed information about the function call, including the current bindings of its parameters (and local variables, if there are any) and an indication of what value the function would return. In fact, Python automatically uses a "stack" data structure to store/manipulate call frames that we can inspect.
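As a small illustration that this stack really exists at run time, here is a minimal sketch using inspect.stack() from the standard inspect module (this function is described at the end of this section). The function names depth, inner, and outer are hypothetical, chosen just for this example.

```python
import inspect

def depth() -> int:
    # Number of call frames beneath this call of depth: its caller and everything below it
    return len(inspect.stack()) - 1

def inner() -> int:
    return depth()

def outer() -> tuple:
    here  = depth()    # measured while outer's call frame is the caller
    there = inner()    # measured one call deeper, while inner's call frame is the caller
    return here, there

here, there = outer()
print(there - here)    # 1
```

The absolute depths depend on how the module is run, but the difference is always 1: calling inner from outer pushes exactly one more call frame on the stack.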
Below, we will examine function calls and the stack of call frames more formally, using the following rules:

  (a) All call frames in a program are stored in a stack, which grows upward (unlike our stack in the recursion lecture, which grew downward).

  (b) Python is always executing code in the call frame on the top of the stack.

  (c) Before Python starts executing code in a main module, it creates a stack of one call frame for that module, treating it like a called function with local names (for the global variables defined in the module) and their bindings, but no parameters. Python starts executing code in this unique call frame, which at the time it executes is on the top (and it's also the bottom!) of the stack.

  (d) Whenever the code executing in the top call frame calls another function, Python remembers where the top call frame is executing and then adds (aka "pushes") a new call frame on top of it in the stack, which becomes the new call frame on the top of the stack in which code is run; it stores the bindings of the parameters/local names in this newly-called function and starts executing its body.

  (e) Whenever the code executing in the top call frame returns, Python removes (aka "pops") that call frame from the stack. Python then continues execution, where it left off, in the call frame beneath it, which is now the call frame on top of the stack.

NOTE: If the call frame executing on the top of the stack is also on the bottom of the stack (the unique call frame from the main module) when it "returns" (there are no more statements in it to execute), it is popped off the stack, which becomes EMPTY. There are no call frames in which to run code, so Python terminates running the program: in the console tab you will see that the program has terminated.

------------------------------

Let's first look at the details of applying these rules for a simple, non-recursive example using two functions.
Suppose that Python executes the four statements in the following module:

  def f(x):
      return 3*g(x+1)

  def g(x):
      return 2*x

  a = 1
  print(f(a))

Here is how Python manipulates the stack of call frames.

First, it creates a stack with one call frame for executing the module. In that call frame it binds the names f and g to their function objects (not shown below, so the call frames aren't too complicated) and then binds the name a to 1: here I represent a's reference to the int object storing 1 by just writing 1 in a's box; by doing so the pictures are less cluttered.

  +-------------------------------+
  | Main Module                   |  Call Frame on Top and Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

To call print, Python must first compute the value of f(a) and then bind it to print's parameter. Python computes the value of the argument (1) and pushes a new call frame for computing f on the top of the stack and starts executing its code.

  +-------------------------------+
  | Call of f(x)                  |  Call Frame on Top of Stack
  |    x        value to return   |
  | +------+  +----------------+  |
  | |  1   |  |    3*g(x+1)    |  |
  | +------+  +----------------+  |
  +-------------------------------+
  | Main Module                   |  Call Frame at Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

To compute f(1), Python next calls g(x+1) to help compute the value f returns. Python computes the value of the argument x+1 (2) and pushes a new call frame for computing g on the top of the stack and starts executing its code.
  +-------------------------------+
  | Call of g(x)                  |  Call Frame on Top of Stack
  |    x        value to return   |
  | +------+  +----------------+  |
  | |  2   |  |      2*x       |  |
  | +------+  +----------------+  |
  +-------------------------------+
  | Call of f(x)                  |
  |    x        value to return   |
  | +------+  +----------------+  |
  | |  1   |  |    3*g(x+1)    |  |
  | +------+  +----------------+  |
  +-------------------------------+
  | Main Module                   |  Call Frame at Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

Now g computes and returns the value 4 without calling any more functions. Python pops g's call frame off the top of the stack, returning the value 4 to f, whose call frame is now back on the top of the stack. I've replaced the call g(x+1) by 4. Execution resumes in f's call frame, where it stopped.

  +-------------------------------+
  | Call of f(x)                  |  Call Frame on Top of Stack
  |    x        value to return   |
  | +------+  +----------------+  |
  | |  1   |  |      3*4       |  |
  | +------+  +----------------+  |
  +-------------------------------+
  | Main Module                   |  Call Frame at Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

Now f resumes its computation, computing 3 times the result returned from calling g(x+1). Python pops f's call frame off the top of the stack, returning the value 12.

Now Python calls print(12). It pushes a new call frame for computing print on the top of the stack and starts executing its code.

  +-------------------------------+
  | Call of print(12)             |  Call Frame on Top of Stack
  | Details for print function    |
  | elided.                       |
  +-------------------------------+
  | Main Module                   |  Call Frame at Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

Now Python prints 12 on the console and then pops print's call frame off the top of the stack, because it has finished executing.
  +-------------------------------+
  | Main Module                   |  Call Frame at Top/Bottom of Stack
  |    a                          |
  | +------+                      |
  | |  1   |                      |
  | +------+                      |
  +-------------------------------+

There are no more statements to execute in this module, so it "returns", popping its call frame off the top of the stack, so the stack becomes empty. Python has no call frame to execute in, so it terminates the program.

In summary, stacks change in two ways: they either grow by pushing on the top (a new function that is being called) or shrink by popping off the top (returning from the currently executing function call).

It is important to note that BEFORE Python calls a function (e.g., print) it evaluates each of its arguments (which, if they call a function, pushes a new call frame on the stack; here the function f is called and returns before print can be called).

------------------------------

Next, let's look at a simple recursive example. Suppose that Python executes the two statements in the following module:

  def factorial(n : int) -> int:
      if n == 0:
          return 1
      else:
          return n*factorial(n-1)

  print(factorial(3))

Instead of showing all the steps in detail, let me just illustrate what the stack of call frames looks like when Python starts executing factorial for the base case: n = 0. This picture results from the main module calling factorial(3), which calls factorial(2), which calls factorial(1), which calls factorial(0).
  +------------------------------+
  | Call of factorial(0)         |  Call Frame on Top of Stack
  |    n       value to return   |
  | +-----+  +----------------+  |
  | |  0  |  |       1        |  |
  | +-----+  +----------------+  |
  +------------------------------+
  | Call of factorial(1)         |
  |    n       value to return   |
  | +-----+  +----------------+  |
  | |  1  |  | 1*factorial(0) |  |
  | +-----+  +----------------+  |
  +------------------------------+
  | Call of factorial(2)         |
  |    n       value to return   |
  | +-----+  +----------------+  |
  | |  2  |  | 2*factorial(1) |  |
  | +-----+  +----------------+  |
  +------------------------------+
  | Call of factorial(3)         |
  |    n       value to return   |
  | +-----+  +----------------+  |
  | |  3  |  | 3*factorial(2) |  |
  | +-----+  +----------------+  |
  +------------------------------+
  | Main Module                  |  Call Frame at Bottom of Stack
  | Details elided, executing    |
  | just the print function      |
  | AFTER computing its          |
  | argument, factorial(3)       |
  +------------------------------+

At this point Python pops 4 call frames off the stack, one after another, ultimately computing 6 as the result of factorial(3).

Now let's look at how we can use the inspect module (and frame attributes) to examine call frames when computing factorials. First, here is what the experiment_functions.py module reports when inspecting the call frames for the factorial function, at the time when it is executing code in the base case.

Here are some of a call frame's attributes (sorted alphabetically) and their values. First, note that the factorial function is defined with the following line numbers (which appear in some attributes below).

  103 def factorial(n : int) -> int:
  104     if n == 0:
  105         return 1
  106     else:
  107         return n*factorial(n-1)
  108
  109 print(factorial(3))

factorial_frame TOP:
  f_back         : <frame at 0x..., file '...', line 107, code factorial>
  f_builtins     : {...all builtin names visible in function's call frame...}
  f_code         : <code object factorial at 0x..., file "C:\...\experiment_functions.py", line 103>
  f_code.co_name : factorial
  f_globals      : {...
all globals visible in function's call frame...} f_lasti : 18 f_lineno : 105 f_locals : {'n': 0} f_trace : None factorial_frame TOP-1: f_back : f_builtins : {...all builtin names visible in function's call frame...} f_code : f_code.co_name : factorial f_globals : {... all globals visible in function's call frame...} f_lasti : 36 f_lineno : 107 f_locals : {'n': 1} f_trace : None factorial_frame TOP-2: f_back : f_builtins : {...all builtin names visible in function's call frame...} f_code : f_code.co_name : factorial f_globals : {... all globals visible in function's call frame...} f_lasti : 36 f_lineno : 107 f_locals : {'n': 2} f_trace : None factorial_frame TOP-3: f_back : > f_builtins : {...all builtin names visible in function's call frame...} f_code : f_code.co_name : factorial f_globals : {... all globals visible in function's call frame...} f_lasti : 36 f_lineno : 107 f_locals : {'n': 3} f_trace : None factorial_frame BOTTOM: f_back : None f_builtins : {...all builtin names visible in function's call frame...} f_code : at 0x000001A648AB70E0, file "C:\...\experiment_functions.py", line 1> f_code.co_name : f_globals : {... all globals visible in function's call frame...} f_lasti : 232 f_lineno : 109 f_locals : {...all local binding available in the module...} f_trace : None Here is the commentary on these attributes (in the same order). Note these are all attributes, not dunder attributes; all frames have the same attributes. f_back : The call frame in the stack underneath this one: the function that called this one. This attribute is None for only the call frame at the bottom of the stack, representing the script in the main module where execution starts. f_builtins : {...all builtin names visible in function's call frame...} f_code : We'll examine code objects below and use them in an example. 
f_code.co_name : Here is an example of using the attribute f_code with its
                 co_name attribute: it is either the name of the function for
                 this call frame or <module>, for the call frame at the bottom
                 of the stack.
f_globals      : {... all globals visible in function's call frame...}
f_lasti        : Where Python is executing the byte code for this function
                 (the index of the last instruction attempted); during week 10
                 I will also lecture on how the Python Virtual Machine
                 executes the code in code objects: it will be a 1-day version
                 of ICS-51 so the coverage is very wide but very shallow.
                 Note that the text shown for the f_code attribute (above)
                 gives the path to the module defining the function, and the
                 line in the module on which the function is defined.
f_lineno       : The line in the function/module that Python is executing.
                 All calls of factorial except the one at the top of the stack
                 are at line 107: the recursive call.
f_locals       : {...all local bindings available in the module...}
                 See the values for n in the top 4 call frames in the stack.
f_trace        : Don't worry about this attribute.

There are many functions that return call frames. Here are the most useful
ones.

(1) inspect.currentframe(): Returns the call frame on the top of the stack:
    the one for the function that is executing this call of currentframe.

    Note that given this call frame we can use its f_back attribute to get
    the call frame for the function calling it; doing this repeatedly gets us
    all the way back to the bottom call frame, executing the main module.

(2) inspect.stack(): Returns a list of FrameInfo (named tuple), each looking
    like:
      FrameInfo(frame, filename, lineno, function, code_context, index)

    Mostly we are interested in the frame attribute of this named tuple.
    Index 0 in the list is the FrameInfo on the top of the stack; index
    len(...)-1 is the FrameInfo at the bottom of the stack. So,
    inspect.stack()[0].frame is equivalent to inspect.currentframe().

By using inspect.stack(), we can examine all the stack frames by iterating
through a list, instead of repeatedly looking at the f_back attributes.
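Here is a minimal sketch comparing the two ways of crawling the stack described above: following f_back links from inspect.currentframe() and iterating over inspect.stack(). The helper names frame_names_via_f_back and frame_names_via_stack, and the extra report parameter of factorial, are hypothetical (they are not in experiment_functions.py); the sketch also assumes CPython, where inspect.currentframe() does not return None.

```python
import inspect

def frame_names_via_f_back():
    """Names of the functions on the stack, starting with this helper's caller."""
    names = []
    frame = inspect.currentframe().f_back   # call frame of whoever called this helper
    while frame is not None:                # f_back is None only below the bottom frame
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def frame_names_via_stack():
    """The same names, computed by iterating over the list inspect.stack() returns."""
    infos = inspect.stack()                 # index 0 is this helper's own call frame
    return [info.frame.f_code.co_name for info in infos[1:]]

def factorial(n, report):
    if n == 0:
        # Base case: record what the stack looks like, computed both ways
        report.append(frame_names_via_f_back())
        report.append(frame_names_via_stack())
        return 1
    else:
        return n * factorial(n - 1, report)

report = []
factorial(3, report)
by_f_back, by_stack = report
print(by_f_back[:4])                        # the four factorial call frames
print(by_f_back == by_stack)                # both crawls see the same frames
```

Both helpers skip their own call frame, so the first four names are the four calls of factorial; whatever follows depends on how the module itself was started.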
------------------------------ The __code__ attribute for functions (and f_code attribute for frames) refer to objects constructed from the same function. We previously discussed the is_even function and it attributes, which I'll repeat here. def is_even(x : int = 0) -> bool: """x is even if the remainder is 0 when divided by two""" return x%2 == 0 function is_even: __annotations__ : {'x': , 'return': } __closure__ : None __code__ : __defaults__ : (0,) __doc__ : x is even if the remainder is 0 when divided by two __globals__ : {...a large dictionary...} __kwdefaults__ : None __module__ : __main__ __name__ : is_even __qualname__ : is_even Now we will explore the __code__ attribute in more detail. The value of this attribute is an object storing other attributes, shown below in alphabetical order (there are no dunder attributes). function is_even code: co_argcount : 1 co_cellvars : () co_code : b'|\x00d\x01\x16\x00d\x02k\x02S\x00' co_consts : ('x is even if the remainder is 0 when divided by two', 2, 0) co_filename : C:\ZData\UCI Classes from 2019\2020-4 Fall ICS-33, 90, 193\ICS-33\zeclipse-workspace\Introspection\experiment_functions.py co_firstlineno : 76 co_flags : 67 co_freevars : () co_kwonlyargcount : 0 co_lnotab : b'\x00\x02' co_name : is_even co_names : () co_nlocals : 1 co_posonlyargcount : 0 co_stacksize : 2 co_varnames : ('x',) Here is the commentary on these attributes (in the same order). Note these are all attributes, not dunder attributes. 
co_argcount        : number of arguments (not including keyword-only
                     arguments, * or ** args)
co_cellvars        : tuple of names of cell variables (referenced by enclosed
                     function definitions); in opposite_of it is ('f',)
co_code            : the actual byte code for the Python Virtual Machine;
                     we'll discuss it in more detail in week 10
co_consts          : tuple of constants used in the function: here the
                     docstring, 2, and 0
co_filename        : the name of the file in which this function is defined
co_firstlineno     : line number in co_filename where this function is defined
co_flags           : Ignore this (or read about it in the documentation)
co_freevars        : tuple of names of free variables (referenced via a
                     function's closure); in not_f it is ('f',)
co_kwonlyargcount  : number of keyword-only arguments (not including ** arg)
co_lnotab          : Ignore this (or read about it in the documentation)
co_name            : the name of the function: here is_even
co_names           : tuple of names used in the function other than its
                     parameters and local variables (e.g., global names and
                     attribute names); here it is empty
co_nlocals         : number of local variables (including parameters)
co_posonlyargcount : number of positional-only arguments
co_stacksize       : stack size needed in the Python Virtual Machine to
                     execute the function
co_varnames        : tuple of names of parameters and local variables

------------------------------

We will now look at two examples of stack introspection, which ultimately
will lead us to look at code objects.

-----__setattr__ for privacy

First, suppose that we wanted to define a __setattr__ method for a class C so
that it raises an AssertionError exception if an attribute is set by code that
is not in a method defined in the class (__setattr__ should work normally when
attributes are set in methods defined in the class). Here is such a class,
using the attribute x it defines, and some attempts to manipulate x outside of
methods in the class.
import inspect class C: def __init__(self,x): self.x = x def bump(self): self.x += 1 def __setattr__(self,attr,value): code_calling_setattr = inspect.currentframe().f_back.f_code code_from_this_class = self.__class__.__dict__.get(code_calling_setattr.co_name,None) assert code_from_this_class != None\ and code_calling_setattr is code_from_this_class.__code__,\ f'Attempt to set attribute "{attr}" from outside class {self.__class__}' self.__dict__[attr] = value o = C(5) # self.x = in __init__ should work o.bump() # self.x += 1 in bump should work try: o.x = 1 # assignment, not in C's methods, should fail except: print("Assignment o.x = 1 outside of methods in class C failed") # Remove o.x = 1 and try the following def bump(obj): obj.x += 1 try: bump(o) # Same method name as in C, but different code, should fail except: print("Call bump(o) with same name as a method in class C failed") Note that the attribute self.x should be successfully updated in both the __init__ and bump method (the one defined in C), but not anywhere else: here, not in the module nor in a function that happens to have the same name. Note code_calling_setattr = inspect.currentframe().f_back.f_code computes the f_code attribute of the call frame one "back" from the __setattr__ call frame: the call frame in which __setattr__ was called. And, code_from_this_class =\ self.__class__.__dict__.get(code_calling_setattr.co_name,None) first computes the name of the function for the code calling __setattr__ and then "gets" the method with that name in in the __class__ dictionary (which might returns None, if that name is absent). 
Finally the assertion

  assert code_from_this_class != None\
           and code_calling_setattr is code_from_this_class.__code__,\
    f'Attempt to set attribute "{attr}" from outside class {self.__class__}'

says that for the call to __setattr__ to work, it must "get" the appropriately
named method in the class, and that method's code must be the same object (not
just be the same code) as the code executing in the call frame that called
__setattr__.

Why both? Suppose that we defined the bump function outside of C

  def bump(obj):
      obj.x += 1

and called bump(o). Then __setattr__ would find the name bump among C's
methods, but the code object of the calling function would be different from
the code object of the bump method (a different object even if the code in the
body is the same). So, because the bump function trying to execute is
different code from the bump method, the __setattr__ method raises the
AssertionError exception.

-----__getattribute__ for privacy

How about also requiring that all accesses to attributes be in methods of the
classes that defined them? That is harder (but shown below) for a couple of
reasons, one of which is the difference between __getattr__ (which we know)
and __getattribute__ (which we don't). Remember that __getattr__ is called
when Python cannot find the named attribute; how does Python try to find it?
By calling __getattribute__.

We will now define a __getattribute__ that almost does the job, and then fix
it. Its code follows the design of __setattr__, with only the last statement
much different.
  def __getattribute__(self,attr):
      code_calling_setattr = inspect.currentframe().f_back.f_code
      code_from_this_class = self.__class__.__dict__.get(code_calling_setattr.co_name,None)
      assert code_from_this_class != None\
               and code_calling_setattr is code_from_this_class.__code__,\
        f'Attempt to get non-method attribute "{attr}" from outside class {self.__class__}'
      return self.__dict__[attr]

The first problem here is that all method attributes (for example, bump, in
the call o.bump()) must be accessible outside of methods in the class. To
allow this, we can slap the checking code inside an if, so that the stack
isn't checked for attributes that are methods.

  if not inspect.ismethod( self.__dict__[attr] ):

The second problem is much more subtle. Normally, classes inherit (a topic we
will cover soon) the standard __setattr__ and __getattribute__ methods from a
class called "object". So, if a class does not define a __setattr__ method,
then the standard one is used when objects of that class set their attributes
(something we've said before, without discussing inheritance, which is how
Python finds that standard method). It is the __getattribute__ method that
looks for the attribute in the object first, and then looks in the class from
which the object was constructed.

But, notice that in the if statement above we access self.__dict__; since
type(self) is C, the access self.__dict__ calls
C.__getattribute__(self,'__dict__'), which means that __getattribute__
recursively calls itself and will ultimately exceed the maximum stack depth!
To avoid this problem we need to access this attribute using the standard
__getattribute__ method in the object class, so we must write it as

  if not inspect.ismethod( object.__getattribute__(self,attr) ):

And, there are two other places where an attribute of self must be found in
the code above (a third is {self.__class__} inside the assertion error
message). All must be rewritten using the object.__getattribute__ form, as
follows.
  def __getattribute__(self,attr):
      if not inspect.ismethod(object.__getattribute__(self,attr)):
          code_calling_setattr = inspect.currentframe().f_back.f_code
          code_from_this_class = object.__getattribute__(self,'__class__').__dict__.get(code_calling_setattr.co_name,None)
          assert code_from_this_class != None\
                   and code_calling_setattr is code_from_this_class.__code__,\
            f'Attempt to get non-method attribute "{attr}" from outside class {object.__getattribute__(self,"__class__")}'
      return object.__getattribute__(self,attr)

This code will allow only methods in C to access its non-method attributes.

As one final optimization, we might want to use the object.__getattribute__
form inside __setattr__, just to avoid all the extra work that
__getattribute__ does when __setattr__ accesses attributes of self (of type
C). There are two accesses to attributes of self: self.__class__ in the
code_from_this_class assignment and self.__dict__ in the last statement. This
is what that code looks like

  def __setattr__(self,attr,value):
      code_calling_setattr = inspect.currentframe().f_back.f_code
      code_from_this_class = object.__getattribute__(self,'__class__').__dict__.get(code_calling_setattr.co_name,None)
      assert code_from_this_class != None\
               and code_calling_setattr is code_from_this_class.__code__,\
        f'Attempt to set attribute "{attr}" from outside class {object.__getattribute__(self,"__class__")}'
      object.__getattribute__(self,'__dict__')[attr] = value

Again, languages like Java and C++ specify that private attributes can be
rebound only by methods defined in their class. In Python, we can implement
that policy if we want, but we can implement other/different policies too. So
Python gives us the power to define the behavior of attributes in our classes.
What is the cost for having this power? It takes more time to execute the code
when the rules are not built-in. Is it better to be more flexible or more
efficient? It depends: different languages are useful in different contexts.
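To see both methods working together, here is a small self-contained sketch that combines them in one class. It is a variation on the code above, not a replacement for it: the helper called_from_method_of, the accessor get_x, and the function outcome are hypothetical names introduced only for this demonstration, and the checks raise AssertionError explicitly (instead of using assert) so that they still run if Python is started with -O.

```python
import inspect

def called_from_method_of(obj, code):
    """Hypothetical helper: is code the code object of a function in obj's class?"""
    cls = object.__getattribute__(obj, '__class__')    # bypass C.__getattribute__
    candidate = cls.__dict__.get(code.co_name, None)
    return candidate is not None and getattr(candidate, '__code__', None) is code

class C:
    def __init__(self, x):
        self.x = x

    def bump(self):
        self.x += 1

    def get_x(self):                  # hypothetical accessor, added for this demo
        return self.x

    def __setattr__(self, attr, value):
        caller = inspect.currentframe().f_back.f_code
        if not called_from_method_of(self, caller):
            raise AssertionError(f'Attempt to set attribute "{attr}" from outside class C')
        object.__getattribute__(self, '__dict__')[attr] = value

    def __getattribute__(self, attr):
        value = object.__getattribute__(self, attr)
        if not inspect.ismethod(value):               # methods are always accessible
            caller = inspect.currentframe().f_back.f_code
            if not called_from_method_of(self, caller):
                raise AssertionError(f'Attempt to get non-method attribute "{attr}" from outside class C')
        return value

def outcome(action):
    """Hypothetical helper: report whether calling action() trips a privacy check."""
    try:
        action()
        return 'allowed'
    except AssertionError:
        return 'blocked'

def bump(obj):                        # same name as C.bump, but a different code object
    obj.x += 1

o = C(5)                              # self.x = x in __init__ is allowed
o.bump()                              # self.x += 1 in C.bump is allowed
read_outside  = outcome(lambda: o.x)
write_outside = outcome(lambda: setattr(o, 'x', 1))
same_name     = outcome(lambda: bump(o))
print(o.get_x(), read_outside, write_outside, same_name)
```

Each attempt made outside C's methods is blocked, and the value seen through get_x is still the one produced by __init__ and C.bump.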
When we cover inheritance, we will learn how to write this __setattr__ in a
class that other classes can inherit from, to allow them to have this kind of
safety easily. Also, by defining the __getattribute__ (not __getattr__)
method, we can ensure attributes are accessed only by methods in the class
they are defined in.

------------------------------

Finally, let's examine a decorator for functions that we can use to track
recursive calls in the stack (we could use such information to try to
determine whether or not the function is recursive, by seeing whether the
function appears many times in a call stack). Here is such a decorator and
some examples of using it.

import inspect

class track_stack:
    def __init__(self,f):
        self.f = f                      # function being decorated
        self.calls = 0                  # count of calls to f
        self.max_stack = 0              # count of maximum # of calls to f in the stack
        self.args_to_dump_stack = None

    def dump_stack_when_args(self,args_to_dump_stack):
        self.args_to_dump_stack = args_to_dump_stack

    def __call__(self,*args,**kargs):
        if args == self.args_to_dump_stack:
            print(80*'-')
            print('Dumping stack (from top to bottom) showing function name in each call frame',
                  'because *args=',args)
            for n,f in enumerate(inspect.stack(),0):
                print(f'{"Bottom" if f.frame.f_code.co_name == "<module>" else n if n != 0 else "Top":>8}',
                      f.frame.f_code.co_name,
                      f.frame.f_locals if self.f.__code__ is f.frame.f_code else "")
            print('Dumping stack finished')
            print(80*'-')

        self.calls += 1
        in_stack_count = 0
        for f in inspect.stack():
            if self.f.__code__ is f.frame.f_code:
                in_stack_count += 1
        if in_stack_count > self.max_stack:
            self.max_stack = in_stack_count
        return self.f(*args,**kargs)


@track_stack
def factorial(n : int) -> int:
    if n == 0:
        return 1
    else:
        return n*factorial(n-1)

factorial.dump_stack_when_args((0,))  # dump only for base case of factorial

n = 5
print(f'\nfactorial({n}) =',factorial(n))
print('# of calls to this factorial function =',factorial.calls)
print('Maximum # of calls to this factorial function ever in the stack =',factorial.max_stack)


@track_stack
def fibonacci(n : int) -> int:
    if n == 0 or n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

# Don't ever dump the stack

n = 10
print(f'\nfibonacci({n}) =',fibonacci(n))
print('# of calls to this fibonacci function =',fibonacci.calls)
print('Maximum # of calls to this fibonacci function ever in the stack =',fibonacci.max_stack)

Note that inside __call__ Python first conditionally prints information about
the stack: the name of the function/method in each call frame on the stack and
its local variables (if it is the decorated function). This code shouldn't
print frequently; in the factorial example it prints only when the recursive
call reaches the base case, which happens exactly once for each top-level call
to factorial.

Then, __call__ increments self.calls and iterates over the stack, counting how
many recursive calls to the decorated function are in it: it counts the number
of call frames whose f_code attribute is the same object as self.f's __code__
attribute. It updates self.max_stack if this number is a new high, and returns
the result of calling the original function object.

Note that this code is executed for every recursive call: for factorial it is
executed many times, and for fibonacci very many times. So, this code can
consume a lot of time, depending on how often the function is called
recursively and how deep the traversed stack is for each call.
For the code above it prints -------------------------------------------------------------------------------- Dumping stack(top->bottom): shows function name in each call frame: *args= (0,) Top __call__ 1 factorial {'n': 1} 2 __call__ 3 factorial {'n': 2} 4 __call__ 5 factorial {'n': 3} 6 __call__ 7 factorial {'n': 4} 8 __call__ 9 factorial {'n': 5} 10 __call__ Bottom Dumping stack finished -------------------------------------------------------------------------------- factorial(5) = 120 # of calls to this factorial function = 6 Maximum # of calls to this factorial function ever in the stack = 5 fibonacci(10) = 89 # of calls to this fibonacci function = 177 Maximum # of calls to this fibonacci function ever in the stack = 9 Note that if you decorate a non-recursive function and then call it, its __call__ will find no (0) occurrences of the called function on the stack; then it will call the non-recursive function and return its result. So when __call__ counts functions on the stack, it is not counting the function it is about to call, so the Maximum # will be 0. factorial calls itself recursively 6 times (for arguments 5 down to 0); note that when n = 0 __call__ prints and then examines the stack, so it is missing one call to factorial that __call__ calls at the top, which is why Maximum # of call to this factorial... is only 5, even though it is called 6 times. That is, the call of factorial when n = 0 is never counted. And, fibonacci calls itself recursively 177 times (for arguments 10 down to 0: many calls have the same argument); note that __call__ examines the stack, so it is missing one call to fibonacci that __call__ calls at the top, which is why Maximum # of call to this fibonacci... is only 9; it should be 10. That is, the many calls to fibonacci when n = 0 or n = 1 are never counted. 
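If we want the maximum to include the call that __call__ is about to make, one possible variation is to add 1 to the count found on the stack. The sketch below is only an illustration of that idea under this assumption: the names depth_tracker and frames_running are hypothetical, the stack-dumping feature is omitted, and inspect.stack is called with context=0 so that no source lines are read for each frame.

```python
import inspect

def frames_running(code):
    """Hypothetical helper: count the call frames on the stack executing code."""
    return sum(1 for info in inspect.stack(context=0) if info.frame.f_code is code)

class depth_tracker:
    def __init__(self, f):
        self.f = f              # function being decorated
        self.calls = 0          # count of calls to f
        self.max_stack = 0      # maximum # of calls to f in the stack, including the newest

    def __call__(self, *args, **kargs):
        self.calls += 1
        # +1 counts the call frame that self.f(...) is about to push
        depth = frames_running(self.f.__code__) + 1
        if depth > self.max_stack:
            self.max_stack = depth
        return self.f(*args, **kargs)

@depth_tracker
def factorial(n : int) -> int:
    if n == 0:
        return 1
    else:
        return n*factorial(n-1)

@depth_tracker
def fibonacci(n : int) -> int:
    if n == 0 or n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

fact_result = factorial(5)
fib_result  = fibonacci(10)
print(fact_result, factorial.calls, factorial.max_stack)
print(fib_result, fibonacci.calls, fibonacci.max_stack)
```

With this variation a non-recursive function reports a maximum of 1 instead of 0, factorial(5) reports 6, and fibonacci(10) reports 10; the number of calls is unchanged.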
------------------------------------------------------------------------------

Problems:

1) Update the class C (with __setattr__ and __getattribute__) so that the
   class has an attribute named private, which is bound to a set of attribute
   names in C (these can be both method and non-method attributes). Only
   methods defined inside class C should be able to access/update these
   private attributes (even if they are methods).

2) Update the __setattr__ method in your solution to (1) so that it disallows
   adding new attributes outside methods defined in the class; generalize the
   class to define the bool attribute can_add_new_attributes_outside_class,
   which controls this feature.