Python Classes

When we define a class in Python, we are binding the class name to an object
representing the class. Remember, all names in Python refer to objects; so
defining

  class C: ...

creates the name C and binds it to an object representing the class. What
names are defined as attributes in a class object's namespace? I'm not talking
about the instance objects we will create from the class C, but the names in
the namespace of the class object itself. Mostly, a class defines names that
are bound to its methods (__init__, etc).

When we want to create an object that is an instance of a class, we refer to
the class object and follow the reference with (). Python does three things:

(1) It calls a special function that creates an object (the instance of the
    class we are constructing). Note that this object automatically has a
    dictionary associated with it by the name of __dict__.

(2) It calls the __init__ method for the class, passing the object created in
    (1) to the first/self parameter of __init__, and following this with all
    the other values used in the call to construct this instance.

(3) A reference to the object that was created in (1) and initialized in (2)
    is returned as the result of constructing the object.

So if we call C(1,2) Python calls __init__(new object,1,2)

Note that we can define other names that share the same class object.

  class C:
      def __init__(self):
          print('C object created')

  D = C    # C and D share the same class object
  x = C()  # Use C to create an instance of the class object
  y = D()  # Use D to create an instance of the same class object
  print(type(x), type(y), type(C), type(type(x)))

Running this script produces

  C object created
  C object created
  <class '__main__.C'> <class '__main__.C'> <class 'type'> <class 'type'>

We can use the type function to determine the type of any object. The objects
x and y refer to are instances of the class C, defined in the main script.
C (and every class we define) is an instance of a special class called 'type'.
So x is an instance of C and C is an instance of 'type'.
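The three construction steps listed above can be sketched in ordinary Python.
This is only an illustration, not how the interpreter is actually written: the
helper name construct is made up for this example, and the class C here takes
two hypothetical parameters a and b so there is something to initialize.

```python
class C:
    def __init__(self, a, b):
        self.a = a
        self.b = b

def construct(cls, *args):
    # Hypothetical helper (not part of Python): mimics what cls(*args) does.
    obj = object.__new__(cls)   # (1) create a bare instance; its __dict__ is empty
    cls.__init__(obj, *args)    # (2) initialize it, passing it as the self parameter
    return obj                  # (3) return a reference to the new object

x = construct(C, 1, 2)          # built by hand, step by step
y = C(1, 2)                     # built the normal way
print(x.__dict__ == y.__dict__, type(x) is type(y))
```

Both objects end up with the same namespace and the same type, which is the
point of the sketch: calling C(1,2) is creation followed by a call to __init__.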
------------------------------------------------------------------------------

Manipulating an object's namespace (and __init__):

All objects have namespaces, which are dictionaries in which names are
associated with their values. Typically we write something like

  self.name = value

But we can illustrate a cruder way to add names to the namespace of an object.
This is not the recommended way to do this, but it is a starting point for
understanding objects. Given class C defined above, we can write

  x = C()
  print(x.__dict__)
  x.x = 1
  x.y = 'two'
  x.__dict__['z'] = 3
  print(x.__dict__)

Running this script produces

  C object created
  {}
  {'z': 3, 'x': 1, 'y': 'two'}

(The order in which the keys print is not significant; Python 3.7 and later
print a dictionary's keys in the order they were inserted.)

So we have used two different forms for adding three names to the
namespace/dictionary for the object x refers to.

Now generally we don't initialize the namespace of an object this way; instead
we use the __init__ method, and the self parameter it receives, to do the
initialization. But really, the same thing is happening in __init__ as was
shown above.

  class C:
      def __init__(self,init_x,init_y):
          print('C object created')
          self.x = init_x
          self.y = init_y
          self.z = 3

  x = C(1,'two')
  print(x.__dict__)

Running this script produces the same results.

  C object created
  {'z': 3, 'y': 'two', 'x': 1}

The purpose of the __init__ method is to create all the names needed by the
object's methods and initialize them. Since every object constructed is likely
to need the same names, the __init__ method, which is automatically called by
Python, is just the place it provides to do this.

So, for every assignment statement self.name = val Python puts an entry into
the object's namespace (self.__dict__) with the key 'name' and the value val.
We can even do this ourselves explicitly, as shown above. When self.name
appears in an expression (e.g., a = self.name), Python in effect substitutes
self.__dict__['name'] to retrieve the value of that name.

Note that some names always receive the same initialization, so we don't need
a parameter to initialize them.
But some names need to be initialized to different values at different times,
so typically we add a parameter to __init__ to allow us to specify how that
name should be initialized.

Also, sometimes __init__ will verify that a parameter has received a legal and
reasonable value, and raise an exception to indicate that the object being
constructed cannot be initialized properly. Sometimes it raises an exception
explicitly, using an if statement that tests for an illegal value. Sometimes
it uses assert for this purpose. Remember the form of assert is:

  assert boolean-test, string

which is equivalent to the slightly more verbose

  if not (boolean-test):
      raise AssertionError(string)

Once an object is constructed and initialized, typically we use it by calling
its methods (or passing it to another function that calls its methods). The
methods are all passed the object in the self parameter, so self.name in
methods is exactly as described above.

Before finishing the discussion of objects and their dictionaries, recall that
C refers to a class object; it is an instance of the type object. As an object
it also has a __dict__ that stores its namespace. Here is some code that shows
what names are in the C object.

  for (k,v) in C.__dict__.items():
      print(k,'->',v)

And here is what it prints (the address shown for the function varies from
run to run, so it is elided here).

  __dict__ -> <attribute '__dict__' of 'C' objects>
  __module__ -> __main__
  __weakref__ -> <attribute '__weakref__' of 'C' objects>
  __doc__ -> None
  __init__ -> <function C.__init__ at 0x...>

Note that __init__ is the only function we define, and it is there. Because I
am running this as the main module, its __module__ variable is '__main__',
which you should know something about, because you have been writing

  if __name__ == '__main__':

If this module were imported, the __module__ would instead be the name of the
module, which comes from the name of the file the module was written in.
Only the module corresponding to the script that started execution has its
__name__ (and therefore the __module__ of the classes it defines) bound to
'__main__'.

------------------------------------------------------------------------------

Different kinds of variable names: definition and use

Let's discuss four different kinds of names in relation to classes (we will
call them all variables):

(1) local variables: defined and used inside functions/methods;
    parameter variables are a subset of these

(2) instance variables of objects: typically defined inside __init__ and used
    inside class methods (of course we saw other ways to define them above)

(3) class variables: typically defined in classes (at the same level as
    methods, but not inside methods) and typically used in methods

(4) global variables: typically defined in modules (outside of classes and
    functions) and used inside functions and methods

You should know how to use all these kinds of variables (their semantics). Use
local variables and instance variables as needed (most functions/methods have
the former, and most classes define the latter in __init__ and use them in
methods). Class variables are sometimes useful for solving certain kinds of
problems, where information common to all the instances is stored once, in
their class object. Global variables are often frowned on: they too have their
elegant uses, but infrequently.
The following script uses each kind of variable.

  global_var = 0

  class C:
      class_var = 0

      def __init__(self,init_instance_var):
          self.instance_var = init_instance_var

      def bump(self,name):
          print(name,'bumped')
          #global_var = 100  # comment out this line or the next
          global global_var  # comment out this line or the previous
          global_var += 1
          C.class_var += 1
          self.instance_var += 1

      def report(self,var):
          print('instance referred to by ', var,
                ': global_var =', global_var,
                '/class_var =', C.class_var,
                '/instance_var=', self.instance_var)

  x=C(10)
  x.report('x')
  x.bump('x')
  x.report('x')

  y = x
  print('y = x')
  y.bump('y')
  x.report('x')
  y.report('y')

  print('y = C(20)')
  y=C(20)
  y.bump('y')
  y.report('y')

  C.report(x,'x')        # same as x.report('x') by the Fundamental Equation of OOP
  type(x).report(x,'x')  # ditto: the real Fundamental Equation of OOP

  print(C.class_var, x.class_var)  # discussed at the bottom
  print(x.instance_var)

Running this script produces the following results.

  instance referred to by  x : global_var = 0 /class_var = 0 /instance_var= 10
  x bumped
  instance referred to by  x : global_var = 1 /class_var = 1 /instance_var= 11
  y = x
  y bumped
  instance referred to by  x : global_var = 2 /class_var = 2 /instance_var= 12
  instance referred to by  y : global_var = 2 /class_var = 2 /instance_var= 12
  y = C(20)
  y bumped
  instance referred to by  y : global_var = 3 /class_var = 3 /instance_var= 21
  instance referred to by  x : global_var = 3 /class_var = 3 /instance_var= 12
  instance referred to by  x : global_var = 3 /class_var = 3 /instance_var= 12
  3 3
  12

So, the global variable is changing every time, as is the class variable,
because there is just one of each. But each object has its own instance
variables.

If we instead commented as follows

          global_var = 100    # comment out this line or the next
          #global global_var  # comment out this line or the previous

running the script would have the following result.
Note that by removing global global_var, the statement global_var = 100
defines a variable local to the bump method, despite its name, so its
increment does not affect the true global_var, which stays at zero. Note that
a method in class C can refer to global_var, but if a method wants to change
global_var it must declare it global (then all references and changes are to
the real global variable); otherwise the assignment statement global_var = 100
creates a new name local to the method and binds it to 100.

  instance referred to by  x : global_var = 0 /class_var = 0 /instance_var= 10
  x bumped
  instance referred to by  x : global_var = 0 /class_var = 1 /instance_var= 11
  y = x
  y bumped
  instance referred to by  x : global_var = 0 /class_var = 2 /instance_var= 12
  instance referred to by  y : global_var = 0 /class_var = 2 /instance_var= 12
  y = C(20)
  y bumped
  instance referred to by  y : global_var = 0 /class_var = 3 /instance_var= 21
  instance referred to by  x : global_var = 0 /class_var = 3 /instance_var= 12
  instance referred to by  x : global_var = 0 /class_var = 3 /instance_var= 12
  3 3
  12

Finally, it is clear what C.class_var and x.instance_var do, but what about
x.class_var? As shown above (see the # discussed at the bottom) this prints 3,
just as C.class_var does. When we study inheritance, we will learn more about
how Python searches for attributes: if a name is not in an instance, Python
looks in the class the instance was constructed from, and if it is not in that
class, in one of its superclasses.

------------------------------------------------------------------------------

Defining a method for a class after the class has been declared:

Here is one more interesting thing a dynamic language like Python can do.
Let's go back to a very simple class C, which stores a variable but cannot
change it. The report method does print it.

  class C:
      def __init__(self,init_instance_var):
          self.instance_var = init_instance_var

      def report(self,var):
          print('instance referred to by ', var,
                '/instance_var=', self.instance_var)

Now look at the following code.
It defines x to refer to an object constructed from the class C, which has
only a report method (and then it calls that method to report). Next it
defines the bump function, whose first parameter is named self: its body
increments the instance_var in self's namespace dictionary. We call bump with
x and it updates x's instance_var (as seen by the report).

Then we do something pretty neat. We add the bump function into the namespace
of C's class object with the name cbump. Conceptually this adds the entry
'cbump' to C.__dict__, although a class's __dict__ is a read-only view, so we
must write C.cbump = bump (or setattr(C,'cbump',bump)) rather than
C.__dict__['cbump'] = bump. We could have used just bump, but instead used a
slightly different name. Then, we call x.cbump('x') which by the Fundamental
Equation of OOP is the same as calling C.cbump(x,'x'), where we just attached
the cbump method to the object representing class C.

  x=C(10)
  x.report('x')

  def bump(self,name):
      print(name,'bumped')
      self.instance_var += 1

  bump(x,'x')
  x.report('x')

  C.cbump = bump
  x.cbump('x')
  x.report('x')

So on the fly, we attach a method to the class C and then can call it via any
object that has already been (or will be) constructed from the class C. It
produces the following result.

  instance referred to by  x /instance_var= 10
  x bumped
  instance referred to by  x /instance_var= 11
  x bumped
  instance referred to by  x /instance_var= 12

Note that the del command gets rid of an association in a namespace. I could
call del C.cbump and if I then called x.cbump('x') the result would be

  instance referred to by  x /instance_var= 12
  Traceback (most recent call last):
    File "C:\Users\Pattis\workspace\folder\module.py", line 26, in <module>
      x.cbump('x')
  AttributeError: 'C' object has no attribute 'cbump'

So we can both add and remove names from an object's namespace.

Note that we could have written

  def bump(p,name):
      print(name,'bumped')
      p.instance_var += 1

and nothing changes, so long as every occurrence of self is changed to p. We
could likewise write def report(p,var): ... inside the C class itself.
The parameter name self is just the standard name to be used, but there is
nothing magical about it, and we can substitute whatever name we want.

------------------------------

In fact, there is even a way to add a reference to the bump function to a
single object constructed from class C, not to its class (and therefore not to
all the other instances of that class). Start with the class C as defined
above, defining just __init__ and report. Then

  def bump(self,name):
      print(name,'bumped')
      self.instance_var += 1

  x = C(0)
  y = C(100)

  x.bump = bump
  x.bump(x,'x')
  x.bump(x,'x')
  x.report('x')

  y.bump(y,'y')  # fails because there is no bump in the object y refers to
  y.report('y')

Note that calling x.bump(..) finds the bump function in x's namespace, without
translating the call using the Fundamental Equation of Object-Oriented
Programming. So we need to explicitly pass the x argument, which becomes
bump's self parameter. When run, this script produces:

  x bumped
  x bumped
  instance referred to by  x /instance_var= 2
  Traceback (most recent call last):
    File "C:\Users\pattis\Desktop\python32\test\Test.py", line 21, in <module>
      y.bump(y,'y')  # fails because there is no bump in the object y refers to
  AttributeError: 'C' object has no attribute 'bump'

So we can add methods to already written classes: methods that all the
instances can refer to. And we can add methods to single instances of the
class, such that only those instances can call the method (and their self
parameter must be passed an explicit argument).

------------------------------------------------------------------------------

Redefinition

Note that we can redefine a function or class: we can write the following

  def f(): return 0
  def f(): return 1

and calling f() would return 1. Eclipse gets upset about this, and marks the
second definition as an error, but the script will run. We can also do the
following, which Eclipse likes and which runs the same.

  def f(): return 0
  def g(): return 1
  f = g

So calling f() returns 1.
Again, def just makes a name refer to an object, just like writing a = 1 and
then a = 2 (making a refer to a different object). We can do the same thing
for classes.

------------------------------------------------------------------------------

Accessor/Mutator Methods (or query/command methods)

If a method, when called, returns information about an object's state but does
not change its state, it is called a pure accessor (or query). If a method
changes its object's state and returns None (all functions/methods return
something), it is called a pure mutator (or command). Some methods do both; a
method that just prints information might be described as doing neither
(although there is a state change occurring, in the console).

A design question arises. Suppose that we know that an object o of a class C
has an instance variable named iv: should we directly refer to o.iv? Should we
change it by writing o.iv = ....? The high road says no: classes hide the
actual instance variables from clients of the class, and provide methods to
fully manipulate the objects. The instance variables used to implement a class
might change from day to day, but the methods that define a class should
always work correctly.

Python is a bit at odds with this philosophy. Some languages (Java/C++) have a
mechanism whereby instance variables can be tagged (private) so that they
cannot be accessed from any method not defined in the class itself; these
languages do not allow new methods to be added to classes as Python does (and
as we showed above). Python's philosophy is a bit more open to using instance
variables outside of the class methods. But there is danger there: beginners
often use this convenience and end up taking longer to get code to work
correctly, and making it harder to change code correctly.

In fact, Python instance variable names often begin with one or two
underscores (but don't have two trailing underscores, so they are unlike
__init__, which means something different).
When a programmer uses a single leading underscore in a name in a class (for
data or a method), they are saying that no one outside the class should access
that name. But there is nothing in the Python interpreter that stops that from
happening.

  class C:
      def __init__(self):
          self._mv = 1

      def _f(self):
          return self._mv

  x = C()
  print(x._mv, x._f())

When run, this script produces:

  1 1

If a Python name in a class begins with two underscores, it can be referred to
by that name inside the class, but not outside the class. It can still be
referred to outside the class, but only by a "mangled" name that includes the
name of the class. If a class C defines a name __mv then the name outside the
class is _C__mv.

So, consider the class above rewritten so that its instance variable is named
__mv and its method is named __f (which refers to the name __mv in the
standard way, as self.__mv). If we construct x = C() and then write
print(x.__mv, x.__f()) outside the class, Python would complain by raising an
exception for the first value in the print

  AttributeError: 'C' object has no attribute '__mv'

But we could execute the following code

  x = C()
  print(x._C__mv, x._C__f())

by writing the mangled names ourselves. When run this script produces:

  1 1

In fact, if we wrote print(x.__dict__) it would print:

  {'_C__mv': 1}

When we discuss operator overloading and inheritance we will learn more about
controlling access to information defined inside classes.

------------------------------------------------------------------------------

Defining classes in unconventional places:

We normally define classes in modules, and often a module defines just one
class (although some define related classes). Other modules define lots of
functions. Such modules are called library modules because they don't run
themselves, but we import them to gain access to the names they define.

We can declare a class in a function and call the function to return an object
constructed from the class (and even use the returned object if we know its
defined instance variables or methods).
  def f(x):
      class C:
          def __init__(self,x): self.val = x
          def get(self): return self.val
      return C(x)

  y = f(1)
  print(y, y.val, y.get())

When run, this script prints the following.

  <__main__.f.<locals>.C object at 0x02829170> 1 1

The first value indicates that class C is defined local to function f, which
is in the script that was run.

We can also declare a class in a class and call some method in the outer class
to return an object constructed from the inner class (and even use the object
if we know its defined instance variables or methods).

  class C:
      def __init__(self,x,y):
          self.x = x
          self.y = y

      class Cinner():
          def __init__(self,x): self.val = x
          def get(self): return self.val

      def get(self):
          return (self.x, self.y)

      def x_gen(self):
          return C.Cinner(self.x)

      def y_gen(self):
          return C.Cinner(self.y)

  z = C(1,2)
  a = z.x_gen()
  b = z.y_gen()
  print(z, a)
  print(z.get(), a.get(), b.get())

When run this script prints:

  <__main__.C object at 0x027BB630> <__main__.C.Cinner object at 0x027BBF10>
  (1, 2) 1 2

z.x_gen() returns a reference to an object of class C.Cinner, which was
initialized with the x instance variable that C's __init__ method stored.
When we call a.get(), the get method defined in Cinner returns the value it
was initialized with.

------------------------------------------------------------------------------

Look at my Dice class (in dice.py in the courselib). It has many interesting
simple class features; it doesn't use many of the features discussed here.
Soon we will see how to use methods to overload operators, which allows us to
take advantage of Python's syntax. We will learn more about functions like
__init__ with double underscores front and back.

Question: How can we use a class variable to keep track of how many objects
are created from that class (remember that each object creation calls
__init__)? That is, for a class C

  a = C(...)
  b = C(...)
  print(C.instance_count)

prints 2.
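One possible answer to the question above, offered as a sketch rather than the
only solution: keep the counter as a class variable and increment it through
the class name inside __init__. The parameter value is hypothetical; it stands
in for whatever the ... in the calls above would supply.

```python
class C:
    instance_count = 0          # class variable: stored once, in the class object

    def __init__(self, value):
        self.value = value      # instance variable: one per object
        C.instance_count += 1   # update the class's counter via the class name

a = C('first')
b = C('second')
print(C.instance_count)         # prints 2
```

Writing self.instance_count += 1 instead would read the class variable but
store the result as a new instance variable in self's namespace, so the count
in the class object would stay at 0.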