Program 3

Programs that Write Programs: pnamedtuple

ICS-33: Intermediate Programming


Introduction

This programming assignment is designed to show how Python functions can define other Python code (in this case, a class): we call exec on a string that represents the description of a Python class, which causes Python to define the class just as if it were written in a file and imported (in which case Python reads the file as a big string and does the same thing). Your code will rely heavily on string formatting operations: I suggest using the str.format method to do the replacements, and now is a good time to learn it if you don't know it; but you are free to use whatever string processing tools you want.
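As a minimal sketch of the exec idea described above: the Greeter class and the namespace dictionary are illustrative assumptions, not part of the assignment or its download.

```python
# Source text for a tiny class, held in an ordinary string.
class_source = '''
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return 'Hello, ' + self.name
'''

# exec runs the class statement; the new class is stored in the namespace dict.
namespace = {}
exec(class_source, namespace)
Greeter = namespace['Greeter']

print(Greeter('Rich').greet())   # Hello, Rich
```

Passing an explicit dictionary to exec is one design choice among several; it keeps the generated name out of the caller's globals until it is deliberately retrieved.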

I suggest that you look at the code below and write/test as much of the Point class as you can, directly in Eclipse (especially the methods __getitem__, __eq__, and _replace). Then you need to write the pnamedtuple function, which, given the appropriate arguments, constructs a string containing the same information as your Point class, using the .format method to replace generic parts with the actual strings needed in the pnamedtuple being defined.

You should download the program3 project folder and use it to create an Eclipse project. Your function (put it in the pcollections.py module) can be tested in a driver I supply. Also examine the miniexample.py module, which performs a similar but simpler task of writing a function; all the elements needed to write the class appear there in a simplified form.

You should work on this assignment in pairs, with someone in your lab section. Try to find someone who lives near you, with similar programming skills and work habits/schedule: e.g., talk about whether you prefer to work mornings, nights, or weekends, and what kind of commitment you will make to submit programs early. If you believe that it is impossible for you to work with someone, because of some special reason(s), you should send me email stating them and asking for special permission to work alone (which I do grant, but not frequently).

Only one student should submit all parts of the assignment, but both students' names should appear in the comments at the top of each submitted .py file. It should look something like


# Romeo Montague, Lab 1
# Juliet Capulet, Lab 1
# We certify that we worked cooperatively on this programming
#   assignment, according to the rules for pair programming

Print this document and carefully read it, marking any parts that contain important detailed information that you find (for review before you turn in the files). The code you write should be as compact and elegant as possible, using appropriate Python idioms.


Problem #1: pnamedtuple

Problem Summary:

Write a function named pnamedtuple that is passed information about a named tuple: it returns a reference to a class object from which we can construct instances of the specified named tuple. We might use this class as follows:
from pcollections import pnamedtuple
Point = pnamedtuple('Point', 'x y')
p = Point(0,0)
...perform operations on p using methods defined in Point
Please note that although many of the examples in this description use the Point class, your pnamedtuple function must work for all legal calls to it.

I created six templates (one big, two medium, three small), which are strings that had parts to fill in using the format method; all but the small strings are triple-quoted, multi-line strings, that look like large chunks of Python code (see miniexample.py in the download to help decode this paragraph).

Note that calling the following format method on the string

'{name} from {country} tells you {rest}'.format(name='Rich',country='USA',rest='blah..blah..blah')
returns the string result
'Rich from USA tells you blah..blah..blah'
In most cases, the arguments I passed to the format calls came from list comprehensions turned into strings by calling the .join method (the opposite of the .split method). See the miniexample for an example of everything working together to define a function by filling in a template with .format. Finally, my solution is about 100 lines (including blank lines and comments, and solving the extra credit part), and that is divided between Python code (70%) and string templates that specify Python code (30%).
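The combination of a template, list comprehensions, .join, and .format can be sketched as follows. This is not the code in miniexample.py; the template, the make_record_function helper, and the generated make_ab function are all hypothetical names chosen for illustration.

```python
# Template for a function definition. Doubled braces {{ and }} survive
# format as literal { and }, so the generated code contains a dict display.
template = '''\
def {name}({params}):
    return {{{pairs}}}
'''

def make_record_function(name, params):
    # Hypothetical helper: fill in the template, exec it, return source and function.
    source = template.format(
        name=name,
        params=', '.join(params),
        pairs=', '.join([repr(p) + ': ' + p for p in params]))
    namespace = {}
    exec(source, namespace)
    return source, namespace[name]

source, make_ab = make_record_function('make_ab', ['a', 'b'])
print(source)
print(make_ab(1, 2))   # {'a': 1, 'b': 2}
```

Printing the filled-in source before calling exec makes it easy to see exactly what Python is being asked to define.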

Details

  • Define a function named pnamedtuple in a module named pcollections.py (that is the only name defined in the module). Its header is def pnamedtuple(type_name, field_names, mutable=False): and an example call might be Point = pnamedtuple('Point', ['x','y'], mutable=False). We could then write code like origin = Point(0,0).

    Note that a legal name for types and fields must start with a letter which can be followed by 0 or more letters, digits, or underscore characters (hint: I used a simple regular expression to test this); also it must not be a keyword: the name kwlist is bound to a list of keywords that is available for import from the keyword module. The parameters must have the following structure.

    • type_name must be a legal name.
    • field_names must be a list of legal names, or a string in which spaces separate legal names (or commas and optional spaces separate legal names). So, we can specify field_names like ['x','y'] or 'x y', or 'x, y'. If a name is duplicated, just ignore it (hint: I used the unique generator to filter out duplicates).

    If any of the names are not legal, raise a SyntaxError with an appropriate message.

    The resulting class that is written should have the following functionality. Note that the main job of pnamedtuple is to compute a large string that describes the class; we could write out the string to a file and then import the file, defining the class, but that is clumsy.

  • Define the class name to be type_name.

  • Define an __init__ method that has all the field names as parameters and initializes every instance variable (using these same names) with the value bound to its parameter. In addition, define the instance variables _fields and _mutable, which are set to a list of all the field names and the bool parameter respectively. For a Point created with mutable=True, the __init__ method would be
        def __init__(self, x, y):
            self.x = x
            self.y = y
            self._fields = ['x','y']
            self._mutable = True

  • Define the __repr__ method that returns a string, which when passed to eval returns a newly constructed object that has all the same instance variables and values(==) as the object __repr__ was called on. For Point, if we defined origin = Point(0,0) then calling repr(origin) would return 'Point(x=0,y=0)'. Here is one way to write __repr__ for Point using the format method,
        def __repr__(self):
            return 'Point(x={x},y={y})'.format(x=self.x,y=self.y)
    although there are other ways for it to produce the same resulting string.

  • Define simple accessor methods for each of the field names. Each method name should start with _get_ followed by the name of a field. For Point, there would be two accessor methods.
        def _get_x(self):
            return self.x
      
        def _get_y(self):
            return self.y
    Note that with these methods, if we had a list of Point objects named lp, we could call lp.sort(key=Point._get_x) to sort the list by their x coordinates. Python's builtin namedtuple does not have this ability (which it trades for speed of operation: pnamedtuple is a bit slower).

  • Define the __getitem__ method to overload the [] (indexing operator) for this class: an index of 0 returns the value of the first field name in the field_names list; an index of 1 returns the value of the second field name in the field_names list, etc. Raise an IndexError with an appropriate message if the index is out of bounds.

    Note that this method can be used by Python to iterate through any kind of pnamedtuple one instance variable after another. It is also useful for writing the __eq__ method: see below.

    Hint: use the index parameter, the self._fields instance variable, the _get methods, and the eval function to write a short solution to this problem; in the case of origin = Point(0,0), calling origin[1] should construct the string 'self._get_y()' and return eval('self._get_y()').

  • Overload the == operator so that it returns True when the two named tuples have all their instance variables bound to equal values. Hint: use __getitem__ and indexing to check for equality.

  • Define a _replace method, which takes **kargs as a parameter (keyword args). This allows the name kargs to be used in the method as a dict of parameter names and their matching argument values. The semantics of the _replace method depends on self._mutable.
    • If True, the instance variables of the object it is called on are changed and the method returns None. So, if origin = Point(0,0) and we call origin._replace(y=5), then print(origin) would display as Point(x=0,y=5) because origin is mutated.

    • If False, it returns a new object of the same class, whose instance variables' values are the same, except for those specified in kargs. So, if origin = Point(0,0) and we call new_origin = origin._replace(y=5), then print(origin,new_origin) would display as Point(x=0,y=0) Point(x=0,y=5) because origin is not mutated.

    Define this method to look like

        def _replace(self,**kargs):
            if self._mutable:
                ...
            else:
                ...
    In both ... we iterate (through kargs.items() or self._fields) and refer to self.__dict__ to retrieve the current values of the instance variables: this is a bit tricky. Use our textbook or web resources to learn more about **kargs; feel free to post specific questions on the forum (but not questions relating to their actual use in _replace, and not "Could someone please explain **kargs to me"). The miniexample.py module has a little **kargs demo at the bottom.

  • Extra credit: Define the __setattr__ method so that after __init__ finishes, if the mutable parameter is False, the named tuple will not allow any instance variables to be changed: it will raise an AttributeError with an appropriate message.
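The __getitem__, __eq__, and _replace bullets above can be sketched for a hand-written Point class, before any templates are involved. This is one possible reading of the requirements, not the reference solution: rejecting negative indexes, requiring the same class in __eq__, and raising TypeError for an unknown field name in _replace are assumptions that the description does not settle. The extra-credit __setattr__ is deliberately left out.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._fields = ['x', 'y']
        self._mutable = False

    def __repr__(self):
        return 'Point(x={x},y={y})'.format(x=self.x, y=self.y)

    def _get_x(self):
        return self.x

    def _get_y(self):
        return self.y

    def __getitem__(self, index):
        # Assumption: only ints in range(len(_fields)) are legal indexes.
        if type(index) is not int or not 0 <= index < len(self._fields):
            raise IndexError('Point.__getitem__: index (' + repr(index) +
                             ') is out of bounds')
        # Build a call such as 'self._get_y()' and evaluate it, as the hint suggests.
        return eval('self._get_{f}()'.format(f=self._fields[index]))

    def __eq__(self, right):
        # Equal only if same class and every indexed value matches.
        return (type(right) is type(self) and
                all(self[i] == right[i] for i in range(len(self._fields))))

    def _replace(self, **kargs):
        # Assumption: an unknown field name is reported with TypeError.
        for name in kargs:
            if name not in self._fields:
                raise TypeError('Point._replace: unknown field ' + repr(name))
        if self._mutable:
            # Mutate in place; the method then returns None implicitly.
            for name, value in kargs.items():
                self.__dict__[name] = value
        else:
            # Copy current values from __dict__, overriding those named in kargs.
            return Point(**{f: kargs.get(f, self.__dict__[f])
                            for f in self._fields})

origin = Point(0, 0)
moved = origin._replace(y=5)
print(origin, moved)                           # Point(x=0,y=0) Point(x=0,y=5)
print(origin[1], moved[1])                     # 0 5
print(origin == Point(0, 0), origin == moved)  # True False
```

Once a class like this behaves correctly, each method body becomes a candidate for a template whose generic parts are filled in from type_name and field_names.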

Of course, our pnamedtuple function should work for Point and any other legal call. The actual namedtuple class in Python is implemented differently, but this programming assignment requires you to use the implementation above.
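What counts as a legal call depends on the name rules in the Details section, which can be sketched with two helpers. The names is_legal_name and parse_field_names are hypothetical, the error messages are placeholders, and dict.fromkeys is swapped in here for the unique generator mentioned in the hint because it is a standard way to drop duplicates while keeping order.

```python
import re
from keyword import kwlist

def is_legal_name(name):
    # A letter followed by zero or more letters, digits, or underscores,
    # and not a Python keyword.
    return (type(name) is str and
            re.fullmatch(r'[a-zA-Z][a-zA-Z0-9_]*', name) is not None and
            name not in kwlist)

def parse_field_names(field_names):
    # Accept a list of names, or a string whose names are separated by
    # spaces or by commas with optional surrounding spaces.
    if type(field_names) is str:
        names = re.split(r'\s*,\s*|\s+', field_names.strip())
    elif type(field_names) is list:
        names = field_names
    else:
        raise SyntaxError('field_names must be a list or a string')
    for name in names:
        if not is_legal_name(name):
            raise SyntaxError('illegal field name: ' + repr(name))
    # Drop duplicates, keeping the first occurrence of each name.
    return list(dict.fromkeys(names))

print(parse_field_names('x, y'))        # ['x', 'y']
print(parse_field_names('x y x'))       # ['x', 'y']
print(parse_field_names(['x', 'y']))    # ['x', 'y']
print(is_legal_name('for'), is_legal_name('_x'), is_legal_name('x1_'))  # False False True
```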

Testing

I provided a short driver script whose infinite loop at the end allows you to enter one-line Python commands and then see their results (via exec). You can also enter a command to run batch_test or batch_self_check, providing each with a file that is appropriate for it.

Files for batch_test (see bt.txt for an example) just contain commands that will be executed; many are calls to the print function, which show the result of the print.

Files for batch_self_check (see bsc.txt for an example) contain three parts.

  1. Either a c (for command) or e (for expression) followed by a ->
  2. A command or expression (as specified) followed by a ->
  3. For commands nothing; for expressions a string specifying the expected value for the expression
The batch_self_check will execute the commands and evaluate the expressions: for expressions it checks whether the expression produces the specified result, printing an error message for each that doesn't, and tabulating the number of passed and failed tests for printing at the end of the test.
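To make the three-part line format concrete, here is a rough sketch of how such a checker might process lines. It is not the supplied batch_self_check: the self_check function is hypothetical, the separator is taken literally as -> from the description above (check bsc.txt for the exact form), and comparing str of the result against the expected text is an assumption.

```python
def self_check(lines, namespace):
    # Hypothetical checker: returns (passed, failed) counts for the 'e' lines.
    passed = failed = 0
    for line in lines:
        # Split into kind, code, and expected text (empty for commands).
        kind, code, expected = line.rstrip('\n').split('->', 2)
        if kind == 'c':
            exec(code, namespace)
        elif kind == 'e':
            result = str(eval(code, namespace))
            if result == expected:
                passed += 1
            else:
                failed += 1
                print('Error:', code, 'produced', result, 'but expected', expected)
    return passed, failed

lines = ['c->x = 3->', 'e->x + 1->4', 'e->x * 2->7']
print(self_check(lines, {}))   # reports one error, then prints (1, 1)
```

A sketch this simple would misread any command or expression that itself contains the separator, which is one reason to rely on the supplied driver for real testing.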

You may find it useful to build real test files for the batch_test or batch_self_check to perform all sorts of checks on your code automatically; I would even be happy if students who did an excellent job constructing these files can create a software marketplace and sell their test files to other students in the class. But, you cannot sell your code! You can sell only these test files. Maybe one of you can even sell your test code to me and the TAs for grading this assignment.

Remember that your pnamedtuple function can print on the console, for debugging purposes, the string it is about to exec so you can look for errors there (just eyeball whether the code is correct). The show_listing function (defined in the pnamedtuple function) displays a string on the console, numbering its lines (useful when exec finds an error: it reports a line number that show_listing shows).
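The idea of numbering the lines of generated source can be sketched as below. The numbered_listing function is a hypothetical stand-in, not the show_listing supplied in the download, whose output format may differ.

```python
def numbered_listing(source):
    # Prefix each line with a right-aligned line number starting at 1.
    lines = source.split('\n')
    width = len(str(len(lines)))
    return '\n'.join('{n:>{w}} {line}'.format(n=n, w=width, line=line)
                     for n, line in enumerate(lines, 1))

# Deliberately broken source: the return expression is incomplete.
bad_source = 'def f(x):\n    return x +\n'
try:
    exec(bad_source, {})
except SyntaxError as err:
    # err.lineno is the line number exec reports; match it against the listing.
    print('SyntaxError reported on line', err.lineno)
    print(numbered_listing(bad_source))
```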

I have included two programs that David Kay published in ICS-31 that use Python's namedtuple (with those names changed to pnamedtuple).