Lab Assignment C

Informatics 42 • Winter 2012 • David G. Kay • UC Irvine

This assignment is due at the end of lab on Friday, February 17. This is a pair programming assignment; do it with someone you haven't worked with yet this quarter and make sure Joel knows whom you've paired with.

The problem: The assignment, originally written by Alex Thornton, involves building an interpreter for a simple programming language like BASIC; Alex calls his language Facile. This may seem a little infrastructural for us, but actually, it's not: Sometimes the right way to solve a problem is to make up a special-purpose language that makes it easy to express the various aspects of the problem (and then build an interpreter to process that language). Even the restaurants program is an anemic example of this; we have a "restaurant collection manipulation language" that consists of half a dozen single-letter commands. In a couple of weeks we'll talk a little more about the way computer scientists look at languages.

You have some advantages over the ICS 22 students who are doing this problem: You're doing it with pair programming, rather than solo; you've spent a week (last quarter) thinking about machine-level programming, so the concepts in this assignment will be familiar; and you're coding in Python, which is higher-level than Java (meaning that it does more of the work for you).

Alex's introduction: When I was a kid, one of my teachers introduced me to a computer for the first time — a Radio Shack TRS-80 Model I (anybody remember it?). Immediately, I was interested. I played little math games and messed around with other "state of the art" educational tools from the early 1980's; as you might imagine, the state of the art wasn't much then, but it was fun.

Then one day, my teacher asked me if I wanted to learn how to write my own programs. I thought it sounded like a great idea. So I picked up a book about a language called BASIC — some of you may have played with it before — and typed in a short program that asked a user for a number of hits and a number of at-bats and printed out a batting average. (Believe it or not, my mother still has a printout of it, including the comment at the top: "My first program, by Alex Thornton." Yes, I commented my first program.) I ran the program, tried it out, and I was mesmerized; the computer did exactly what I asked it to, exactly the way I asked it to. And my lifelong obsession with what I would later know to be computer science began.

BASIC was a good teaching tool for its day: versatile and easy to learn. For this project, we've designed a simpler version of BASIC called Facile, which supports only eleven kinds of statements. You'll be building a Facile interpreter, to read and execute Facile programs.

The Facile language: We'll discuss the requirements for your interpreter later in the write-up. First, let's talk about the Facile language. A Facile program is a sequence of statements, one per line. Here's an example of a Facile program:

LET A 3
PRINT A
GOSUB 7
PRINT A
PRINT B
GOTO 10
LET A 4
LET B 6
RETURN
PRINT A
END
.

Each line contains exactly one statement (i.e., there may be no blank lines). Facile assigns a line number to each of the lines, where the first line of the program is numbered 1, the second line is numbered 2, and so on. The last line of the program is a period (.) on a line by itself. Execution of a Facile program always begins at line number 1. There is no predefined limit on the number of lines in a Facile program.

Variables: A Facile program has variables, each named by a sequence of characters that does not include whitespace. Each variable is capable of storing an integer value. The value of a variable may be assigned or changed with a LET statement. A LET statement changes the value of one variable. Some examples are:

LET A 3: changes the value of the variable A to 3
LET Z -9: changes the value of the variable Z to -9

You can print the value of a variable to the console by using a PRINT statement. A PRINT statement prints the value of one variable, followed by a newline. So, consider the following short Facile program:

LET A 3
LET Z -9
PRINT A
PRINT Z
.

Its output would be:

3
-9

Execution of a Facile program: A Facile program is executed one line at a time, beginning at line number 1. Ordinarily, execution proceeds forward, so that line 1 will execute first, followed by line 2, followed by line 3, and so on. Execution continues until either an END statement is reached, or until it reaches the "." line that appears at the end of the program.

As in any programming language, it is possible in Facile to write programs that execute out of sequence, though the mechanisms are a bit more primitive than they are in a language like Python. A GOTO statement causes execution to "jump" immediately to the given line number. For example, the statement GOTO 4 jumps execution to line 4. Here's an example Facile program that uses GOTO:

LET A 1
GOTO 4
LET A 2
PRINT A
.

In this program, line 1 is executed first, setting the variable A's value to 1. Then the GOTO statement will immediately jump execution of the program to line 4, skipping the second LET. Line 4 prints the value of A, which is 1. So, the output of the program is 1.

A GOTO statement may jump either forward or backward, meaning that the following program is a legal Facile program. See if you can figure out what its output would be.

LET Z 5
LET C 0
GOTO 8
LET C 4
PRINT C
PRINT Z
END
PRINT C
PRINT Z
GOTO 4
.

GOTO statements are not permitted to jump beyond the boundaries of the program, to lines before line 1 or lines after the "." that completes the program. If such a GOTO statement is encountered while a program is executed, the interpreter terminates with an error message.

Mathematical operations: Facile provides the typical mathematical operations that can be performed on variables: addition, subtraction, multiplication, and division. Each operation is provided as a statement that changes the value of the given variable. Here are examples of their use:

LET A 4
ADD A 3
PRINT A
LET B 5
SUB B 3
PRINT B
LET C 6
MULT C 7
PRINT C
LET D 7
DIV D 2
PRINT D
.

In the example above, the ADD statement adds 3 to the value of A, storing the result in A. So, printing A will display 7 on the console. The output of the program above is:

7
2
42
3

It is important to note that, since all variables in Facile are integers, the DIV statement implements integer division, meaning that its result is the floor (or integral part) of the quotient. So, in the example above, 7 / 2 = 3. [Note: Python's // operator performs integer division.] The second operand may not be zero, meaning that the statement DIV A 0 is illegal. When a Facile program encounters a division by zero, it immediately terminates with an error message.
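A minimal Python sketch of the DIV rule may help here. The helper name facile_div and the exception class FacileRuntimeError are hypothetical, not part of the assignment. Python's // rounds toward negative infinity, which matches the "floor" wording above; the write-up doesn't say how negative operands should behave, so treat that part as an assumption.

```python
class FacileRuntimeError(Exception):
    """Hypothetical exception type for Facile run-time errors."""


def facile_div(value, divisor):
    """Return the floor of value / divisor, rejecting a zero divisor."""
    if divisor == 0:
        raise FacileRuntimeError("division by zero")
    return value // divisor  # floor division


print(facile_div(7, 2))  # prints 3
```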

The IF statement: Facile provides an IF statement, which acts like a conditional GOTO. It compares the value of some variable to some value, and jumps execution of the program to the given line number if the comparison is true. The comparison can use one of the typical relational operators: <, <=, >, >=, = (equal to), or <> (not equal to).

LET A 3
LET B 5
IF A < 4 THEN 5
PRINT A
PRINT B
.

In the program above, the variables A and B are given the values 3 and 5, respectively. An IF statement then compares A to 4. Since A is less than 4, execution jumps to line 5. B's value is printed out. So this program's output is simply a line containing 5.

The IF statement in Facile is substantially less flexible than its Python equivalent. In an IF statement in Facile, the token IF must be followed by exactly five tokens. The first must be the name of a variable. The second must be one of the relational operators (<, <=, >, >=, =, or <>). The third must be an integer constant. The fourth must be the word THEN. The fifth must be a line number. They behave in the way you might expect. For example: IF C <> 0 THEN 4 means "jump to line 4 if C is not equal to 0".
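One way to handle the six relational operators, sketched in Python under the assumption that the tokens after IF have already been split apart: a dictionary maps each operator to a function from the standard operator module. The names COMPARISONS and parse_if are hypothetical.

```python
import operator

# Map each Facile relational operator to a Python comparison function.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "<>": operator.ne,
}


def parse_if(tokens):
    """Split the five tokens after IF into (var, compare, constant, target)."""
    var, op, constant, then, target = tokens
    if then != "THEN":
        raise ValueError("expected THEN")
    return var, COMPARISONS[op], int(constant), int(target)


var, compare, constant, target = parse_if("C <> 0 THEN 4".split())
variables = {"C": 7}
if compare(variables[var], constant):
    print("jump to line", target)  # C is not 0, so the jump is taken
```

Looking the operator up once, at parse time, means the comparison doesn't have to be decoded again each time the statement executes.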

Like GOTO statements, IF statements are not permitted to jump beyond the boundaries of the program. An attempt to do so should cause the Facile program to terminate with an error message.

Subroutines: There are no methods or functions in Facile, but there is a simplified mechanism called a subroutine. A subroutine is a chunk of Facile code that can be "called" by issuing a GOSUB statement. GOSUB is much like GOTO; it causes execution to jump to a particular line. However, GOSUB also causes the Facile program to remember where it jumped from. Subsequently, when a RETURN statement is reached, execution continues at the line following the GOSUB statement that caused the jump. Here's an example:

LET A 1
GOSUB 6
PRINT A
PRINT B
END
LET A 2
LET B 3
RETURN
.

In the program above, line 1 is executed first, setting the value of A to 1. Next, a GOSUB statement is reached. Execution jumps to line 6, but Facile also remembers that when a RETURN statement is reached, execution should jump back to the line following the GOSUB — in this case, line 3. Line 6 is executed next, setting A to 2, then line 7 sets B to 3. Now we reach a RETURN statement, causing execution to jump back to the line number that we're remembering — line 3. Line 3 prints the value of A (which is 2), then line 4 prints the value of B (which is 3). Next, we reach line 5, which is an END statement, so the program ends.

Subroutines can be used very similarly to Python functions, except they do not take parameters or return a value. Consider the following example, which contains a subroutine that prints the values of A, B, and C each time it's called:

LET A 3
LET B 0
LET C 0
GOSUB 12
LET B 4
GOSUB 12
LET C 5
GOSUB 12
LET A 1
GOSUB 12
END
PRINT A
PRINT B
PRINT C
RETURN
.

Subroutines may call other subroutines, meaning that two or more GOSUBs may be reached before a RETURN is reached. The rules for this are very similar to those for functions that call other functions in Python; for each GOSUB that is reached, Facile will remember the line to which it should return. When a RETURN is reached, execution will move to the line remembered from the most recent GOSUB. Here's an example:

LET A 1
GOSUB 7
PRINT A
END
LET A 3
RETURN
PRINT A
LET A 2
GOSUB 5
PRINT A
RETURN
.

In this example, execution begins at line 1 by setting the variable A to 1. Next, we jump to line 7 with a GOSUB, remembering that we should jump back to line 3 when we encounter a RETURN. Line 7 prints A (which is 1), then line 8 changes A's value to 2. Now we've reached line 9, which is another GOSUB statement. At this point, execution will jump to line 5, but we'll also need to remember to jump back to the line following this GOSUB — line 10 — when we reach a RETURN. But we also need to remember the line from the previous GOSUB — line 3.

Line 5 sets A to 3, then we encounter our first RETURN statement. We're remembering two lines — line 3 and line 10. But line 10 is the most recently remembered line, so execution jumps to line 10. Line 10 prints A (which is 3). Now we encounter another RETURN statement on line 11. We're remembering the line 3 from the first GOSUB. So execution jumps to line 3, printing A (which is still 3), then ending the program on line 4.

So, the output of this program is:

1
3
3

Like GOTO statements, GOSUB statements are not permitted to jump beyond the boundaries of the program, to lines before line 1 or lines after the "." that completes the program. If such a GOSUB statement is encountered while a program is executed, the interpreter terminates with an error message.

It is also an error for a RETURN statement to be encountered when there has been no previous GOSUB. The Facile program will immediately terminate and print an error message in this case, as well.
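The "most recent GOSUB" rule is exactly how a stack behaves. Here is a small Python sketch that replays the two GOSUBs and two RETURNs from the nested example above, using a plain list as the stack. The function names and the ReturnWithoutGosub exception are hypothetical.

```python
class ReturnWithoutGosub(Exception):
    """Hypothetical error for a RETURN with no pending GOSUB."""


call_stack = []  # line numbers to return to, most recent last


def gosub(current_line, target_line):
    """Remember the line after the GOSUB, then jump to target_line."""
    call_stack.append(current_line + 1)
    return target_line


def do_return():
    """Jump back to the most recently remembered line."""
    if not call_stack:
        raise ReturnWithoutGosub("RETURN without GOSUB")
    return call_stack.pop()


pc = gosub(2, 7)      # line 2: GOSUB 7, remember line 3
pc = gosub(9, 5)      # line 9: GOSUB 5, remember line 10
first = do_return()   # line 6: RETURN
second = do_return()  # line 11: RETURN
print(first, second)  # prints 10 3
```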

Whitespace: While Facile programs may not have blank lines in them, the amount and placement of blank space between the words on each line is considered irrelevant. So, the following is a legal Facile program:

    LET    Zebra  5
 GOTO   7
LET Chimpanzee   4
 PRINT Chimpanzee
PRINT         Zebra
  END
PRINT Chimpanzee
        PRINT  Zebra
    GOTO      3
.
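In Python, calling split() with no arguments is probably the simplest way to get this behavior: it splits on any run of whitespace and ignores leading and trailing blanks. A short sketch using three lines from the program above:

```python
lines = [
    "    LET    Zebra  5",
    " GOTO   7",
    "PRINT         Zebra",
]

# split() with no arguments splits on any run of whitespace and drops
# leading and trailing blanks, so spacing never affects the tokens.
tokens = [line.split() for line in lines]
print(tokens)
```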

Experimenting with Facile: An interpreter is a program that is capable of executing a program written in some programming language. To give you the ability to experiment, we've implemented a Facile interpreter for Windows already. (For those of you who don't ordinarily use Windows, remember that our machines in the ICS labs run Windows, so you'll have ample opportunity to experiment with Facile. You might even want to "pair program" while you experiment.) This Zip archive contains the interpreter (Facile.exe) and most of the Facile programs that appear in this write-up, along with a few additional ones that demonstrate fatal errors (division by zero, a RETURN statement without a corresponding GOSUB, and a GOTO to a non-existent line). Feel free to write your own, as well. Unzip the archive into one folder, then double-click the program. From there, it's fairly self-explanatory.

A word of warning about this interpreter: we wrote it without making a serious attempt at handling syntax problems, so it assumes that the input file is a legal Facile program. If you attempt to run an input file that is not legal Facile, you may see the message "ERROR IN PROGRAM", but it's also possible that the interpreter will simply crash. Moreover, this interpreter's version of Facile has a few subtle differences from the version described here: In particular, it recognizes only the 26 variables A through Z, and it initializes all of them to 0 automatically.

We're providing this interpreter so you can experiment with the language as you have questions about it. Once you're comfortable with it, it'll be your turn to implement a Facile interpreter. (Bear in mind that this Facile interpreter implements much of the optional work described in the "Additional challenges" section below, but it will behave correctly on the samples given in this write-up.)

Your program: For this project, you'll be building your own Facile interpreter, which is a program that is capable of executing a Facile program, generating the correct output according to the specification in the previous sections. Since you're already familiar with Python, you'll write your Facile interpreter in Python. (Since Python runs on many operating systems, that means, once completed, you'll be able to use your interpreter to run Facile programs on Windows, Mac OS X, Linux, Unix, and several other platforms.)

The Facile interpreter that we've provided runs in a (very simple) graphical user interface. Your program, on the other hand, should read one Facile program from an input file, then execute it, writing any output from the Facile program to the console (i.e., using print statements).

As a starting point, we will discuss the organization of this program in class. Of course you may use the code we develop as a basis for your complete solution.

How an interpreter works: A typical interpreter will execute a program one statement at a time, keeping track of what we might call the program state as it goes along. In the case of a Python interpreter, you might imagine that there would be quite a bit of work to be done. The interpreter would need to keep track of all of the objects — creating new ones and garbage-collecting old ones as necessary — as well as maintain the "call stack," along with various other tasks required by Python programs. Implementing an efficient, complete Python interpreter is a project that would easily take many programmer-years.

A Facile interpreter is a much simpler program, since Facile is a much simpler programming language. Your interpreter will need to execute a Facile program one statement at a time, updating the program state as necessary, until either an END statement or the "." is reached. (The "." can simply be treated as an END statement, if you'd like.) The program state consists of the following information:

what line of code is currently executing (you might call this the program counter, which you may remember from the Deus X machine)
the integer value in each of the variables
the call stack; that is, the line numbers remembered because of any GOSUB statements (Since each RETURN jumps back to the line following the most recent GOSUB, it makes sense to store these line numbers in a stack.)

Each statement has a different effect on the program state. For example, a LET statement will cause the value of one of the variables to change, then cause the program counter to be incremented (since, after a LET statement, execution continues on to the next statement); a GOTO statement will cause the program counter to be changed to the line number specified in the statement; and so on.
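As a rough illustration, the three pieces of program state could be held in one small Python object. The attribute names below (pc, variables, call_stack) are assumptions made for this sketch, not names the assignment requires.

```python
class ProgramState:
    """Hypothetical container for the three pieces of state listed above."""

    def __init__(self):
        self.pc = 1            # program counter: line currently executing
        self.variables = {}    # variable name -> integer value
        self.call_stack = []   # return lines remembered by GOSUB


state = ProgramState()

# Effect of "LET A 3": change a variable, then advance to the next line.
state.variables["A"] = 3
state.pc += 1

# Effect of "GOTO 10": only the program counter changes.
state.pc = 10
print(state.pc, state.variables)  # prints 10 {'A': 3}
```

A dictionary for the variables fits the language as described, since a variable name can be any sequence of non-whitespace characters.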

Reading the program and representing it in memory: Your program will need to begin by reading the Facile program from an input file and representing it in memory. There are a number of ways to solve this problem. One way is to read the program into memory as a collection of strings, with each of the strings containing one line of the input program. Every time a particular line is executed, it would need to be parsed (to see what kind of statement it was), then executed. As you might imagine, this is a terribly inefficient way to implement an interpreter, since the same statement may need to be parsed over and over again. You are not permitted to use this approach for your interpreter.

A better approach — one that we're requiring you to use instead — is to read the input program once, parse it once, and represent it as a list of statement objects. The object-oriented programming concept of inheritance provides a very natural design approach for this problem.

A base class called Statement contains any functionality common to all statements. The only common functionality for all statements is that they can be executed, though what happens when they are executed is different depending on the type of the statement. We can represent this in the Statement class with a method called execute(). A Statement object might also contain a list containing whatever arguments appeared after the keyword.
For each kind of statement (e.g., LET, PRINT, etc.), a subclass of Statement can be designed (e.g., LetStatement, PrintStatement). Each subclass will inherit the field listing the arguments — the information needed to execute the statement. In the case of a LET statement, for example, the necessary information is the name of the variable and the value to assign into it. Also needed in each of these Statement subclasses is an actual implementation of the execute() method.
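The two points above might look something like this in Python. This is only a sketch of the shape of the hierarchy, not the code developed in class; to stay self-contained it keeps the program state in a plain dictionary, which is an assumption of the example.

```python
class Statement:
    """Base class: every statement keeps its arguments and can be executed."""

    def __init__(self, args):
        self.args = args  # tokens that followed the keyword

    def execute(self, state):
        raise NotImplementedError("subclasses define what execution does")


class LetStatement(Statement):
    def execute(self, state):
        name, value = self.args
        state["variables"][name] = int(value)
        state["pc"] += 1


class PrintStatement(Statement):
    def execute(self, state):
        print(state["variables"][self.args[0]])
        state["pc"] += 1


# The state is just a dict here; a ProgramState class would work the same way.
state = {"pc": 1, "variables": {}}
for statement in [LetStatement(["A", "3"]), PrintStatement(["A"])]:
    statement.execute(state)  # the PRINT prints 3
```

Because every subclass answers to the same execute() call, the code that runs the program never needs to know which kind of statement it is holding.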

You'll need code that can parse the input file and create the appropriate sequence of Statement objects, reading the input file and returning a list of Statement objects (actually, Statement subclass objects) containing all the statements in the program. Note that line numbers in Facile start at 1, not 0, so we suggest storing None as the first element in the list, then storing the actual Statement objects with indices beginning at 1. (An alternative, storing the statements beginning at index 0, will require the error-prone practice of adding or subtracting one when converting between line numbers and list indices, which can easily lead to chaos.)
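A parser along these lines could be sketched as follows. The function name parse_program and the STATEMENT_CLASSES table are hypothetical, only three statement kinds are included, and the input comes from an in-memory io.StringIO rather than a real file so that the example runs on its own.

```python
import io


class Statement:
    def __init__(self, args):
        self.args = args


class LetStatement(Statement):
    pass


class PrintStatement(Statement):
    pass


class EndStatement(Statement):
    pass


# Keyword -> class; a full interpreter would list every statement kind.
STATEMENT_CLASSES = {
    "LET": LetStatement,
    "PRINT": PrintStatement,
    "END": EndStatement,
    ".": EndStatement,  # the final "." behaves like END
}


def parse_program(infile):
    """Read Facile source and return statements indexed by line number."""
    statements = [None]  # placeholder so that statements[1] is line 1
    for line in infile:
        # Legal Facile has no blank lines, so every line has a keyword.
        keyword, *args = line.split()
        statements.append(STATEMENT_CLASSES[keyword](args))
        if keyword == ".":
            break
    return statements


program = parse_program(io.StringIO("LET A 3\nPRINT A\n.\n"))
print(len(program) - 1, type(program[1]).__name__)  # prints 3 LetStatement
```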

You may assume that the input file contains a syntactically legal Facile program. It's acceptable for your program to either print an error message, ignore lines that aren't understood, or even crash in the event that it's given an input file that is not legal Facile. (It's a good thing Python interpreters don't behave this way.) We will only test your interpreter with syntactically legal Facile programs, though the programs may have run-time errors in them. As was discussed above, there are three kinds of run-time errors: division by zero, a RETURN statement without a corresponding GOSUB, and a GOTO/GOSUB/IF..THEN to a line outside of the boundaries of the program. Your interpreter will need to behave reasonably in these cases, by printing a meaningful error message and terminating gracefully.
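One possible way to terminate gracefully is to raise a dedicated exception wherever a run-time error is detected and catch it in one place. The sketch below shows this for the out-of-bounds jump; FacileRuntimeError, check_target, and run_safely are hypothetical names, and last_line is assumed to be the line number of the closing ".".

```python
class FacileRuntimeError(Exception):
    """Hypothetical exception for the three run-time errors listed above."""


def check_target(target, last_line):
    """Return target if it names a line of the program, else raise."""
    if not 1 <= target <= last_line:
        raise FacileRuntimeError("jump to nonexistent line %d" % target)
    return target


def run_safely(step):
    """Call step() and turn a run-time error into a message."""
    try:
        step()
        return "ok"
    except FacileRuntimeError as err:
        return "ERROR: %s" % err


print(run_safely(lambda: check_target(4, 5)))   # prints ok
print(run_safely(lambda: check_target(12, 5)))  # prints the error message
```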

Designing your interpreter: As the size of a program increases, one of the most difficult obstacles that programmers face is the need to "separate their concerns." One of the primary strategies that programmers use to separate their concerns is to break a large program into a set of smaller pieces. The obvious mechanism for breaking up a program in an object-oriented language is the use of classes.

Separating concerns is something novice programmers need to learn. The temptation is always to try to think about the complete picture, since this strategy works well for the short programs that you write when you're first starting out. As programs become larger, confusion naturally sets in, as the complete picture can be difficult to keep in your brain all at once. Even moderately small programs can be built out of many classes and encompass a great deal of complexity. This project will encourage you to begin thinking about your programs the same way, which will give you the ability to write much larger programs than you could before.

The main tasks that your program must perform are:

Read the contents of the input file, parsing each line, and storing an object into memory that represents the Facile statement appearing on that line.
Create a representation of the initial program state, then begin executing the program one statement at a time. The execution of each statement will cause the program state to be changed, and may also cause output to appear on the console.
Continue executing the program until an END statement or the "." is reached.

We suggest breaking up your program in the following way:

Main program. This will oversee the execution of the interpreter on one input file. Interpretation requires following the sequence of steps above: parsing the input file, creating an initial program state, then executing one statement at a time until the program ends. Most of the actual work is delegated to other classes, with the main program acting as a manager.
Parser. This parses the input file and returns a list of objects representing statements.
CallStack. A generic stack (which you can implement easily with a list). You'll use this to store the return points from GOSUB statements.
ProgramState. This represents the state of an executing Facile program. It contains the program counter, the values in each of the variables, and the call stack.
Statement. This class represents a Facile statement. Subclasses such as LetStatement, PrintStatement, etc., implement the actual statements.

It's a good idea to build as many of the underlying pieces as you need to implement a couple of the statements, say LET and PRINT, first. Afterwards, add new kinds of statements one or two at a time, making any changes required in the underlying pieces.
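To show how the pieces might fit together, here is a deliberately tiny end-to-end sketch with just four statement kinds. The class names are shortened, hypothetical versions of the ones suggested above, the statement list is built by hand instead of by a parser, and output is collected in a list rather than printed so the result is easy to check. None of this is meant as the required design.

```python
class Statement:
    def __init__(self, args):
        self.args = args


class Let(Statement):
    def execute(self, state):
        state.variables[self.args[0]] = int(self.args[1])
        state.pc += 1


class Print(Statement):
    def execute(self, state):
        state.output.append(state.variables[self.args[0]])
        state.pc += 1


class Goto(Statement):
    def execute(self, state):
        state.pc = int(self.args[0])


class End(Statement):
    def execute(self, state):
        state.running = False


class ProgramState:
    def __init__(self):
        self.pc, self.variables, self.running = 1, {}, True
        self.output = []  # collected here instead of printed, for testing


def run(statements):
    """Execute from line 1 until an End statement stops the program."""
    state = ProgramState()
    while state.running:
        statements[state.pc].execute(state)
    return state.output


# The GOTO example from earlier in the write-up; index 0 is unused.
program = [None, Let(["A", "1"]), Goto(["4"]), Let(["A", "2"]),
           Print(["A"]), End([])]
print(run(program))  # prints [1]
```

Adding a new kind of statement then means adding one class; the loop in run() stays the same.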

Facile quick reference: Here is a list of all of the Facile statements that should be supported by your interpreter, with a brief description of the effect of each. In each of the statements below, var is the name of a variable, int is an integer constant (e.g., 1, -3, 15), and linenum is a line number (1 or greater).

Statement	Description
LET var int	Changes the value of the variable var to the integer int.
PRINT var	Prints the value of the variable var to the console.
ADD var int	Adds int to the value of the variable var.
SUB var int	Subtracts int from the value of the variable var.
MULT var int	Multiplies the value of the variable var by the integer int.
DIV var int	Divides the value of the variable var by the integer int.
GOTO linenum	Jumps execution of the program to the line numbered linenum.
IF var op int THEN linenum	Compares the value of the variable var to the integer int using the relational operator op (=, <>, <, <=, >, >=). If the comparison is true, jumps execution of the program to the line numbered linenum. If not, this statement has no effect.
GOSUB linenum	Temporarily jumps to the line numbered linenum. A RETURN statement will cause execution to jump back to the line following the GOSUB.
RETURN	Jumps execution of the program back to the line following the most recently-executed GOSUB statement.
END	Ends the program immediately.
.	Special marker that indicates the end of the program text. Behaves as an END statement when encountered.

Additional challenges: Your Facile interpreter may implement some additional features; these are not required. Here are two additional statements:

Statement	Description
INC var	Adds 1 to the value of the variable var. For example, the statement INC A adds one to the value of A.
DEC var	Subtracts 1 from the value of the variable var. For example, the statement DEC A subtracts one from the value of A.

Including these statements in Facile does not dramatically increase its power, but it does allow for convenient incrementing and decrementing, which can be handy for constructing simple "loops."

Another improvement to the interpreter can increase the expressiveness of the language quite a bit: Consider a statement such as LET. As defined above, the LET statement sets the value of some variable to some integer constant. But imagine that you wanted to set the value of some variable to be equal to the value of some other variable. Facile, as defined above, does not allow this fundamental operation. But there's no reason it couldn't.

In many places where an integer constant may normally appear in a Facile program, your interpreter could also allow the name of a variable to appear. In the case of PRINT, you could also allow an integer constant instead of a variable name. So, for example, these statements may be given to the interpreter:

LET A B: Sets the value of A to be equal to the value of B.
PRINT 3: Prints the integer constant 3 to the console.
ADD B C: Adds the value of C to the value of B, storing the result in B.
SUB B C, MULT B C, DIV B C: similar to the ADD statement above
IF A <= B THEN 4: Jumps to line 4 if A is less than or equal to B.
IF 3 <= B THEN 4: Jumps to line 4 if 3 is less than or equal to B.
IF 4 <= 9 THEN 4: Jumps to line 4 if 4 is less than or equal to 9.
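If you try this extension, one helper can serve every statement that accepts either kind of operand. The sketch below uses a hypothetical function named operand_value and assumes that any token Python's int() accepts is a constant, with everything else being a variable name. That means a variable could not have a name that looks like an integer, a restriction the base language doesn't state.

```python
def operand_value(token, variables):
    """Return the integer a token denotes: a constant or a variable's value."""
    try:
        return int(token)          # "3" or "-9" is an integer constant
    except ValueError:
        return variables[token]    # anything else is treated as a variable name


variables = {"B": 4, "C": 6}
print(operand_value("3", variables), operand_value("C", variables))  # prints 3 6
```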

You might also consider designing and implementing some new statements to accomplish some of these important goals, or others of your own choosing:

Allow Facile programmers to put comments into their code. (This could bring up an interesting question about the design of the language: how should line numbers be counted if not all lines contain code?)
Define additional variables that can store string values instead of integers. The BASIC language names such variables with trailing $ characters. So you might have the variables A$, B$, C$, ..., each of which is capable of storing a string.
Add a statement, or perhaps a variant of the PRINT statement, to output a string of text (a string literal or the value of a string variable) to the console.
Allow the IF statement to compare two string variables, or to compare a string variable to a string literal.
Add a statement to read an integer and/or string from the console and store it in a variable.
Add a LABEL statement that takes a variable name whose value will be the next line number in the program. Then that variable could be used as the target for a GOTO or IF statement, saving you from having to count lines. (Even though it occurs in your code, LABEL isn't an executable statement at all; your parser needs to handle it once, when parsing.)

Finally, here's one more sample program. You can run it in Alex's Facile interpreter; you can read it to see what it does; you can use it to test your own interpreter (but it does require that you implement one of the optional features—two if you count comments).

* MY FIRST FACILE PROGRAM BY DAVID KAY
LET N 5
LET F 1
GOSUB 6
PRINT F
END
IF N > 1 THEN 8
RETURN
MULT F N
SUB N 1
GOSUB 6
RETURN
.

When you're done:

Submit all your Python source code in one .py file via Checkmate. Each pair should submit just one solution with both partners' names clearly indicated.

The usual grading criteria for lab assignments apply.

Fill out a partner evaluation at EEE.

Modified slightly to reflect Python by David G. Kay, Winter 2012. Testing section added back into the assignment by Alex Thornton, Winter 2007. Clarification to the GOTO/GOSUB error condition, specifying that the THEN part of IF statements is to be included, made by Alex Thornton, along with necessary changes. A new error condition (GOTO/GOSUB to non-existent line) added by Alex Thornton, along with necessary changes to provided interpreter, Winter 2006. A couple of very minor changes made by Alex Thornton, Fall 2005. Some minor changes introduced by Alex Thornton, Spring 2005. Assignment restructured (to require the use of an object-oriented solution, while no longer requiring the testing portion) by Alex Thornton, Fall 2004. Almost all of the work that was previously optional is now required, along with other heavy modifications, as well as improvements to the given interpreter by Alex Thornton, Winter 2003. Originally written by Alex Thornton, Fall 2002.