Informatics 45 Spring 2010
Project #2: Pictures of You

Project Plan due: Friday, April 23, 9:00pm
Program due: Monday, May 10, 9:00pm
Lessons Learned due: Wednesday, May 12, 9:00pm

You have the option of working on this project with a partner, using the "pair programming" technique, or working individually.


Introduction

Being able to write programs that can read from and write to files stored locally (say, on a hard drive or an attached USB drive) pushes out the boundaries of what your programs can accomplish. But being able to write programs that can connect to other programs via the Internet opens up an amazing variety of possibilities: programs that (1) can run on more than one machine and connect to one another; (2) can download data from the web; (3) can integrate Google searches into their output; (4) can combine information from multiple sources in useful ways; and many others.

This project begins our exploration of connecting our programs to the broader world around them, by asking you to implement a program that can connect to other programs via the Internet and share data. You'll learn about some of the underlying technology that makes it possible to write these kinds of programs, as well as some of the challenges that arise. This experience will, in turn, provide you with some exciting abilities, ones that we will use again in this course and that you will no doubt see again.


Choosing a partner (optional)

You have the option of working on this project with a partner, using the "pair programming" technique, or alone. Read through this project write-up on your own, then decide whether you'd like to work with a partner.

It's important to form your partnership early if you plan to have a partner, or to commit yourself to working alone if you plan to go that route. For that reason, we're requiring you to choose your partner and notify your TA during the lab meeting on Friday, April 16. We will be assuming that anyone who does not notify the TA of his or her partnership by the end of the April 16 lab section plans to work alone. There will be no exceptions to this!

If you're interested in partnering but are having trouble finding a partner, notify your TA, so that you can be assisted in finding one.


The problem

This project asks you to build a program that acts as a "picture frame," capable of displaying one image in a GUI window. The image can be in any format supported by Java automatically — these formats include well-known formats like JPEG, GIF, and PNG. If the image is larger than the size of the window, scrollbars should be provided; it is not necessary to scale the image to fit the window, though you can implement that as an option if you'd like. You're also required to use a menu to provide access to your program's various commands (look up JMenuBar, JMenu, and JMenuItem in the Java documentation for details) and a simple way to browse for image files on your computer (the JFileChooser class in the Java library provides a built-in solution to this problem). The GUI design details are otherwise up to you.

There is one interesting twist, though. In addition to being able to display a picture, your program also has to be able to do two other things:

Note that this requirement means that your programs will have to be interoperable; even though you're writing your programs separately, they'll have to know how to communicate with each other in an agreed-upon way. I've outlined that agreement, called a protocol, later in this write-up and your program is required to support it.

A couple of other important details:


Some technical information about the Internet

Writing programs that can communicate via the Internet requires some knowledge of how the Internet works. The Internet is a complex, many-layered combination of hardware and software, but you actually need to know surprisingly little about the underlying technology in order to write programs that use it. Still, there are issues that you will need to be aware of, especially if you want to do some or all of your work on your own computer.

In general, we will only support your work on the machines in the ICS labs, as they have been configured so that they will allow you to work unimpeded. I am providing a couple of pieces of advice here if you want to work on your own computer, though it should be pointed out that we cannot realistically support each of you with the various configuration issues that you might have. Effectively, when you're not using one of the ICS lab machines, you're on your own.

Loads of useful information is available online about all of the topics summarized below; few of the problems I discuss are insurmountable. (Of course, the challenge, as always, is to separate the useful information from the noise.) But all of these issues will have an effect on whether you can get your program to connect to other students' programs and send pictures to it, even if your program is completely correct, so I'd like you to be aware of them before you get started.

IP addresses

In general, every machine connected to the Internet has an IP address. An IP address is akin to a telephone number; by specifying that a message is to be sent to a particular IP address, the network will be able to determine who should receive the message and how the message should get there, hiding these details from the machines on either end.

An IP address is generally displayed as a sequence of four numbers separated by dots; each of the numbers has a value in the range 0-255 (a range chosen because values in this range can be stored in eight bits, or one byte). For example, as of this writing, one of the machines that acts as a "server" for the ICS web site has the address 128.195.1.76.
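To make the "four numbers, one byte each" idea concrete, here is a minimal sketch that turns the dotted form into four bytes and hands them to Java's InetAddress class. The class name DottedQuad and its parse method are my own inventions for illustration; they are not part of the Java library or of this project's requirements.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, for illustration only.
final class DottedQuad {
    /** Parses text such as "128.195.1.76" into its four byte values. */
    static byte[] parse(String text) {
        String[] parts = text.split("\\.", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four numbers: " + text);
        }
        byte[] bytes = new byte[4];
        for (int i = 0; i < 4; i++) {
            // parseInt throws NumberFormatException (an IllegalArgumentException) on non-numbers.
            int value = Integer.parseInt(parts[i]);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("out of range 0-255: " + value);
            }
            // Java bytes are signed, but only the eight-bit pattern matters here.
            bytes[i] = (byte) value;
        }
        return bytes;
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress address = InetAddress.getByAddress(parse("128.195.1.76"));
        System.out.println(address.getHostAddress()); // prints 128.195.1.76
    }
}
```

In practice you can usually pass the dotted text straight to InetAddress.getByName and skip the manual parsing; the sketch only shows what the notation means.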

If you want your program to offer a picture to another student's program, you'll have to know the IP address of that student's machine. There are a few ways to find out your own IP address; one of the simplest is to go to one of many web sites, such as whatismyip.org, that will tell you what your address is. If you want to find someone else's IP address, that's not always as simple; for the purposes of this project, the best thing to do will probably be to ask the other person what his or her address is.

The "loopback" address

There is a special range of addresses that can always be used to connect a computer to itself, regardless of what its IP address is. These are called "loopback" addresses, the most common of which is 127.0.0.1. So if you want to test your program by sending pictures from one instance of your program to another on the same computer, you can use 127.0.0.1 to do that.
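If you'd like to see the loopback range for yourself, the following sketch asks Java whether a few addresses are loopback addresses. The class name LoopbackCheck is made up for this example; InetAddress.getByName and isLoopbackAddress are standard library calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, for illustration only.
final class LoopbackCheck {
    static boolean isLoopback(String dottedQuad) throws UnknownHostException {
        // A literal IP address is parsed directly; no DNS lookup takes place.
        return InetAddress.getByName(dottedQuad).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(isLoopback("127.0.0.1"));    // prints true
        System.out.println(isLoopback("127.42.0.9"));   // prints true: same range
        System.out.println(isLoopback("128.195.1.76")); // prints false
    }
}
```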

Ports

When you want to connect a program to another program running on another machine, it's not enough to know the IP address of the other machine. Multiple programs on the same machine are likely to be connected to the Internet at any given time. So there needs to be a way to identify not only what machine you'd like to connect to, but also which program on that machine you'd like to connect to.

The mechanism used for this on the Internet is called a port. A program that listens for another program to connect to it will register its interest in a particular port by binding to it; ports have numbers that range from 0-65535 (a range chosen because values in this range can be stored in sixteen bits, or two bytes). Only one program can be bound to a port on a given machine at any given time.

It's generally a good idea not to use ports with numbers below 1024, because these tend to be reserved for common uses (e.g., web traffic, FTP traffic). Beyond that, you may discover some ports at or above 1024 in use — depending on what programs are running on your machine — but most should be available.

The important thing to realize here is this: In order to connect your program to another student's program, you'll need to know not only what their IP address is, but also what port their program is listening on.
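Here is a small throwaway sketch of what "binding" looks like in Java, and of the rule that only one program can be bound to a port at a time. The class and method names are mine; asking for port 0, which lets the operating system pick any free port, is a convenience for the demonstration, not something the project asks for.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical demonstration class, for illustration only.
final class PortBinding {
    /** Binds a port, then reports whether a second bind to that same port is refused. */
    static boolean secondBindIsRefused() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket first = new ServerSocket(0, 50, loopback)) {
            int port = first.getLocalPort();
            System.out.println("Listening on port " + port);
            try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                return false; // the second bind unexpectedly succeeded
            } catch (BindException e) {
                return true;  // the port is already in use
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Second bind refused: " + secondBindIsRefused());
    }
}
```

In your picture frame program you would more likely pick a specific port number of 1024 or above and tell other students what it is.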

The Domain Name System and DNS lookups

Though every machine connected to the Internet has an IP address, users don't typically use IP addresses on an everyday basis. Just as IP addresses are akin to telephone numbers, there is an Internet service called the Domain Name System (DNS) that acts as a phone book; given the name of a computer, it can tell you its IP address.

So, for example, when you brought up this web page in your browser, your browser first had to know the address of www.ics.uci.edu; it found this out by doing a DNS lookup, by sending a message to a Domain Name Server and asking "What is the IP address of www.ics.uci.edu?" In return, the browser received a message that said "It's 128.195.1.76," at which time your browser could connect to that address and download this web page.

DNS is unlikely to affect your work on this assignment, since you're best off just using IP addresses to connect to other students' programs, but it's worth knowing about this in the context of the work that we're doing (and will be doing as we move forward this quarter).

Note that the "loopback" address has its own name: localhost. The name localhost always resolves to a "loopback" address.
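A DNS lookup is a one-line affair in Java. The sketch below, with a class name I made up, resolves a name to an address using the standard InetAddress.getByName call. I use localhost in the example because it resolves without contacting an outside server; what any other name resolves to depends on when and where you run it.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, for illustration only.
final class NameLookup {
    /** Resolves a host name to the textual form of one of its IP addresses. */
    static String resolve(String hostName) throws UnknownHostException {
        return InetAddress.getByName(hostName).getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Typically a loopback address such as 127.0.0.1.
        System.out.println(resolve("localhost"));
    }
}
```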

Firewalls

The machines in the ICS labs have been set up specifically to allow them to access one another on any port (though machines outside of the ICS network may not be able to contact them). This means, once your program is finished, that you will be able to run your program in the ICS labs and send pictures to other students running their programs there. Outside of the lab, though, you may run into some difficulty in getting your program to work, not because your program is incorrect, but because the open environment provided by the ICS labs is no longer the norm.

The Internet offers a certain amount of anonymity — it's hard to know who's contacting you or what their motives are if all you can see is their IP address and what port they're connecting to. This kind of anonymity has its benefits, though it also has its serious downsides; when you can't know who's contacting you and can't know what they're trying to accomplish, and when you can't always trust your operating system and other software not to provide outsiders access to information they shouldn't have, the wise solution is to restrict incoming traffic. The theory is that if no one can connect to you, no one can take advantage of you (without you having "asked for it," in some sense, by connecting to them). This is the theory behind firewalls, which are software or hardware that restrict other computers' access to computers behind them.

It was once the case that firewalls were mostly used in businesses, as they were the primary targets of online crime and mischief. Nowadays, though, many computers come with firewall software built into them — Windows XP and Vista, for example, ship with firewall software as a standard, enabled feature. This may make it more difficult for other students' programs to connect to yours, because your computer may be configured to disallow incoming connections. Some firewall software also allows you to disallow certain kinds of outgoing connections, which might also affect your ability to connect to programs running on other machines. There are usually ways to "open a port," which means you've told your firewall to allow traffic on a certain port to move into or out of your machine, while traffic on other ports will still be forbidden. Details of how you do this vary from one context to another, but there is a fair amount of documentation online if you want to learn how to open ports using your particular combination of hardware and/or software firewalls.

I should point out, also, that some Internet service providers have their own firewalls and traffic limitations in place, so if you're working from home, your experience — especially in terms of being able to have others connect to you — may vary considerably depending on your provider.

Routers and network address translation (NAT)

In general, every machine connected to the Internet has an IP address. However, many of us are not connected directly to the Internet at all. For example, I have several computers in my home, but I have only one Internet connection: a cable modem. In order to use more than one of my computers at a time online, it's necessary for me to have some way of sharing that connection.

In order to do that, I do what most people do in this situation: I use a device called a router. The router is connected to my one Internet connection. Whenever it's connected, it has an IP address. (Most home Internet users, me included, don't get the same IP address every time they connect, though, which is one reason why it's often hard to run a server from your home.) My computers don't connect directly to the Internet at all; instead, they connect to the router. The router's job is to forward outgoing traffic from each computer to the single Internet connection, and to take the incoming traffic and route it to the appropriate computer.

The router and my computers form their own local-area network, or LAN. The router assigns a "fake" IP address to each of my computers, using a range of addresses that is never assigned to computers on the Internet. As traffic flows into and out of the router, it performs a task called network address translation, or NAT, which means that it converts the internal, "fake" IP addresses to its own IP address for traffic going out, and converts its own IP address back to the "fake" IP addresses on the way back in. As far as the outside world is concerned, I don't have many computers; I just have one: the router. Many routers also act as firewalls, disallowing incoming traffic in most cases.

Why this will affect you when you work from home on this project is that your router will make it much more difficult for other students' programs to connect to yours. They'll need to have your router's IP address, not your computer's IP address. (That's not hard to give them, since web sites like whatismyip.org will show you the router's IP address, as that's the only address that the outside world ever sees.) You'll also need to configure your router to allow incoming traffic on at least one port, and to send incoming traffic on that port to a particular one of your computers. Details of how you set this up vary considerably from one router to another, but are generally available online if you know what model of router you have.

Wait... I'm getting overwhelmed!

Don't worry! I present these details as useful background information, though they're not the core focus of this project. (They are details that are worth knowing if you want to be a part of the technology field, though.) Other than the use of IP addresses and ports, none of these details is likely to affect your work all that much so long as you use the computers in the ICS labs.


The Informatics 45 Picture Frame Protocol (I45PFP)

What is a protocol?

Though each of you will be writing a completely separate program, your programs must be able to send and receive pictures between one another via the Internet. That requires us to agree on a single mechanism for doing that, so that each program will know precisely how to send the picture and, in turn, precisely how to receive pictures from others.

Part of our agreement is that we'll use a standard abstraction for Internet communication: sockets. A socket is an object that hides the underlying details of a network connection. The underlying network technology is complex: information is actually sent across the Internet by breaking it up into small pieces and sending those pieces out into the network separately, so they may arrive at their destination in a different order than they were sent, and some may not arrive at all and will have to be re-sent. A socket hides all of this and makes the connection appear, to your program, to consist of two streams, an input stream and an output stream. Data placed into the output stream of one program's socket will arrive in the same order in the input stream of the other's. It is important to realize that networks are unreliable; there's no guarantee that the data you send will ever get to the recipient, but you are guaranteed that, if it does, it will be placed into the input stream of the recipient's socket in the same order that you sent it.
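The following exploratory sketch shows the "two streams" view of a socket. One thread writes a few lines into its socket's output stream, and the other end reads them from its input stream in the same order. Everything runs on one machine over the loopback address. The class name, the method name, and the use of newline-terminated text are choices I made for the demonstration; they are not the project's protocol.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class, for illustration only.
final class StreamDemo {
    /** Sends each line over a loopback connection; returns what arrived, joined by '|'. */
    static String sendAndCollect(String... lines) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            int port = listener.getLocalPort();

            // The sending side: connect, write to the output stream, then close.
            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket(loopback, port);
                     PrintWriter out = new PrintWriter(new OutputStreamWriter(
                             socket.getOutputStream(), StandardCharsets.UTF_8))) {
                    for (String line : lines) {
                        out.print(line + "\n");
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            sender.start();

            // The receiving side: accept the connection and read from the input stream.
            StringBuilder received = new StringBuilder();
            try (Socket socket = listener.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         socket.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) { // null: the sender closed its end
                    if (received.length() > 0) {
                        received.append('|');
                    }
                    received.append(line);
                }
            }
            sender.join();
            return received.toString();
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println(sendAndCollect("one", "two", "three"));
    }
}
```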

Using sockets is not enough, though. Any time you want programs to be able to communicate via the Internet, there needs to be a protocol, which is a set of rules governing what each party will send and receive, and when they will do it. You can think of a protocol like a very rigidly-defined kind of conversation, with each participant knowing its role, so that it will know what to say and what to expect the other participant to say at any given time.

Many protocols have been defined that govern how various programs send and receive information via the Internet. For example, the Hypertext Transfer Protocol (HTTP) is what your browser uses to connect to a web server, request a web page, and receive a response. (That protocol is defined in all of its detail at this link. It has nothing to do with this project, but if you're curious how a "real" network protocol is defined, look no further.) Since all browsers and all web servers conform to the same HTTP protocol, they can interoperate, even though they are written by different groups of people, run on different operating systems, and provide different user interfaces.

For this project, we'll need a protocol. Our protocol is called the Informatics 45 Picture Frame Protocol; since technical people are so fond of acronyms, we'll use an acronym, too: I45PFP.

The definition of I45PFP

I45PFP conversations are relatively short: they last only long enough for the sender to send one picture to the receiver, then they're terminated. If you want to send another picture to the same receiver, another connection will have to be made and another I45PFP conversation will need to take place.

I45PFP conversations are between two parties, which we'll call the sender and the receiver. I45PFP conversations proceed in the following sequence:

So, a sample conversation looks like this:

Sender Receiver
I45PFP_HELLO
GOAHEAD
IMAGE 10240 Boo with underbite
ACCEPT
(All 10240 bytes of the image)
GOODBYE
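As a sketch of how a receiver might pick apart the IMAGE line in the sample conversation, the code below splits it into a byte count and a description. I'm assuming, based on the sample line alone, that the layout is the word IMAGE, a space, the size in bytes, a space, and then free-form text; the class ImageHeader and the decision to allow an empty description are my own and should be checked against the protocol definition.

```java
// Hypothetical value class, for illustration only.
final class ImageHeader {
    final int byteCount;
    final String description;

    private ImageHeader(int byteCount, String description) {
        this.byteCount = byteCount;
        this.description = description;
    }

    /** Parses a line such as "IMAGE 10240 Boo with underbite". */
    static ImageHeader parse(String line) {
        // Limit of 3 keeps any spaces inside the description intact.
        String[] parts = line.split(" ", 3);
        if (parts.length < 2 || !parts[0].equals("IMAGE")) {
            throw new IllegalArgumentException("not an IMAGE line: " + line);
        }
        int count;
        try {
            count = Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("bad size in: " + line, e);
        }
        if (count < 0) {
            throw new IllegalArgumentException("negative size in: " + line);
        }
        // Assumption: a missing description is treated as empty rather than as an error.
        String description = parts.length == 3 ? parts[2] : "";
        return new ImageHeader(count, description);
    }

    public static void main(String[] args) {
        ImageHeader header = parse("IMAGE 10240 Boo with underbite");
        System.out.println(header.byteCount);   // prints 10240
        System.out.println(header.description); // prints Boo with underbite
    }
}
```

Knowing the byte count up front is what lets the receiver read exactly that many bytes of image data and no more.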

How you should handle erroneous socket input

Your program is not permitted to assume that all input will be correct. When it receives input that does not conform to the protocol, your program must immediately close the connection. (This is a rudimentary, but nonetheless effective, form of security: if someone connects and won't play by the rules, hang up on them.)
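One way to apply the "hang up" rule is to check each incoming line against what the protocol calls for at that point, and to close the connection the moment they differ. The sketch below assumes messages arrive as lines of text; ProtocolGuard and expect are names I invented. It accepts any Closeable, so a Socket can be passed in directly.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper, for illustration only.
final class ProtocolGuard {
    /**
     * Reads one line. If it is exactly the expected message, returns true;
     * otherwise closes the connection and returns false.
     */
    static boolean expect(BufferedReader in, String expected, Closeable connection)
            throws IOException {
        String line = in.readLine(); // null if the other side has already hung up
        if (expected.equals(line)) {
            return true;
        }
        connection.close();
        return false;
    }

    public static void main(String[] args) throws IOException {
        // A browser-style request stands in for a program that isn't speaking our protocol.
        BufferedReader in = new BufferedReader(new StringReader("GET / HTTP/1.1\n"));
        Closeable fakeConnection = () -> System.out.println("connection closed");
        System.out.println(expect(in, "I45PFP_HELLO", fakeConnection));
    }
}
```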

Your program should also gracefully handle the situation where the file sent to it is not an image that can be displayed.
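One possible approach to detecting an undisplayable file uses ImageIO.read, which returns null when none of Java's built-in image readers recognizes the data. The sketch below wraps that in a helper of my own naming; how your program reports the problem to the user is up to you.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.imageio.ImageIO;

// Hypothetical helper, for illustration only.
final class ImageDecoding {
    static {
        // Keep decoding in memory rather than in temporary cache files.
        ImageIO.setUseCache(false);
    }

    /** Returns the decoded image, or null if the bytes can't be read as an image. */
    static BufferedImage decodeOrNull(byte[] data) {
        try {
            // ImageIO.read returns null when no registered reader recognizes the data.
            return ImageIO.read(new ByteArrayInputStream(data));
        } catch (IOException e) {
            // The format was recognized, but reading the data failed.
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny PNG in memory so the example needs no files.
        BufferedImage tiny = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ImageIO.write(tiny, "png", buffer);

        BufferedImage decoded = decodeOrNull(buffer.toByteArray());
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());

        byte[] junk = "not an image".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeOrNull(junk) == null);
    }
}
```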

A note about the design of I45PFP

You may wonder why the first message is more cryptic than the others; the first message is I45PFP_HELLO instead of just HELLO, while the others are regular English words. Just like it's important that file formats contain enough information to make it clear what format the file is in — for example, we saw in lecture that the JPEG image format contains the characters Exif in a particular place, as well as a couple of other distinguishing characteristics that have nothing to do with the image they represent — it's also important that a protocol begins with a message that will distinguish it from other protocols. By starting our conversation with something "special" like I45PFP_HELLO, the receiver can be sure that the sender intends to have a conversation using our protocol, rather than something else. (After all, you can connect any program that uses sockets, even a browser, to your picture frame program, though the conversation won't get very far before the picture frame program realizes that it's receiving the wrong kind of traffic and hangs up.)


The planning phase

As in the previous project, you are encouraged to spend some time planning your program before you implement it. During that planning phase, it would be wise to write some code that explores aspects of the program that you think will be tricky. (Trust me; there are some aspects of this that will be tricky, but nothing that is insurmountable.) It's often a lot easier to figure out how something works in a simple context, then try to use it in a more complex one; exploratory code is something that I write often when I'm working on something, especially when it uses technologies or libraries that I'm not very familiar with. You know already what things are new to you: you've likely never used a socket or run multiple threads, for example. Try them out with a couple of simple "throwaway" programs first, before you tear into your picture frame program.

The project plan

As before, you'll be required to write a project plan that summarizes the planning you did, addressing at least the following questions. (You're free, again, to include any additional information you'd like; spending a little time putting your thoughts down in writing is a good way to get your thoughts organized.)

The project plan is due on Friday, April 23 at 11:59pm. See the section titled "Deliverables" below for more information about how to submit the various parts of your project.


Writing the program

The program should be written entirely in Java. The GUI should be built using the Swing library. In addition to Swing, you are free to use any part of the Java library that you would find helpful, though you are not permitted to use other components (e.g., open source components) on this project.

The program is due on Monday, May 10 at 11:59pm. See the section titled "Deliverables" below for more information about how to submit the various parts of your project.


Assessing the lessons learned

Once you've completed your implementation and submitted it, take a little time to reflect on your experience by writing a lessons learned document. Your lessons learned document should reflect on, at least, the following questions.

The lessons learned document is due on Wednesday, May 12 at 11:59pm. See the section titled "Deliverables" below for more information about how to submit the various parts of your project.


Deliverables

You are required to deliver the three parts of the project to Checkmate, an ICS-built online assignment submission system. Follow this link for a discussion of how to submit files via Checkmate. Be aware that I'll be holding you to all of the rules specified in that document, including the one that says that you're responsible for submitting the version of the project that you want graded. We won't regrade a project simply because you submitted the wrong version accidentally.

There are three parts to this project, each with its own due date:

If you partnered up, be aware that it is only necessary for one of the two partners to submit the project; we would prefer that the same partner submit all three parts, so that they will all be identified together in Checkmate. Your TA is aware of the partnerships and will figure out which project submissions belong to which pairing. Put the names and student IDs of both partners in a comment at the top of each of your .java files and documents. Afterward, take a moment to be sure that you submitted all of the files you intended to; if you missed one, we won't be able to compile and run your program, which can result in a substantial penalty, since we won't be able to evaluate your program's correctness.