Inf 43: Introduction to Software Engineering

Fall, 2014
Homework 2: Version Control with Git
Part A due Tuesday, November 18, 11:55pm
Part B due Tuesday, November 25, 11:55pm
Part C due Thursday, December 4, 5:00pm


Homework 2 will introduce you to the concepts and software behind version control, using the example of Git.

The assignment consists of three parts.


Part A

Part A focuses on installing Git and becoming familiar with its basic concepts and commands.

There's no question that installing software can be tricky. If you run into problems, post them on Piazza with as much detail as possible, and help will soon be on the way.

  1. Read Pro Git, by Scott Chacon, all of chapter 1 except 1.5 and 1.6, and sections 2.1-2.4 of chapter 2.
  2. Choose how you will use Git:
  3. Note: Bash and Terminal (on a Mac) are examples of a "shell", a Unix concept that has proved quite durable. In a shell (or command prompt window) you can enter commands which cause actions, such as listing, creating, and deleting files. Some useful commands that work in the Git Bash shell and in Terminal:
  4. Run the command git help, and take a screen print showing the results. (On a Windows machine, use Alt-PrintScreen to put the active window image on the clipboard. On a Mac, see this page.) Paste the screen print into a Word file.
  5. GitHub is a widely used location to store git repositories. You'll probably need a GitHub login name in your later courses or in your software engineering career. If you don't have an account on GitHub.com, create one at https://github.com/. Put your GitHub username at the top of the Word file from the previous step.
  6. Upload your Word file to the EEE Drop Box "Homework 2A" before Tuesday, November 18, 11:55pm.
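
As a rough illustration of the kind of shell commands mentioned in step 3 (listing, creating, and deleting files), here is a minimal sketch. It is not the assignment's own list of commands; the folder and file names (practice, notes.txt) are made up for the example, and everything happens in a throwaway temporary folder.

```shell
set -e
start=$(mktemp -d)          # throwaway folder, so none of your own files are touched
cd "$start"
pwd                         # print the current folder
mkdir practice              # create a folder (hypothetical name)
cd practice                 # move into it
echo "hello" > notes.txt    # create a small text file (hypothetical name)
ls                          # list the files in the current folder
cat notes.txt               # print the contents of the file
rm notes.txt                # delete the file
cd ..                       # go back up one level
rmdir practice              # remove the now-empty folder
```

These commands should behave the same way in Git Bash on Windows and in Terminal on a Mac.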


Part B

In Part B you will create a local repository and perform basic operations on it.


  1. Create (somewhere on your computer) a new empty folder named Inf43Hw2. In that folder, create a plain text file with your name and student ID # on a single line. Don't use MS Word for this; use a text editor such as Notepad or TextEdit (you may need to choose Plain Text under Format or Preferences). Save the file as file1.txt.
  2. In your Git shell, navigate to Inf43Hw2. Use the cd command to change your current folder/directory. Note that the Windows Git Bash shell follows Unix/Linux shell conventions, so even on Windows you need to use Linux-style paths with forward slashes (e.g., c:\my_folder\my_subfolder would be /c/my_folder/my_subfolder). Linux commands like ls, pwd, and grep should all work in the Git Bash shell.
  3. Note: At this point you may find that Git wants you to tell it your name and email address. You can do this with two commands like these: 
  4. Create a local Git repo by running the command git init.
  5. Run git status. Note that file1.txt is listed as untracked. We want Git to track it, so run git add file1.txt. When you "add" a file you are telling Git to keep track of it. "add" also tells Git to stage the file, which means to put it in the state of being ready to be committed.
  6. Run git status again. Note that file1.txt is now listed as a file to be committed (i.e., it's staged).
  7. Let's commit file1.txt to our repository. Run git commit -m "Committing a new file with my name". When you "commit", you in effect copy all staged files to the repository. The "-m" is a flag (that's what the hyphen indicates) which tells Git that the following string is a message to record with the commit.
  8. Run git log. This will display the history of changes made to the repository. The one and only entry will be for the commit of file1.txt you just did.
  9. Edit file1.txt and change the spelling of your name to something incorrect. Save file1.txt with the error. (This small error stands in for a long complex series of edits that you want to undo.)
  10. Run git reset --hard. "reset --hard" discards all uncommitted changes to tracked files, so those files return to their contents as of the latest commit. There are many ways to undo changes in git, and "reset --hard" is generally considered dangerous because the discarded changes cannot be recovered. Look at file1.txt and observe the effect of reset --hard.
  11. Edit file1.txt to remove your student ID# and include the name of your major, and save the file.
  12. Commit with git commit -m "Now has my major". This doesn't work. Git tells you there are "changes not staged for commit".
  13. Try again with git commit -a -m "Now has my major". The power of the "-a" flag is that it tells git to automatically stage all tracked, modified files before the commit.
  14. You can also explicitly stage a file. Add the name of your favorite restaurant and favorite movie to file1.txt, save it, and run git stage file1.txt. Now run git commit -m "Added favorite restaurant" to commit. "git stage" is really just another name for "git add".
  15. You set the commit message to "Added favorite restaurant", but the file also includes your favorite movie, so maybe we should have included that in our commit message. Amend your commit message with git commit --amend -m "Added favorite restaurant and movie".
  16. Run git log to make sure you have successfully changed history.
  17. You removed your student ID# a few steps back. Let that edit stand in for deleting, a few months ago, a block of code that you now want to examine. git will help you go back in time. Note that each commit has a long, seemingly random string of hexadecimal digits associated with it. This is called a "hash" and is a unique identifier for the commit. Find the hash associated with your first commit, the one with the message "Committing a new file with my name". Run git checkout xxxx, replacing xxxx with the first four digits from that hash (thankfully typing in the entire hash is not required). You will see a frightening message about a detached HEAD.
  18. git can keep track of separate, parallel streams of edits to a project. Each stream of edits is called a branch, and a branch can have a name. For instance, multiple programmers who are working on and committing changes to the same file will probably establish different branches. HEAD is git-ese for the current (not necessarily the last) commit in the current branch. Since we've gone back in time and are potentially (but haven't yet) starting a new branch, HEAD is "detached" (from any established, named branch). Ouch!
  19. Take a look at file1.txt and note the later-deleted Student ID#. Now to return to the present: git checkout master. "master" is the name of the default branch created when the repository was made. Look at file1.txt again. Run git log again and you'll see it has the same three commits.
  20. Create a new text file called file2.txt that contains your expected graduation year and first job title on a single line.
  21. Stage file2.txt, and then commit it with a useful message.
  22. Run git log. Notice that you see log entries for all four commits that you've performed.
  23. Run git log file2.txt. Notice that you only see the log entry involving file2.txt.
  24. Modify file1.txt to have the name of your favorite color on a new line.
  25. Delete file2.txt.
  26. Run git status. Note that file2.txt is listed as deleted. Also note that the status information helpfully says "git add/rm ..." to update what will be committed.
  27. You use the git add command to stage a new or modified file. To stage the deletion of a file, however, the usual command is git rm. So run the commands git add file1.txt and git rm file2.txt to set the stage.
  28. Commit the changes with the commit message "Deleting file2.txt". 
  29. Run the command git log -p -3. The -p flag will show you the diffs for each change. The -3 will limit what's displayed to the last 3 log entries. (If the output ends with a colon, Git is paging the output: press the space bar to see the next page and q to quit.) Take a few minutes to look carefully at the output log and see if you can figure out how to interpret it.
  30. Now you decide you actually wanted to keep file2.txt, but you deleted it! Fortunately, you had added it to git, so you can still get it back. There are several ways to do this. The simplest is probably to use the command git checkout HEAD~1 file2.txt. What does this do? HEAD represents the current commit, which here is the most recent snapshot. ~1 tells Git to go back one version from that snapshot (i.e., HEAD). In this older snapshot, file2.txt still existed, and checkout tells Git to retrieve it. If you now look in your folder, you'll see file2.txt. And is file1.txt changed?
  31. Run git add file2.txt to stage file2.
  32. Run git status. Note that file2.txt is staged. Commit it with the commit message "Re-adding file2.txt".
  33. Run the command git log -p to see all of the log entries.
  34. Now run the command git log -p > git_log_partB.txt. (The > is a shell operator that redirects the output of the program on its left to the file named on its right.)
  35. Open git_log_partB.txt. It should look just like the output you saw for step 33. If you're on Windows and viewing it in Notepad, the spacing will probably look wrong, so try opening it in a different text editor (like Wordpad or Notepad++).
  36. Upload git_log_partB.txt to the EEE Drop Box "Homework 2B" before Tuesday, November 25, 11:55pm.
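
The core of the Part B workflow can be condensed into a short script. This is only a sketch for review, not a substitute for doing the steps by hand: it runs in a throwaway temporary folder rather than Inf43Hw2, the name, email address, and file contents are placeholders, and the identity is set for this one repository only (git config with the --global option would set it for every repository on your machine).

```shell
set -e
repo=$(mktemp -d)                         # throwaway folder, not your real Inf43Hw2
cd "$repo"
git init -q                               # create an empty local repository
git config user.name "Pat Example"        # placeholder identity, this repository only
git config user.email "pat@example.com"

echo "Pat Example 12345678" > file1.txt
git add file1.txt                         # track and stage the file
git commit -q -m "Committing a new file with my name"

echo "Patt Exampel 12345678" > file1.txt  # a mistake we want to undo
git reset -q --hard                       # discard uncommitted changes
cat file1.txt                             # back to the committed contents

echo "Pat Example, Informatics" > file1.txt
git commit -q -a -m "Now has my major"    # -a stages tracked, modified files first

echo "2018 Software Engineer" > file2.txt
git add file2.txt
git commit -q -m "Adding file2.txt"

git rm -q file2.txt                       # delete the file and stage the deletion
git commit -q -m "Deleting file2.txt"

git checkout HEAD~1 file2.txt             # retrieve file2.txt from one commit back
git add file2.txt                         # mirrors step 31
git commit -q -m "Re-adding file2.txt"

git log --oneline                         # one line per commit, newest first
```

The -q flags only suppress Git's progress messages so the output stays short; leave them off to see what Git normally prints.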


Part C

Much of Git's power comes from the interaction between your local repository and remote repositories. Remote Git repositories help you collaborate with others on a software project.

There are several services out there that will host a remote git repository for you for free, the most popular of which is GitHub.com. GitHub.com is particularly popular among the open source community, and there are pieces of software you probably use every day that are hosted on GitHub.com. Reddit is one such web application whose source code is maintained on GitHub.

  1. Let's use git to view the Reddit source code repository hosted on GitHub. Start git, use cd to navigate to an empty folder, and run the command git clone https://github.com/reddit/reddit.git.
  2. You now have a local copy (on your computer) of the remote repository. It's important to understand that this is not only a copy of the source code, but also a copy of the history of changes stored by git. And since it's a local repository, all the commands you used in the previous part of the assignment will work. To try this out, run the command cd reddit to go inside the reddit project folder, then run git log. This shows you the most recent log messages for changes made to the Reddit source code.
  3. Here's another variation on the git log command to try: git log -1 -p --before='2014-03-31 11:52:45'. This is nothing new except for the --before='2014-03-31 11:52:45' part. That tells git you only want to see log entries for changes made before March 31, 2014 at 11:52:45am. As you saw in part B, the -1 means you only want to see one entry, and the -p means you want to see a diff of the changes. As you can see, a contributor named "Andre D" changed the mouseover property to "cancelTimeout" from "queueShow" in the file r2/r2/public/static/js/saved.js.
  4. Since we have a complete copy of the Reddit project's repository, we also have a copy of every snapshot going all the way back to the beginning of the project. To see the log entries for the earliest commits, run the command git log --reverse.
  5. Recall that each commit is given a unique hash (aka SHA1). Many commands in Git can take a hash as input. For example, try git log -1 4778b17e939e119417cc5ec25b82c4e9a65621b2 and git show 4778b17e939e119417cc5ec25b82c4e9a65621b2. (Don't forget that you can type just the first four digits of the hash. If git complains that the short SHA1 is ambiguous because more than one commit starts with the same four digits, try adding a few more digits from the long hash.)
  6. One more git log option to know about is --skip=N, where N is a non-negative integer. This means to skip N commits before starting to show the commit output. Try git log --skip=100 -5.
  7. Run a log showing five commits to Reddit, skipping the first N commits, where N is the first three digits of your UCI student id number. For instance, if your id is 12345678, then N is 123. Leading zeros are no problem. Redirect the log output to a text file and turn in a printed copy of the file by Thursday, December 4, 5:00pm in lecture.
  8. If you wanted (and had the permissions), you could modify part of the Reddit source, commit your changes (on your local repo), and then use git push to copy your local branch back to the GitHub repository. That's beyond the scope of Informatics 43, but it's good to know that git is a sophisticated tool that facilitates multiple people updating shared files, logging their updates, and assisting with the resolution of updates that conflict.
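
To experiment with the git log options from Part C without downloading anything, the sketch below builds a small repository with ten commits and clones it from a local path. The repository names, file name, identity, and commit messages are all invented for the example; with the real Reddit clone you would run the same git log commands inside the reddit folder.

```shell
set -e
work=$(mktemp -d)                     # throwaway folder for both repositories
cd "$work"
git init -q origin-repo               # stands in for the remote repository
cd origin-repo
git config user.name "Pat Example"    # placeholder identity
git config user.email "pat@example.com"
for i in 1 2 3 4 5 6 7 8 9 10; do     # ten small commits to give us some history
  echo "line $i" >> notes.txt
  git add notes.txt
  git commit -q -m "Change $i"
done

cd "$work"
git clone -q origin-repo my-clone     # copies the files and the full history
cd my-clone

git log --oneline -3                  # the three most recent commits
git log --oneline --reverse           # oldest commit first
git log --oneline --skip=4 -2         # skip the four newest, then show two

first=$(git rev-list --max-parents=0 HEAD)   # hash of the very first commit
git show --stat "$first"              # commands like show accept a hash

git log --skip=4 -2 --format=%s > skipped_log.txt   # redirect output to a file
cat skipped_log.txt
```

With ten commits, skipping four and showing two leaves "Change 6" and "Change 5" in skipped_log.txt, which is the same pattern step 7 asks for with N taken from your student ID.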