Winter 2014

"An introduction to information retrieval including indexing, retrieval, classifying, and clustering text and multimedia documents." (catalog)

Instructor: Professor Don Patterson
Lecture: TuTh: 11:00 - 12:20
Classroom: MSTB 120
Discussion Section: W: 4:00 - 4:50
Classroom: MSTB 118
Office Hours: Fridays 1:30 - 2:30pm in DBH 5011.
Piazza Link:
Teaching Assistant: Tao Wang
Office Hours: Friday 10:00am - 11:00am in LUCI Lab

Textbook: An Introduction to Information Retrieval by Manning, Raghavan, Schütze

Class Attendance / Participation (2 dropped) ~20%
Assignments ~45%
Quizzes ~35%

As the class progresses I may find it necessary to alter the percentages.

This class is a lot of work. Here are some things you are going to be learning:
  • Crawling the web
  • Indexing the data
  • Graph traversal and storage
  • Page Rank
  • Hadoop

I prefer to give many small assignments, which build up a picture of overall student learning, rather than rely on one or two large exams that students may bomb because of complications unrelated to learning.

At the end of the day, learning is the responsibility of the student. I consider myself someone who points students in the right direction and can and will explain the fundamentals of a subject. I can't actually do the work of learning for a student. That takes effort and self-initiative. I will help to provide structure and motivation for that learning, but you also need to learn how to expand on this subject yourself. In a technical field like this, you will fall behind in about six months if you can't keep learning on your own, regardless of how well I present the subject matter.

I like to stop talking periodically and let students ask questions.

Class attendance will be determined by completing index cards. The index cards are also a means for me to get feedback about the course.

For each class, please write your name, the date, your student ID, and a comment about the course on a card.

If you would like to submit an anonymous comment, take an extra card and don't put your name on it.

Collecting feedback this way gives me a useful source of input that differs from other media. It helps me track how well the class is understanding the material. Something about paper causes students to say different things than they do on a website.

We use three online tools to administer the class. Please familiarize yourself with their use and location:

  1. EEE: This is the UCI-run suite of online resources used primarily to keep track of your grades, issue quizzes and surveys, and conduct some communication.
  2. Piazza: This is the primary communication mechanism for the course. It is a Q&A site that enables rapid feedback to the entire class as well as individual messaging.
  3. Web Page: This is the authoritative location for assignments, class materials, and due dates.

Several "lab" assignments will be given, consisting primarily of self-directed learning tasks.

The goal of these is to give you a chance to familiarize yourself with building the basic software technologies behind modern search engines. Rather than producing extensive deliverables, the focus is on learning to teach yourself, from online resources, how to build components. This will hopefully form the basis for creating more extensive projects in the future.

Several quizzes will be given covering the assigned readings and class discussions. The goal of the quizzes is to motivate you to process and learn the material in the textbook and lectures.