To appear
in the Proceedings of the ACM CHI'96 Conference, April, 1996.
Do-I-Care: A Collaborative Web
Agent
Brian Starr
Mark S. Ackerman
Michael Pazzani
Information and Computer Science
University of California, Irvine
{bstarr, ackerman, pazzani}@ics.uci.edu
http://www.ics.uci.edu/CORPS/dica.html
Social filtering and collaborative resource discovery mechanisms often fail
because of the extra burden, even tiny, placed on the user. This work proposes
an innovative World Wide Web agent that uses a model of collaboration that
leverages the natural incentives for individual users to easily provide for
collaborative work.
KEYWORDS: computer-supported cooperative work, CSCW, social filtering,
collaboration, World Wide Web
Collaborative efforts to discover Internet resources, such as interesting Web
pages, require work by users. Unfortunately, users often will not do the
extra, altruistic work, leading to adoption failures for social filtering
systems.
In the work described below, individuals engage in resource discovery for
themselves. Without extra work, other individuals can use that effort.
We do this by using a naturally reoccurring problem with the Web. After one
has discovered Web resources, a problem remains: How do you know when to
revisit those resources for new material? Consider some common
post-discovery situations:
- You know the page where a colleague has his publications. You would like
to obtain new papers as they are posted. Furthermore, you would like to know
about interesting papers, rather than just any papers.
- You have obtained the page for the low cost clothes merchandiser. You
would like to know when there is a sale on jeans or Hawaiian shirts.
These
are examples of change events that occur somewhat unpredictably and offer no
explicit notification. Today, interested parties must occasionally check by
hand. This can be burdensome and tedious, and one may forget to do it on a
timely basis. Furthermore, we would like an agent to discover
interesting changes, rather than some insignificant change.
In order to track interesting changes, an agent needs to know what changes are
interesting and where to find them. This paper is an overview of an agent that
collaborates with both users and their peers to identify potentially
interesting changes. The agent works by soliciting user opinions about changes
it finds to train a user model; an individual does this for her own good. The
agent can then share these opinions and findings with agents of like-minded
users, providing for collaborative effort. This agent, Do-I-Care, and its
model of collaboration are described below.
There has been considerable effort devoted to the resource discovery problem on
the Web, mostly focusing on three types of technical methods. The first, active
browser systems (e.g., Fish Search and WebWatcher), offer navigational
suggestions that direct the user along a path. User models are obtained
directly from users while they are browsing and require their active
participation.
The second, index creation and lookup systems (e.g., Harvest and Lycos), treat
the Web as a database of information to be cataloged and indexed. Users must
be able to articulate their information goals and may need to periodically
resubmit their searches to obtain new or future information.
There are also several agents that monitor Web pages for any
modification, such as URL-minder. However, many of the changes detected may be
irrelevant. Also, the user must still discover the Web sites to be monitored.
Although technical discovery methods are valuable, they neglect perhaps the
single most important method of discovery that people rely on -- other people.
Social resource discovery assumes the existence of others, who have located and
evaluated relevant resources. Opinions of like-minded individuals are also
assumed to have more discriminatory value than an automated evaluation. The
goal of social resource discovery systems is to aggregate and share the fruits
of individual activity and knowledge.
Relatively few social resource discovery systems currently exist. The Pointers
system [3] facilitates the distribution of links to resources with accompanying
context. While the benefits to a pointer's recipient seem clear, the system
relies on a provider's desire to be helpful that may not always exist.
Our work is closer in emphasis to Ringo [6] and GroupLens [5]. GroupLens uses
newsgroup article ratings to predict an article's relevance to a given user.
Because the weights assigned to other's opinions are proportionate to their
past correlation with a user's opinions, there is some benefit for users to
offer opinions. However, in both systems, the emphasis is on providing
collaborative effort, for unclear personal gain.
The Do-I-Care agent attempts to avoid free-loading by exploiting the naturally
occurring individual effort in a way that facilitates information sharing
without incurring any significant additional costs. It is similar in its
incentives to Answer Garden [1].
Do-I-Care was designed to help users discover interesting changes on the
Web, using both technical and social means. Do-I-Care agents automate periodic
visits to selected pages to detect interesting changes on behalf of
individuals. They use machine learning to identify by example what changes are
interesting, and how often they occur. It is in users' best interests to keep
their personal agents informed about relevant pages and about the quality of
reported changes. In so doing, they can automatically contribute to the common
good, because Do-I-Care agents can also share information with other agents.
Because of space limitations, we can only briefly discuss the system
architecture. Each Do-I-Care agent must:
- Periodically visit a user-defined list of target pages.
- Identify any changes since the last time it visited.
- Decide whether the changes were interesting
- Notify the user if the change was interesting.
- Accept relevance feedback on the interestingness of the change and
timeliness of the notification.
- Facilitate information sharing and collaboration between individuals and
groups.
We assume that Web pages generally change incrementally and that
the user has a list of Web pages that will have interesting information. By
limiting the search to a small subset of Web pages specifically selected
because they are likely to contain interesting information, we can
greatly improve precision. By accepting or rejecting Do-I-Care's suggestions,
the user refines the types of changes that are interesting.
Since users provide relevance feedback, we can view the process of determining
whether a change is interesting as a classification task, and the process of
learning a user profile is a supervised learning task. We represent each change
by a relatively small number of features. We find words that are informative
by computing the mutual information [4] between the presence of a word in the
change and the relevance feedback. In addition, we use other change
attributes, such as size and the number of links. A simple Bayesian classifier
[2] then determines whether the change is interesting.
Because different pages may be interesting for different reasons, users
may maintain multiple Do-I-Care agents specialized on particular topics or
kinds of Web pages. For example, the criteria for an interesting journal
announcement may be different from that for a news item.
Once an agent spots an interesting change, the user is notified by e-mail, and
the change is appended to the agent's associated Web page. This Web page is
also used for relevance feedback and collaborative activity.
Since all changes are reported through an agent's Web page, agents can use
information from other agent's pages for their own change notifications. The
technical mechanism is simple. Users who wish to allow their agent's Web page
to be cascaded simply notify others of its existence. Cascaded agents' owners
continue to use their agents as before, but now they are also sharing both
their lists of interesting sites and their opinions about changes with others.
Agents can be cascaded any number of times, allowing the creation of group or
organizational "new and interesting" pages.
Users of cascaded agents benefit by leveraging the work and judgment of
trusted peers. Gathering changes from cascaded agents increases recall, while
exploiting other's filtering increases precision. There is little social cost,
since no additional work is generated for anyone. Thereby, the perceived costs
and benefits favor collaboration.
Do-I-Care attempts to solve two post-discovery problems: when should a user
revisit a known site for new information, and how does a user share new
information with others who may be interested. These post-discovery chores are
laborious, suffer from inequitable distribution of costs and benefits, and are
not both addressed by existing systems. By providing a simple technical
mechanism, we have found a collaboration mechanism that provides for group
effort through individual incentives.
1. Ackerman, Mark S. Answer Garden: A Tool for Growing Organizational Memory.
Dissertation, Massachusetts Institute of Technology, 1993.
2. Duda, R., and P. Hart. Pattern classification and scene analysis. New
York: John Wiley and Sons, 1973.
3. Maltz, D., and K. Ehrlich. Pointing The Way: Active Collaborative
Filtering. CHI 95: 202-209.
4. Quinlan, J. R. Induction of decision trees. Machine Learning, 1
(1), 1986: 81-106.
5. Resnick, P., N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens.
Proceedings of CSCW 94: 175-186
6. Shardanand, Upendra, and Pattie Maes. Social Information Filtering.
Proceedings of CHI'95: 210-217.