[1] The entire HTML source is currently saved to allow for experimentation with feature extraction methods.

[2] In addition, it contains a function to retrieve and locally store the HTML source of all pages linked from the current page. Syskill & Webert analyzes the HTML source of a page to determine whether the page matches the user's profile. To avoid network transmission overhead during our experiments, we prefetch all pages.
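
The prefetching step described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the system's actual implementation: the names `LinkCollector`, `extract_links`, `fetch_html`, and `prefetch_links`, the hash-based cache file naming, and the choice to skip unreachable pages are all assumptions made for the example.

```python
import hashlib
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs linked from html, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    seen, links = set(), []
    for href in collector.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def fetch_html(url):
    """Default fetcher (assumed): download a page and decode it leniently."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def prefetch_links(page_html, base_url, cache_dir, fetch=fetch_html):
    """Store the full HTML source of every linked page under cache_dir.

    Returns a dict mapping each stored URL to its local file path.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    stored = {}
    for url in extract_links(page_html, base_url):
        # Hypothetical naming scheme: one file per URL, keyed by its SHA-1.
        path = cache / (hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html")
        if not path.exists():
            try:
                path.write_text(fetch(url), encoding="utf-8")
            except OSError:
                continue  # skip pages that cannot be retrieved
        stored[url] = path
    return stored
```

Because the complete source of each page is kept on disk, later experiments can read the cached files instead of going back to the network, and different feature extraction methods can be tried on the same stored pages. The `fetch` parameter is injectable so that the sketch can be exercised without network access.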