Reading 03-01: Filtered document retrieval with frequency-sorted indexes
Commentary from: A. Moffat, J. Zobel, and D. Hawking, “Recommended reading for IR research students,” SIGIR Forum, vol. 39, no. 2, pp. 3–14, 2005.
* Commentary: This paper (and the preliminary version of it by the first author in SIGIR’94) took up and ran with the idea of structuring the index to handle ranked queries as the number one goal, rather than Boolean ones. This simple change allowed a range of efficiency improvements, including dynamic pruning techniques. Other work then followed, suggesting other non-document based orderings for inverted lists. Anyone studying IR implementation needs to visit this paper, as the starting point for a whole thread of logical development. [...] meets the same set of needs as those addressed by IR more generally, that is, the management of and access to large document sets in a meaningful and useful manner.
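The idea in the commentary, ordering each inverted list for ranked queries so that processing can stop early, can be illustrated with a minimal Python sketch. This is not the algorithm from the paper: the single `min_tf` cutoff, the TF-IDF style weighting, and the function names `build_index` and `ranked_query` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Map each term to postings (doc_id, tf), sorted by decreasing tf."""
    counts = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            counts[term][doc_id] = counts[term].get(doc_id, 0) + 1
    # Frequency-sorted order: highest within-document frequency first,
    # ties broken by document id so the order is deterministic.
    return {
        term: sorted(tf_by_doc.items(), key=lambda p: (-p[1], p[0]))
        for term, tf_by_doc in counts.items()
    }


def ranked_query(index, num_docs, query, min_tf=1):
    """Score documents, skipping the low-frequency tail of each list.

    min_tf is a hypothetical pruning cutoff, not a parameter from the paper.
    """
    accumulators = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, [])
        if not postings:
            continue
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings:
            if tf < min_tf:
                # The list is sorted by decreasing tf, so every remaining
                # posting is also below the cutoff and can be skipped.
                break
            accumulators[doc_id] += (1 + math.log(tf)) * idf
    return sorted(accumulators.items(), key=lambda p: (-p[1], p[0]))
```

With a document-ordered list the loop would have to scan every posting to find the high-frequency ones; with the frequency-sorted list a single `break` discards the whole tail.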
Online Copy
* www3.interscience.wiley.com—issue
Local Copy
* Kumar2000
Reading 03-02: The WebGraph framework I: compression techniques
Abstract:
* Studying web graphs is often difficult due to their large size. Recently, several proposals have been published about various techniques that allow to store a web graph in memory in a limited space, exploiting the inner redundancies of the web. The WebGraph framework is a suite of codes, algorithms and tools that aims at making it easy to manipulate large web graphs. This paper presents the compression techniques used in WebGraph, which are centred around referentiation and intervalisation (which in turn are dual to each other). WebGraph can compress the WebBase graph (118 Mnodes, 1 Glinks) in as little as 3.08 bits per link, and its transposed version in as little as 2.89 bits per link.
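As a rough illustration of the two techniques named in the abstract, the Python sketch below encodes one successor list by referentiation (a copy mask over another node's list plus the extra successors) and then applies intervalisation (runs of consecutive ids stored as start and length) with gap coding of what remains. The function names, the `min_len` parameter, and the sample lists are assumptions for illustration; the real framework additionally writes these numbers with compact bit-level codes, which is omitted here.

```python
def reference_encode(successors, reference):
    """Describe a successor list as a copy mask over a reference list plus extras."""
    wanted = set(successors)
    copy_mask = [1 if r in wanted else 0 for r in reference]
    copied = {r for r in reference if r in wanted}
    extras = [s for s in successors if s not in copied]
    return copy_mask, extras


def reference_decode(copy_mask, extras, reference):
    """Rebuild the sorted successor list from the mask, the extras, and the reference."""
    copied = [r for r, bit in zip(reference, copy_mask) if bit]
    return sorted(copied + extras)


def split_intervals(values, min_len=3):
    """Split a sorted list into (start, length) runs of consecutive ids and residuals."""
    intervals, residuals = [], []
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[j + 1] == values[j] + 1:
            j += 1
        run = j - i + 1
        if run >= min_len:
            intervals.append((values[i], run))
        else:
            residuals.extend(values[i:j + 1])
        i = j + 1
    return intervals, residuals


def to_gaps(values):
    """Keep the first value and replace each later one by its difference from the previous."""
    return [v - p for v, p in zip(values, [0] + values[:-1])]
```

For example, with the reference list `[13, 15, 16, 17, 18, 19, 23, 24, 203]` and the successor list `[15, 16, 17, 22, 23, 24, 315, 316, 317, 3041]`, the copy mask is `[0, 1, 1, 1, 0, 0, 1, 1, 0]` and the extras are `[22, 315, 316, 317, 3041]`. Intervalising the extras gives the single interval `(315, 3)` and the residuals `[22, 3041]`, whose gaps are `[22, 3019]`.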
Online Copy
* doi.acm.org—988672.988752
Local Copy
* Bharat
Reading 03-03: Chapter 4
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
* Online copy: www-csli.stanford.edu—information-retrieval-book.html
* Local copy: www-csli.stanford.edu—information-retrieval-book.html