A Nugget-based Information Retrieval Evaluation Paradigm

Last Modified: October 13, 2017

Award:   NSF IIS-1256172
PI:   Javed A. Aslam
Institution:   Northeastern University


Evaluating information retrieval systems, such as search engines, is critical to their effective development. Current evaluation methodologies are generally variants of the Cranfield paradigm, which relies on effectively complete, and thus prohibitively expensive, relevance judgment sets: human assessors must judge tens to hundreds of thousands of documents for relevance with respect to dozens to hundreds of user queries, at great cost in both time and money.

The project instead investigates a new information retrieval evaluation paradigm based on nuggets. The thesis is that while it is likely impossible to find all relevant documents for a query over web-scale and/or dynamic collections, it is far more tractable to find all or nearly all relevant information, with which one can then perform effective and reusable evaluation, at scale and with ease. These atomic units of relevant information are referred to as "nuggets", and one instantiation of a nugget is simply the sentence or short passage that causes a judge to deem a document relevant at assessment time. At evaluation time, relevance assessments are created dynamically for retrieved documents, based on the quantity and quality of relevant information each document contains. This new evaluation paradigm is inherently scalable and permits the use of all standard measures of retrieval performance, including those involving graded relevance judgments, novelty, and diversity; it further permits new kinds of evaluations not heretofore possible.
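The dynamic-assessment idea above can be illustrated with a minimal sketch. All names here (Nugget, grade_document, evaluate_ranking) and the use of simple substring containment as the matcher are illustrative assumptions, not the project's actual implementation; a real system would use a more robust semantic match between nuggets and document text.

```python
# Hypothetical sketch of nugget-based dynamic relevance assessment.
# Assumptions: nuggets are short passages with a quality weight, and a
# naive case-insensitive containment test stands in for a real matcher.
from dataclasses import dataclass

@dataclass(frozen=True)
class Nugget:
    text: str      # sentence/passage that led a judge to deem a doc relevant
    weight: float  # quality/importance of this atomic unit of information

def grade_document(doc_text, nuggets):
    """Assign a graded relevance score at evaluation time: the total
    weight of nuggets whose text appears in the document."""
    matched = [n for n in nuggets if n.text.lower() in doc_text.lower()]
    return sum(n.weight for n in matched), matched

def evaluate_ranking(ranked_docs, nuggets):
    """Per-rank gains that also capture novelty: each nugget contributes
    only the first time it is seen in the ranked list."""
    seen, gains = set(), []
    for doc in ranked_docs:
        _, matched = grade_document(doc, nuggets)
        novel = [n for n in matched if n not in seen]
        seen.update(novel)
        gains.append(sum(n.weight for n in novel))
    return gains

# Usage with toy data
nuggets = [Nugget("the capital of France is Paris", 1.0),
           Nugget("Paris has about 2.1 million residents", 0.5)]
docs = ["Geography: the capital of France is Paris.",
        "The capital of France is Paris; Paris has about 2.1 million residents.",
        "Unrelated text about cooking."]
gains = evaluate_ranking(docs, nuggets)
# The second document earns gain only for the nugget not already seen.
```

Because relevance grades are computed from the nugget set rather than fixed per-document labels, the same nugget pool can score any retrieved document, including documents never judged during assessment, which is what makes the paradigm reusable at scale.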


Past and Affiliated Personnel

Publications and Follow-on Work

TREC Temporal Summarization Track

NTCIR MobileClick Track

NTCIR 1Click-2 Track

Acknowledgment and Disclaimer

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1256172. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).