A Nugget-based Information Retrieval Evaluation Paradigm
Last Modified: October 13, 2017
Javed A. Aslam
Evaluating information retrieval systems, such as search engines, is
critical to their effective development. Current performance
evaluation methodologies are generally variants of the Cranfield
paradigm, which relies on effectively complete, and thus prohibitively
expensive, relevance judgment sets: tens to hundreds of thousands of
documents must be judged by human assessors for relevance with respect
to dozens to hundreds of user queries, at great cost both in time and
expense.
The project instead investigates a new information retrieval
evaluation paradigm based on nuggets. The thesis is that while it is
likely impossible to find all relevant documents for a query with
respect to web-scale and/or dynamic collections, it is much more
tractable to find all or nearly all relevant information, with which
one can then perform effective and reusable evaluation, at scale and
with ease. These atomic units of relevant information are referred to
as "nuggets", and one instantiation of these nuggets is simply the
sentence or short passage that causes a judge to deem a document
relevant at the time of document assessment. At evaluation time,
relevance assessments are dynamically created for documents based on
the quantity and quality of relevant information found in the
documents retrieved. This new evaluation paradigm is inherently
scalable and permits the use of all standard measures of retrieval
performance, including those involving graded relevance judgments,
novelty, diversity, and so on; it further permits new kinds of
evaluations not heretofore possible.
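To make the idea concrete, the following is a minimal sketch of how a relevance score might be derived at evaluation time from the nuggets a document contains. It is an illustration under stated assumptions, not the project's implementation: the `Nugget` class, its `weight` field, the token-overlap matcher in `matches`, the 0.8 threshold, and the weighted-fraction score in `inferred_relevance` are all hypothetical choices made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Nugget:
    """An atomic unit of relevant information for one query (hypothetical type)."""
    text: str      # sentence or short passage that made a judge deem a document relevant
    weight: float  # assumed importance of the nugget, standing in for its "quality"


def tokens(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def matches(nugget: Nugget, document: str, threshold: float = 0.8) -> bool:
    """Assumed matcher: the nugget counts as present when at least
    `threshold` of its distinct tokens occur somewhere in the document."""
    nugget_tokens = tokens(nugget.text)
    if not nugget_tokens:
        return False
    overlap = len(nugget_tokens & tokens(document))
    return overlap / len(nugget_tokens) >= threshold


def inferred_relevance(document: str, nuggets: list[Nugget]) -> float:
    """Score in [0, 1]: total weight of matched nuggets over total nugget weight."""
    total = sum(n.weight for n in nuggets)
    if total == 0:
        return 0.0
    matched = sum(n.weight for n in nuggets if matches(n, document))
    return matched / total


# Illustrative data only; these nuggets and documents are invented for the example.
nuggets = [
    Nugget("the bridge opened in 1937", weight=2.0),
    Nugget("it spans the Golden Gate strait", weight=1.0),
]
doc_a = "Completed after four years of work, the bridge opened in 1937 to pedestrians."
doc_b = "A recipe for sourdough bread."

for name, doc in [("doc_a", doc_a), ("doc_b", doc_b)]:
    print(name, round(inferred_relevance(doc, nuggets), 3))
```

Because the score is computed from the nuggets rather than from per-document judgments, any document can be scored, including one that no assessor ever saw. The resulting values could then be thresholded or binned into graded relevance labels and fed to a standard retrieval measure.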
Past and Affiliated Personnel
- Jesse Anderton (graduate student)
- Maryam Bashir (graduate student)
- Peter Golbus (graduate student)
- Shahzad Rajput (graduate student)
Publications and Follow-on Work
A Comprehensive Method for Automating Test Collection Creation and Evaluation for Retrieval and Summarization Systems
PhD Thesis, College of Computer and Information Science, Northeastern University, 2017.
A Study of Realtime Summarization Metrics
In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 2125-2130. ACM Press, 2016.
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences?
In Proceedings of the Seventh International Workshop on Evaluating Information Access (EVIA), pages 29-32. National Institute of Informatics (NII), 2016.
TREC 2015 Temporal Summarization Track Overview
In Proceedings of the Twenty-Fourth Text REtrieval Conference, NIST Special Publication SP 500-319, 2015.
Overview of the NTCIR-11 MobileClick Task
In Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, National Institute of Informatics (NII), 2014.
TREC 2014 Temporal Summarization Track Overview
In Proceedings of the Twenty-Third Text REtrieval Conference, NIST Special Publication SP 500-308, 2014.
Overview of the NTCIR-10 1CLICK-2 Task
In Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, National Institute of Informatics (NII), 2013.
Exploring Semi-automatic Nugget Extraction for Japanese One Click Access Evaluation
In Proceedings of the 36th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 749-752. ACM Press, 2013.
Live Nuggets Extractor: A Semi-automated System for Text Extraction and Test Collection Creation
In Proceedings of the 36th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1087-1088. ACM Press, 2013.
TREC 2013 Temporal Summarization
In Proceedings of the Twenty-Second Text REtrieval Conference, NIST Special Publication SP 500-302, 2013.
Constructing Test Collections by Inferring Document Relevance via Extracted Relevant Information
In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), pages 145-154. ACM Press, 2012.
TREC Temporal Summarization Track
NTCIR MobileClick Track
NTCIR 1CLICK-2 Track
Acknowledgment and Disclaimer
This material is based upon work supported by the National Science
Foundation under Grant No. IIS-1256172. Any opinions, findings and
conclusions or recommendations expressed in this material are those of
the author(s) and do not necessarily reflect the views of the National
Science Foundation (NSF).