automatic query hardness estimation

Jensen-Shannon divergence

By appropriately employing Jensen-Shannon divergence to measure the “diversity” of the returned results, we demonstrate a methodology for predicting query difficulty whose performance exceeds existing state-of-the-art techniques on TREC collections, often remarkably so.

Query Hardness. We consider the issue of query performance, and we propose a novel method for automatically predicting the difficulty of a query. Unlike a number of existing techniques which are based on examining the ranked lists returned in response to perturbed versions of the query with respect to the given collection or perturbed versions of the collection with respect to the given query, our technique is based on examining the ranked lists returned by multiple scoring functions (retrieval engines) with respect to the given query and collection. Below is a table summarizing previous work on query hardness.

Results

On the right, there are displayed base line results obtained previously by different researchers

Below: JS query hardness prediction results using 10 input systems for TREC8. Each dot in these scatter plots corresponds to a query. The x-axis is actual query hardness as measured by query average AP (left), query median AP (center), and median-system AP (right). The y-axis is the Jensen-Shannon divergence computed over the ranked results returned by five randomly chosen systems for that query.

Hardness Hypothesis.

In essence, we propose that the results returned by multiple retrieval engines will be relatively similar for ``easy'' queries but more diverse for ``difficult'' queries.