the ultimate federated search test collection

The Collection

"FedWeb Greatest Hits" is a large test collection designed for research in Federated Web Search.

It is based on the data used in the TREC Federated Web Search Track (FedWeb'13 and FedWeb'14). The dataset contains a large number of search results (both the original search snippets and the target pages) for sampled queries as well as for a set of test topics. FedWeb Greatest Hits also contains additional data, including previously unreleased search results (for extra topics), result screenshots, a large number of graded relevance judgments for snippets and pages, annotated duplicate pages, and evaluation scripts.

The collection is available to researchers. To use it, complete the application form, after which you can download it.


The collection was first introduced in the following WWW'15 paper. Please cite it if you report experiments that use the FedWeb Greatest Hits collection:

T. Demeester, D. Trieschnigg, K. Zhou, D. Nguyen, and D. Hiemstra. FedWeb Greatest Hits: Presenting the New Test Collection for Federated Web Search. In 24th International World Wide Web Conference (WWW 2015), 2015.
(pdf, bibtex)


The FedWeb Greatest Hits collection was created by:

  • Dolf Trieschnigg, University of Twente (Enschede, The Netherlands)
  • Thomas Demeester, Ghent University (Ghent, Belgium)
  • Ke Zhou, Yahoo! Research (London, United Kingdom)
  • Dong Nguyen, University of Twente (Enschede, The Netherlands)
  • Djoerd Hiemstra, University of Twente (Enschede, The Netherlands)

Please let us know which of your papers make use of the collection, and we will list them on this site.