the ultimate federated search test collection

Relevance judgments

The folder meta_data/judgments contains the following files:

  • FW13-double-page-judgments.txt
  • FW13-double-snippet-judgments.txt
  • FW13-single-judgments.txt
  • FW14-single-page-judgments.txt
  • FW14-single-page-judgments-onlineeval.txt

These files contain a number of parameters related to the annotation process. More specifically, each line (below the 2-line header) of FW13-double-page-judgments.txt contains the following fields (in this order):

  1. snippetID (e.g., FW13-e001-7004-01): ID of the considered page
  2. userID1 (judge7): anonymized user ID of the first assessor
  3. label1 (Non): categorical relevance label ('Non', 'Rel', 'HRel', 'Key', 'Nav')
  4. watchedvideo1 (0): 1 if assessor 1 watched (part of) a video, else 0
  5. problem1 (0): 1 if assessor 1 encountered a problem (page not correctly found in crawled screenshots, nor the html results, nor live)
  6. userID2 (judge5): anonymized user ID of the second assessor
  7. label2 (Rel): as for label1, but by the second assessor
  8. watchedvideo2 (0): as for watchedvideo1, but by the second assessor
  9. problem2 (0): as for problem1, but by the second assessor

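The field layout above can be turned into a small parser. The sketch below assumes the fields are whitespace-separated and uses the example values given above; the exact delimiter (e.g. tabs) may differ in the actual files, so adjust the `split` call if needed.

```python
from dataclasses import dataclass

@dataclass
class PageJudgment:
    """One line of FW13-double-page-judgments.txt, field order as documented above."""
    snippet_id: str
    user_id1: str
    label1: str          # one of 'Non', 'Rel', 'HRel', 'Key', 'Nav'
    watched_video1: bool
    problem1: bool
    user_id2: str
    label2: str
    watched_video2: bool
    problem2: bool

def parse_line(line: str) -> PageJudgment:
    # Whitespace delimiter is an assumption; the README does not specify it.
    f = line.split()
    return PageJudgment(f[0], f[1], f[2], f[3] == "1", f[4] == "1",
                        f[5], f[6], f[7] == "1", f[8] == "1")

# Example line assembled from the per-field examples above (not a real file line)
j = parse_line("FW13-e001-7004-01 judge7 Non 0 0 judge5 Rel 0 0")
print(j.label1, j.label2)  # Non Rel
```

Remember to skip the 2-line header before feeding lines to `parse_line`.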
FW13-double-snippet-judgments.txt has a similar format, except that 'watchedvideo1' and 'watchedvideo2' are missing, and the labels are snippet labels instead of page labels. Snippet labels denote the page relevance as estimated by the assessor after observing only the snippet (i.e., without seeing the page). Apart from the possible page labels ('Non', 'Rel', 'HRel', 'Key', 'Nav'), snippets can have the extra label 'Ans' (if the snippet directly answers the query).
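Since each snippet is judged by two assessors, a natural first statistic is the raw inter-assessor agreement. The sketch below assumes the snippet-file layout described above (the page layout minus the two 'watchedvideo' columns, i.e. snippetID, userID1, label1, problem1, userID2, label2, problem2) and whitespace separation; both are assumptions to verify against the actual files.

```python
def snippet_agreement(lines):
    """Fraction of double-judged snippets where both assessors gave the
    same label. Assumes the field order:
    snippetID userID1 label1 problem1 userID2 label2 problem2
    """
    agree = total = 0
    for line in lines:
        f = line.split()
        total += 1
        agree += f[2] == f[5]  # label1 vs. label2
    return agree / total if total else 0.0

# Synthetic example lines (not real collection data)
sample = [
    "FW13-e001-7004-01 judge7 Non 0 judge5 Non 0",
    "FW13-e001-7004-02 judge7 Rel 0 judge5 Ans 0",
]
print(snippet_agreement(sample))  # 0.5
```

For a chance-corrected measure such as Cohen's kappa, consult the overview papers for the values reported on the full judgment set.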

FW13-single-judgments.txt contains both snippet and page information, for the single reference assessor.

FW14-single-page-judgments.txt contains only page information (as for FedWeb14, no snippets were judged), for the 50 official test topics.

FW14-single-page-judgments-onlineeval.txt is similar, but for the 10 extra online-evaluation topics.

More information on how these judgments were gathered can be found in the respective TREC FedWeb'13 and FedWeb'14 overview papers, or by contacting Thomas Demeester.

Note: The number of judgments in these files is slightly lower than originally announced in the WWW'15 paper introducing the collection. The reason is an additional filtering step that removed judgments for search engines no longer included in the official collection (in particular, judgments for the search engines contributing to the BigWeb engine e200 that were not selected for BigWeb).