RMIT University

Can Deep Effectiveness Metrics Be Evaluated Using Shallow Judgment Pools?

conference contribution
posted on 2024-10-31, 20:52 authored by Xiaolu Lu, Alistair Moffat, Shane Culpepper
Increasing test collection sizes and limited judgment budgets create measurement challenges for IR batch evaluations, challenges that are greater when using deep effectiveness metrics than when using shallow metrics, because of the increased likelihood that unjudged documents will be encountered. Here we study the problem of metric score adjustment, with the goal of accurately estimating system performance when using deep metrics and limited judgment sets, assuming that dynamic score adjustment is required per topic due to the variability in the number of relevant documents. We seek to induce system orderings that are as close as possible to the orderings that would arise if full judgments were available. Starting with depth-based pooling, and no prior knowledge of sampling probabilities, the first phase of our two-stage process computes a background gain for each document based on rank-level statistics. The second stage then accounts for the distributional variance of relevant documents. We also exploit the frequency statistics of pooled relevant documents in order to determine a threshold for dynamically identifying the set of topics to be adjusted. Taken together, our results show that: (i) better score estimates can be achieved when compared to previous work; (ii) by setting a global threshold, we are able to adapt our methods to different collections; and (iii) the proposed estimation methods reliably approximate the system orderings achieved when many more relevance judgments are available. We also consider pools generated by a two-strata sampling approach.
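The first stage described above, a background gain for unjudged documents derived from rank-level statistics, can be illustrated with a minimal sketch. This is not the authors' exact formulation; the function names, the use of RBP as the deep metric, and the choice of the fraction of judged-relevant documents per rank as the background estimate are all illustrative assumptions.

```python
def rank_level_background(pooled_judgments, depth):
    """Hypothetical background-gain estimate: the fraction of judged
    documents at each rank (across pooled runs) that were relevant.
    Each run is a list of per-rank judgments: 1 (relevant),
    0 (non-relevant), or None (unjudged)."""
    totals = [0] * depth
    relevant = [0] * depth
    for run in pooled_judgments:
        for i, judgment in enumerate(run[:depth]):
            if judgment is not None:
                totals[i] += 1
                relevant[i] += judgment
    return [relevant[i] / totals[i] if totals[i] else 0.0
            for i in range(depth)]

def adjusted_rbp(judgments, background, p=0.8):
    """Rank-biased precision where an unjudged document (None) at
    rank i contributes the rank's background gain instead of zero."""
    score = 0.0
    for i, judgment in enumerate(judgments):
        gain = background[i] if judgment is None else judgment
        score += gain * (p ** i)
    return (1 - p) * score
```

Under this sketch, substituting a per-rank background gain for unjudged documents raises the score estimate above the pessimistic "unjudged equals non-relevant" baseline, which is one way limited judgment pools can bias deep-metric scores downward.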

Funding

Trajectory data processing: Spatial computing meets information retrieval

Australian Research Council


History

Start page

35

End page

44

Total pages

10

Outlet

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval 2017

Name of conference

SIGIR 2017

Publisher

ACM

Place published

New York, New York

Start date

2017-08-07

End date

2017-08-11

Language

English

Copyright

© 2017 Copyright held by the owner/author(s).

Former Identifier

2006076326

Esploro creation date

2020-06-22

Fedora creation date

2017-08-15
