RMIT University
Improving the accuracy of system performance estimation by using shards

conference contribution
posted on 2024-11-03, 13:50 authored by Nicola Ferro, Mark Sanderson
We improve the measurement accuracy of retrieval system performance by better modeling the noise present in test collection scores. Our technique draws its inspiration from two approaches: one exploits the variable measurement accuracy of topics; the other randomly splits document collections into shards. We describe and theoretically analyze an ANalysis Of VAriance (ANOVA) model able to capture the effects of topics, systems, and document shards as well as their interactions. Using multiple TREC collections, we empirically confirm the theoretical results in terms of improved estimation accuracy and robustness of the significant differences found. The improvements compared to widely used test collection measurement techniques are substantial. We speculate that our technique works because we do not assume that the topics of a test collection measure performance equally.
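As a rough illustration of the two ingredients the abstract names (random document shards and per-factor effects), the following is a minimal sketch, not the authors' implementation. The function names `split_into_shards` and `main_effects`, the round-robin sharding, and the balanced topic × system × shard score layout are all assumptions made for this example; the paper's actual ANOVA model also includes interaction terms, which are omitted here.

```python
import random


def split_into_shards(doc_ids, n_shards, seed=0):
    """Randomly partition doc_ids into n_shards shards of near-equal size.

    Hypothetical helper: the abstract only says shards are random splits.
    """
    rng = random.Random(seed)
    shuffled = list(doc_ids)
    rng.shuffle(shuffled)
    # Deal documents round-robin so shard sizes differ by at most one.
    return [shuffled[i::n_shards] for i in range(n_shards)]


def main_effects(scores):
    """Estimate the grand mean and main effects in a balanced layout.

    scores[t][s][k] is the score of topic t, system s, on shard k.
    Each effect is the factor-level mean minus the grand mean.
    Interaction effects are deliberately left out of this sketch.
    """
    n_t, n_s, n_k = len(scores), len(scores[0]), len(scores[0][0])
    cells = [(t, s, k) for t in range(n_t) for s in range(n_s) for k in range(n_k)]
    grand = sum(scores[t][s][k] for t, s, k in cells) / len(cells)

    def level_mean(axis, level):
        # Mean over all cells whose index on the given axis equals level.
        vals = [scores[t][s][k] for t, s, k in cells if (t, s, k)[axis] == level]
        return sum(vals) / len(vals)

    topic = [level_mean(0, t) - grand for t in range(n_t)]
    system = [level_mean(1, s) - grand for s in range(n_s)]
    shard = [level_mean(2, k) - grand for k in range(n_k)]
    return grand, topic, system, shard
```

In this sketch a system's estimated effect is its mean score across all topics and shards relative to the grand mean, so differences between systems can be read off the `system` list.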

Funding

Finding answers for complex questions

Australian Research Council

Related Materials

  1. DOI - Is published in 10.1145/3331184.3338062
  2. ISBN - Is published in 9781450361729 (urn:isbn:9781450361729)

Start page

805

End page

814

Total pages

10

Outlet

Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019)

Name of conference

SIGIR 2019

Publisher

Association for Computing Machinery

Place published

United States

Start date

2019-07-21

End date

2019-07-25

Language

English

Copyright

© 2019 Association for Computing Machinery.

Former Identifier

2006106442

Esploro creation date

2021-08-11
