RMIT University

Query Driven Algorithm Selection in Early Stage Retrieval

conference contribution
posted on 2024-10-31, 22:04 authored by Joel MacKenzie, Shane Culpepper, Roi Blanco, Matt Crane, Charles Clarke, Jimmy Lin
Large-scale retrieval systems often employ cascaded ranking architectures, in which an initial set of candidate documents is iteratively refined and re-ranked by increasingly sophisticated and expensive ranking models. In this paper, we propose a unified framework for predicting a range of performance-sensitive parameters based on minimizing end-to-end effectiveness loss. The framework does not require relevance judgments for training, is amenable to predicting a wide range of parameters, allows for fine-tuned efficiency-effectiveness trade-offs, and can be easily deployed in large-scale search systems with minimal overhead. As a proof of concept, we show that the framework can accurately predict a number of performance parameters on a query-by-query basis, allowing efficient and effective retrieval, while simultaneously minimizing the tail latency of an early-stage candidate generation system. On the 50 million document ClueWeb09B collection, and across 25,000 queries, our hybrid system can achieve superior early-stage efficiency to fixed parameter systems without loss of effectiveness, and allows more finely grained efficiency-effectiveness trade-offs across the multiple stages of the retrieval system.
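
The cascaded architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the two scoring functions are toy stand-ins for a cheap first-stage ranker and a more expensive re-ranker, and `predict_k` is a hypothetical placeholder for a learned per-query predictor of the candidate-pool size. All names and thresholds are assumptions made for illustration only.

```python
from typing import List, Sequence


def cheap_score(query: str, doc: str) -> float:
    """Stage 1 stand-in: number of query terms that occur in the document."""
    doc_terms = set(doc.lower().split())
    return float(sum(term in doc_terms for term in query.lower().split()))


def costly_score(query: str, doc: str) -> float:
    """Stage 2 stand-in: fraction of document terms that are query terms."""
    terms = doc.lower().split()
    if not terms:
        return 0.0
    query_terms = set(query.lower().split())
    return sum(t in query_terms for t in terms) / len(terms)


def predict_k(query: str, small: int = 2, large: int = 4) -> int:
    """Hypothetical per-query predictor (an assumption, not from the paper):
    queries with more than two terms get the larger candidate pool."""
    return large if len(query.split()) > 2 else small


def cascade(query: str, docs: Sequence[str]) -> List[str]:
    """Generate k candidates with the cheap scorer, then re-rank them."""
    k = predict_k(query)
    candidates = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:k]
    return sorted(candidates, key=lambda d: costly_score(query, d), reverse=True)


docs = [
    "web search engines",
    "search",
    "cooking pasta recipes",
    "web search and data mining conference",
]
short_result = cascade("web search", docs)
long_result = cascade("web search data mining", docs)
```

In this sketch the short query passes only two candidates to the second stage, while the longer query passes four, which mirrors the idea of choosing a performance-sensitive parameter query by query rather than fixing it for the whole system.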

Funding

Trajectory data processing: Spatial computing meets information retrieval

Australian Research Council


Related Materials

  1. DOI - Is published in 10.1145/3159652.3159676
  2. ISBN - Is published in 9781450355810 (urn:isbn:9781450355810)

Start page

396

End page

404

Total pages

9

Outlet

Proceedings of the 11th International Conference on Web Search and Data Mining

Name of conference

WSDM 2018

Publisher

ACM

Place published

United States

Start date

2018-02-05

End date

2018-02-09

Language

English

Copyright

© the authors

Former Identifier

2006083034

Esploro creation date

2020-06-22

Fedora creation date

2018-09-20
