RMIT University

Optimization strategies for complex queries

conference contribution
posted on 2024-10-31, 15:33 authored by Trevor Strohman, Howard Turtle, Bruce Croft
Previous research into the efficiency of text retrieval systems has dealt primarily with methods that consider inverted lists in sequence; these methods are known as term-at-a-time methods. However, the literature for optimizing document-at-a-time systems remains sparse. We present an improvement to the max score optimization, which is the most efficient known document-at-a-time scoring method. Like max score, our technique, called term bounded max score, is guaranteed to return exactly the same scores and documents as an unoptimized evaluation, which is particularly useful for query model research. We simulated our technique to explore the problem space, then implemented it in Indri, our large-scale language modeling search engine. Tests with the GOV2 corpus on title queries show our method to be 23% faster than max score alone, and 61% faster than our document-at-a-time baseline. Our optimized query times are competitive with conventional term-at-a-time systems on this year's TREC Terabyte task.
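
The abstract's claim that max score returns exactly the same results as an unoptimized evaluation can be illustrated with a minimal sketch of the general max score idea. This is not the paper's term bounded max score and not Indri code. It assumes additive, non-negative term scores where a missing term contributes 0, non-empty posting lists sorted by document id, and k of at least 1. The function names and the `POSTINGS` data are invented for illustration.

```python
import heapq
from bisect import bisect_left

# Hypothetical toy index: term -> postings sorted by doc id, as (doc_id, score).
POSTINGS = {
    "optimization": [(1, 1.0), (2, 0.5), (4, 1.0), (7, 0.5), (9, 1.0)],
    "query": [(2, 2.0), (3, 1.5), (7, 2.0), (8, 0.5)],
    "complex": [(3, 4.0), (5, 3.5), (7, 4.0)],
}


def exhaustive_top_k(postings, k):
    """Unoptimized baseline: score every document, then rank."""
    totals = {}
    for plist in postings.values():
        for doc, score in plist:
            totals[doc] = totals.get(doc, 0.0) + score
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


def max_score_top_k(postings, k):
    """Document-at-a-time evaluation with max score style pruning."""
    # Order lists by their score upper bound, smallest first.
    lists = sorted(postings.values(), key=lambda pl: max(s for _, s in pl))
    docs = [[d for d, _ in pl] for pl in lists]
    prefix, running = [], 0.0
    for pl in lists:
        running += max(s for _, s in pl)
        prefix.append(running)  # prefix[i] = sum of bounds of lists 0..i

    heap = []  # min-heap of (score, -doc_id); root is the current k-th best
    cursors = [0] * len(lists)
    first_essential = 0  # lists below this index cannot produce a winner alone

    while True:
        # Candidates come only from essential lists, in increasing doc order.
        heads = [docs[i][cursors[i]]
                 for i in range(first_essential, len(lists))
                 if cursors[i] < len(docs[i])]
        if not heads:
            break
        doc = min(heads)
        score = 0.0
        for i in range(first_essential, len(lists)):
            c = cursors[i]
            if c < len(docs[i]) and docs[i][c] == doc:
                score += lists[i][c][1]
                cursors[i] = c + 1

        threshold = heap[0][0] if len(heap) == k else float("-inf")
        pruned = False
        # Add non-essential lists, largest bound first, stopping early when
        # even the remaining upper bounds cannot beat the threshold.
        for i in range(first_essential - 1, -1, -1):
            if score + prefix[i] <= threshold:
                pruned = True
                break
            pos = bisect_left(docs[i], doc)
            if pos < len(docs[i]) and docs[i][pos] == doc:
                score += lists[i][pos][1]

        if not pruned:
            if len(heap) < k:
                heapq.heappush(heap, (score, -doc))
            elif (score, -doc) > heap[0]:
                heapq.heapreplace(heap, (score, -doc))

        # Lists whose combined bound cannot exceed the threshold stop
        # contributing candidates.
        while (len(heap) == k and first_essential < len(lists)
               and prefix[first_essential] <= heap[0][0]):
            first_essential += 1

    ranked = [(-neg_doc, score) for score, neg_doc in heap]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


print(max_score_top_k(POSTINGS, 2))
```

On this toy data the pruned evaluation never scores documents 4, 8, and 9 when k is 2, yet its ranking matches the exhaustive one. Ties are broken by lower document id in both functions so that the two rankings are directly comparable.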

Related Materials

  1. DOI - Is published in 10.1145/1076034.1076074
  2. ISBN - Is published in 1595930345 (urn:isbn:1595930345)

Start page

219

End page

225

Total pages

7

Outlet

Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2005)

Name of conference

SIGIR 2005

Publisher

ACM

Place published

New York, USA

Start date

2005-08-15

End date

2005-08-19

Language

English

Copyright

© 2005 ACM

Former Identifier

2006024169

Esploro creation date

2020-06-22

Fedora creation date

2013-02-19
