RMIT University

Extending BM25 with multiple query operators

conference contribution
posted on 2024-10-31, 21:20 authored by Roi Blanco Gonzalez, Paolo Boldi
Traditional probabilistic relevance frameworks for information retrieval refrain from taking positional information into account, owing to the difficulty of developing a sound model while avoiding an explosion in the number of parameters. Nonetheless, the well-known BM25F extension of the successful Okapi ranking function can be seen as an embryonic attempt in that direction. In this paper, we proceed along the same line, defining the notion of virtual region: a virtual region is a part of the document that, like a BM25F field, can provide evidence of the relevance of the document, with the strength of that evidence controlled by a tunable weighting parameter. Unlike BM25F fields, though, virtual regions are generated implicitly by applying suitable (usually, but not necessarily, position-aware) operators to the query. This technique fits nicely in the eliteness model behind BM25 and provides a principled explanation of BM25F; it specializes to BM25(F) for some trivial operators, but has a much more general appeal. Our experiments (both on standard collections, such as TREC, and on Web-like repertoires) show that the use of virtual regions is beneficial for retrieval effectiveness.
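To make the idea of operator-generated regions concrete, the following is a minimal Python sketch, not the formulation used in the paper. It assumes two hypothetical operators (a single-term match and an ordered proximity window over adjacent query terms), a hypothetical weight per operator, a plain BM25-style saturation without length normalisation, and an arbitrary choice of averaging the two term IDFs for a pair. All function and parameter names are illustrative.

```python
def saturate(x, k1=1.2):
    """BM25-style saturation of a weighted match count."""
    return x / (k1 + x)


def count_term(tokens, term):
    """Trivial operator: number of occurrences of a single term."""
    return sum(1 for t in tokens if t == term)


def count_window(tokens, a, b, width=3):
    """Position-aware operator (assumed for illustration): number of
    occurrences of `a` that are followed by `b` within `width` tokens."""
    hits = 0
    for i, t in enumerate(tokens):
        if t == a and b in tokens[i + 1 : i + 1 + width]:
            hits += 1
    return hits


def score(tokens, query, weights, idf, k1=1.2):
    """Sum saturated, weighted match counts over the regions that each
    operator induces on the document. Missing IDF values default to 1.0."""
    total = 0.0
    # Single-term operator: ordinary term evidence.
    for term in query:
        x = weights["term"] * count_term(tokens, term)
        total += idf.get(term, 1.0) * saturate(x, k1)
    # Window operator over adjacent query-term pairs: the matches play
    # the role of a virtual region with its own tunable weight.
    for a, b in zip(query, query[1:]):
        x = weights["window"] * count_window(tokens, a, b)
        pair_idf = (idf.get(a, 1.0) + idf.get(b, 1.0)) / 2  # assumption
        total += pair_idf * saturate(x, k1)
    return total
```

With the window weight set to 0 the sketch falls back to term-only scoring, which loosely mirrors the remark that the model specializes to BM25(F) for trivial operators; a positive window weight rewards documents in which the query terms occur close together and in order.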

Related Materials

  1. DOI - Is published in 10.1145/2348283.2348406
  2. ISBN - Is published in 9781450314725 (urn:isbn:9781450314725)

Start page

921

End page

930

Total pages

10

Outlet

Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval 2012

Name of conference

SIGIR '12

Publisher

ACM

Place published

United States

Start date

2012-08-12

End date

2012-08-16

Language

English

Copyright

© 2012 ACM

Former Identifier

2006077402

Esploro creation date

2020-06-22

Fedora creation date

2017-08-28
