RMIT University

Efficient and Effective Higher Order Proximity Modeling

conference contribution
posted on 2024-10-31, 20:06 authored by Xiaolu Lu, Alistair Moffat, Shane Culpepper
Bag-of-words retrieval models are widely used and provide a robust trade-off between efficiency and effectiveness. These models often make simplifying assumptions about relations between query terms, and treat term statistics independently. However, query terms are rarely independent, and previous work has repeatedly shown that term dependencies can be critical to improving the effectiveness of ranked retrieval results. Among all term-dependency models, the Markov Random Field (MRF) model [Metzler and Croft, SIGIR, 2005] has received the most attention in recent years. Despite clear effectiveness improvements, these models are not deployed in performance-critical applications because of their potentially high computational costs. As a result, bigram models are generally considered to be the best compromise between full term dependence and term-independent models such as BM25. Here we provide further evidence that term-dependency features not captured by bag-of-words models can reliably improve retrieval effectiveness. We also present a new variation on the highly effective MRF model that relies on a BM25-derived potential. The benefit of this approach is that it is built from feature functions which require no higher-order global statistics. We empirically show that our new model reduces retrieval costs by up to 60%, with no loss in effectiveness compared to previous approaches.

Funding

Beyond keyword search for ranked document retrieval

Australian Research Council

Efficient and effective ad-hoc search using structured and unstructured geospatial information

Australian Research Council

Data retrieval from massive information structures

Australian Research Council

History

Start page

21

End page

30

Total pages

10

Outlet

Proceedings of the 6th ACM International Conference on the Theory of Information Retrieval (ICTIR 2016)

Name of conference

ICTIR 2016

Publisher

New York, United States

Place published

New York, NY

Start date

2016-09-12

End date

2016-09-16

Language

English

Copyright

© 2016 Copyright is held by the owner/author(s). Publication rights licensed to Association for Computing Machinery (ACM)

Former Identifier

2006069070

Esploro creation date

2020-06-22

Fedora creation date

2016-12-20
