RMIT University

Compact set representation for information retrieval

conference contribution
posted on 2024-10-31, 09:04 authored by Shane Culpepper, A. Moffat
Conjunctive Boolean queries are a fundamental operation in web search engines. These queries can be reduced to the problem of intersecting ordered sets of integers, where each set represents the documents containing one of the query terms. But there is tension between the desire to store the lists effectively, in a compressed form, and the desire to carry out intersection operations efficiently, using non-sequential processing modes. In this paper we evaluate intersection algorithms on compressed sets, comparing them to the best non-sequential array-based intersection algorithms. By adding a simple, low-cost, auxiliary index, we show that compressed storage need not hinder efficient and high-speed intersection operations.
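As an illustration of the idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each posting list is stored as d-gaps (differences between consecutive document identifiers) and that the auxiliary index is simply every k-th document identifier kept uncompressed. The names `SampledList`, `encode_gaps`, `intersect`, and the parameter `k` are hypothetical, and the gaps are held as plain Python integers rather than in a real byte or bit code.

```python
from bisect import bisect_right


def encode_gaps(docs):
    """Convert a strictly increasing list of document ids into d-gaps."""
    gaps, prev = [], 0
    for d in docs:
        gaps.append(d - prev)
        prev = d
    return gaps


class SampledList:
    """Gap-encoded posting list with a sampled auxiliary index (assumed layout)."""

    def __init__(self, docs, k=4):
        self.k = k
        self.gaps = encode_gaps(docs)
        # Auxiliary index: the document id at positions 0, k, 2k, ...
        self.samples = docs[::k]

    def contains(self, target):
        # Binary search over the samples picks the only block that can hold target.
        block = bisect_right(self.samples, target) - 1
        if block < 0:
            return False
        value = self.samples[block]
        start = block * self.k
        end = min(start + self.k, len(self.gaps))
        # Decode gaps sequentially, but only inside the selected block.
        for pos in range(start + 1, end):
            if value >= target:
                break
            value += self.gaps[pos]
        return value == target


def intersect(short_docs, long_list):
    """Conjunctive (AND) query: keep ids from the shorter list found in the longer one."""
    return [d for d in short_docs if long_list.contains(d)]


if __name__ == "__main__":
    term_a = [3, 8, 15, 21, 40, 41, 77]
    term_b = SampledList([1, 3, 4, 8, 9, 15, 20, 21, 22, 30, 41, 50, 77, 90], k=4)
    print(intersect(term_a, term_b))
```

In this sketch each lookup costs one binary search over the samples plus at most k - 1 gap additions, so the longer list never has to be decoded from its beginning. A larger k shrinks the auxiliary index but lengthens the sequential decode inside each block.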

History

Start page

137

End page

148

Total pages

12

Outlet

Proceedings of the 14th International Symposium, String Processing and Information Retrieval (SPIRE 2007)

Editors

Nivio Ziviani; Ricardo Baeza-Yates

Name of conference

String Processing and Information Retrieval (SPIRE 2007) - 14th International Symposium

Publisher

Springer-Verlag

Place published

Berlin, Germany

Start date

2007-10-29

End date

2007-10-31

Language

English

Copyright

© 2007 Springer-Verlag Berlin Heidelberg

Former Identifier

2006013170

Esploro creation date

2020-06-22

Fedora creation date

2013-03-24
