RMIT University

Top-k keyword search over probabilistic XML data

conference contribution
posted on 2024-10-31, 18:15 authored by Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang
Despite the proliferation of work on XML keyword queries, supporting keyword queries over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible world semantics must be considered. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA (smallest lowest common ancestor) results with the highest probabilities of existence. We then propose two efficient algorithms. The first, PrStack, finds the k SLCA results with the highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, the second, EagerTopK, relies on a set of pruning properties to quickly discard SLCA candidates that cannot qualify. Finally, we implement both algorithms and compare their performance through an analysis of extensive experimental results.
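To make the "k highest probabilities" objective concrete, the following is a minimal sketch of the final selection step only. It assumes each SLCA candidate already carries a computed probability of existence, and it is not the paper's PrStack or EagerTopK algorithm. The function name `top_k_by_probability`, the Dewey-style node identifiers, and the probability values are all hypothetical, chosen purely for illustration.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k_by_probability(
    candidates: Iterable[Tuple[str, float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k (node_id, probability) pairs with the highest probability.

    Hypothetical helper: assumes the probabilities were computed elsewhere.
    """
    if k <= 0:
        return []
    # Bounded min-heap: the root is always the weakest of the current top k.
    heap: List[Tuple[float, str]] = []
    for node_id, prob in candidates:
        if len(heap) < k:
            heapq.heappush(heap, (prob, node_id))
        elif prob > heap[0][0]:
            # The candidate beats the current k-th best, so it replaces it.
            heapq.heapreplace(heap, (prob, node_id))
        # Otherwise the candidate cannot enter the top k and is skipped.
    return [(node_id, prob) for prob, node_id in sorted(heap, reverse=True)]


# Illustrative (made-up) candidates: Dewey-style ids with probabilities.
example = [("1.1", 0.3), ("1.2.1", 0.72), ("1.3", 0.05), ("1.2.4", 0.5)]
result = top_k_by_probability(example, 2)
```

In this sketch the probability at the heap root acts as a running threshold: any candidate that does not exceed it is skipped without further work. Threshold-based skipping of this kind is a common ingredient of top-k processing in general; the specific pruning properties used by EagerTopK are defined in the paper itself and are not reproduced here.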

History

Start page

673

End page

684

Total pages

12

Outlet

Proceedings of 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011

Editors

A. Kemper and W. Nejdl

Name of conference

2011 IEEE 27th International Conference on Data Engineering, ICDE 2011

Publisher

IEEE

Place published

United States

Start date

2011-04-11

End date

2011-04-16

Language

English

Copyright

© 2011 IEEE

Former Identifier

2006049875

Esploro creation date

2020-06-22

Fedora creation date

2015-01-20
