RMIT University

Fast and space-efficient entity linking in queries

conference contribution
posted on 2024-10-31, 21:03 authored by Roi Blanco Gonzalez, Giuseppe Ottaviano, Edgar Meij
Entity linking deals with identifying entities from a knowledge base in a given piece of text and has become a fundamental building block for web search engines, enabling numerous downstream improvements from better document ranking to enhanced search results pages. A key problem in the context of web search queries is that this process needs to run under severe time constraints as it has to be performed before any actual retrieval takes place, typically within milliseconds. In this paper we propose a probabilistic model that leverages user-generated information on the web to link queries to entities in a knowledge base. There are three key ingredients that make the algorithm fast and space-efficient. First, the linking process ignores any dependencies between the different entity candidates, which allows for an O(k^2) implementation, where k is the number of query terms. Second, we leverage hashing and compression techniques to reduce the memory footprint. Finally, to equip the algorithm with contextual knowledge without sacrificing speed, we factor the distance between distributional semantics of the query words and entities into the model. We show that our solution significantly outperforms several state-of-the-art baselines by more than 14% while being able to process queries in sub-millisecond times, at least two orders of magnitude faster than existing systems.
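
As a rough illustration of why ignoring dependencies between entity candidates keeps the work quadratic in the number of query terms, the sketch below enumerates every contiguous segment of a query and scores each one on its own. It is not the paper's model: the function names, the `TOY_ALIASES` table, and its scores are all hypothetical, and a plain dictionary stands in for the hashed, compressed structures the abstract mentions.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical alias table: surface form -> {entity: score}. The scores are
# made-up numbers for illustration, not probabilities from the paper.
TOY_ALIASES: Dict[str, Dict[str, float]] = {
    "new york": {"New_York_City": 0.9, "New_York_(state)": 0.6},
    "york": {"York": 0.4},
    "pizza": {"Pizza": 0.8},
}


def candidate_segments(terms: List[str]) -> Iterator[Tuple[int, int, str]]:
    """Yield every contiguous span of the query terms.

    A query with k terms has k * (k + 1) / 2 such spans, i.e. O(k^2).
    """
    k = len(terms)
    for i in range(k):
        for j in range(i + 1, k + 1):
            yield i, j, " ".join(terms[i:j])


def link_segments(
    query: str, aliases: Dict[str, Dict[str, float]]
) -> List[Tuple[str, str, float]]:
    """Pick the best-scoring entity for each segment, independently of the others."""
    terms = query.lower().split()
    links = []
    for _, _, text in candidate_segments(terms):
        candidates = aliases.get(text)
        if candidates:
            # No interaction with other segments: each lookup stands alone.
            entity, score = max(candidates.items(), key=lambda kv: kv[1])
            links.append((text, entity, score))
    return links


if __name__ == "__main__":
    for text, entity, score in link_segments("New York pizza", TOY_ALIASES):
        print(f"{text!r} -> {entity} ({score})")
```

Because each segment is scored in isolation, the number of lookups is bounded by the number of segments; choosing a final, non-overlapping set of links would be a separate step that this sketch leaves out.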

Start page

179

End page

188

Total pages

10

Outlet

Proceedings of the 8th ACM International Conference on Web Search and Data Mining (WSDM 2015)

Name of conference

WSDM 2015

Publisher

Association for Computing Machinery

Place published

New York, United States

Start date

2015-01-31

End date

2015-02-06

Language

English

Copyright

© 2016 Association for Computing Machinery (ACM)

Former Identifier

2006077383

Esploro creation date

2020-06-22

Fedora creation date

2017-09-13
