RMIT University

Lightweight multilingual entity extraction and linking

conference contribution
posted on 2024-10-31, 21:15 authored by Aasish Pappu, Roi Blanco Gonzalez, Yashar Mehdad, Amanda Stent, Kapil Thadani
Text analytics systems often rely heavily on detecting entity mentions in documents and linking them to knowledge bases for downstream applications such as sentiment analysis, question answering, and recommender systems. A major challenge for this task is accurately detecting entities in new languages with limited labeled resources. In this paper we present an accurate, lightweight, multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) lightweight named entity recognition with competitive accuracy; 2) candidate entity retrieval that uses search click-log data and entity embeddings to achieve high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA-CoNLL data.

History

Start page

365

End page

374

Total pages

10

Outlet

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM 2017)

Editors

Maarten de Rijke, Milad Shokouhi

Name of conference

WSDM 2017

Publisher

Association for Computing Machinery

Place published

New York, United States

Start date

2017-02-06

End date

2017-02-10

Language

English

Copyright

Copyright © 2017 Association for Computing Machinery (ACM)

Former Identifier

2006077302

Esploro creation date

2020-06-22

Fedora creation date

2017-08-28
