RMIT University
Chinese OOV translation and post-translation query expansion in Chinese-English cross-language information retrieval

journal contribution
posted on 2024-11-01, 01:48 authored by Ying Zhang, Philip Vines, Justin Zobel
Cross-lingual information retrieval allows users to query mixed-language collections or to probe for documents written in an unfamiliar language. A major difficulty for cross-lingual information retrieval is the detection and translation of out-of-vocabulary (OOV) terms; for OOV terms in Chinese, another difficulty is segmentation. At NTCIR-4, we explored methods for translation and disambiguation for OOV terms when using a Chinese query on an English collection. We have developed a new segmentation-free technique for automatic translation of Chinese OOV terms using the web. We have also investigated the effects of distance factor and window size when using a hidden Markov model to provide disambiguation. Our experiments show these methods significantly improve effectiveness; in conjunction with our post-translation query expansion technique, effectiveness approaches that of monolingual retrieval.

Related Materials

  1. ISSN - Is published in 1530-0226

Journal

ACM Transactions on Asian Language Information Processing

Volume

4

Issue

2

Start page

57

End page

77

Total pages

21

Publisher

Association for Computing Machinery

Place published

USA

Language

English

Copyright

© 2005 ACM

Former Identifier

2005000960

Esploro creation date

2020-06-22

Fedora creation date

2009-02-27
