Similarity measures for tracking information flow

conference contribution
posted on 2024-10-30, 14:37 authored by Donald Metzler, Yaniv Bernstein, Bruce Croft, Alistair Moffat, Justin Zobel
Text similarity spans a spectrum, with broad topical similarity near one extreme and document identity at the other. Intermediate levels of similarity (resulting from summarization, paraphrasing, copying, and stronger forms of topical relevance) are useful for applications such as information flow analysis and question-answering tasks. In this paper, we explore mechanisms for measuring such intermediate kinds of similarity, focusing on the task of identifying where a particular piece of information originated. We consider both sentence-to-sentence and document-to-document comparison, and have incorporated these algorithms into RECAP, a prototype information flow analysis tool. Our experimental results with RECAP indicate that new mechanisms such as those we propose are likely to be more appropriate than existing methods for identifying the intermediate forms of similarity.

History

Start page

517

End page

524

Total pages

8

Outlet

Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05)

Editors

A. Chowdhury et al.

Name of conference

ACM International Conference on Information and Knowledge Management

Publisher

ACM

Place published

New York, USA

Start date

2005-10-31

End date

2005-11-05

Language

English

Copyright

Copyright 2005 ACM

Former Identifier

2005001122

Esploro creation date

2020-06-22

Fedora creation date

2011-06-20
