RMIT University

Improving novelty detection for general topics using sentence level information patterns

conference contribution
posted on 2024-10-31, 15:36 authored by Xiaoyan Li, Bruce Croft
The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identification of sentence level information patterns is proposed. First, the information-pattern concept for novelty detection is presented, with emphasis on new information patterns for general topics (queries) that cannot simply be turned into specific questions whose answers are specific named entities (NEs). Then we present a thorough analysis of sentence level information patterns on data from the TREC novelty tracks, including sentence lengths, named entities, and sentence level opinion patterns. This analysis provides guidelines for applying those patterns in novelty detection, particularly for general topics. Finally, a unified pattern-based approach to novelty detection is presented for both general and specific topics, with the focus on the new method for dealing with general topics. Experimental results show that the proposed approach significantly improves the performance of novelty detection for general topics, as well as the overall performance for all topics from the 2002-2004 TREC novelty tracks.

Related Materials

  1. DOI - Is published in 10.1145/1183614.1183652
  2. ISBN - Is published in 9781595934338 (urn:isbn:9781595934338)

Start page

238

End page

247

Total pages

10

Outlet

Proceedings of the 15th ACM Conference on Information and Knowledge Management (CIKM 2006)

Editors

Philip S. Yu, Vassilis J. Tsotras, Edward A. Fox and Bing Liu

Name of conference

15th ACM Conference on Information and Knowledge Management (CIKM 2006)

Publisher

ACM

Place published

New York, USA

Start date

2006-11-06

End date

2006-11-11

Language

English

Copyright

© 2006 ACM

Former Identifier

2006024235

Esploro creation date

2020-06-22

Fedora creation date

2012-11-15
