RMIT University
A probabilistic method for emerging topic tracking in Microblog stream

journal contribution
posted on 2024-11-02, 02:27 authored by Jiajia Huang, Min Peng, Hua Wang, Jinli Cao, Xiuzhen Zhang
Microblogs are a popular and open platform for discovering and sharing the latest news about social issues and daily life. Because microblog streams update quickly, effective tools for monitoring them are urgently needed. Emerging topic tracking is one such tool: it reveals which new events are currently attracting the most online attention. However, because microblog feeds change fast, are noisy, and are short, emerging topic tracking must address two challenges. One is detecting emerging topics early, long before they become hot; the other is effectively monitoring evolving topics over time. In this study, we propose a novel emerging topic tracking method, which aligns emerging word detection from a temporal perspective with coherent topic mining from a spatial perspective. Specifically, we first design a metric to estimate word novelty and fading based on locally weighted linear regression (LWLR), which highlights the novelty of words expressing an emerging topic and suppresses the novelty of words expressing an existing topic. We then track emerging topics by leveraging topic novelty and fading probabilities, which are learnt by formulating and solving an optimization problem. We evaluate our method on a microblog stream containing over one million feeds. Experimental results show promising performance of the proposed method, in terms of both effectiveness and efficiency, in detecting emerging topics and tracking topic evolution over time.
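
The abstract names locally weighted linear regression (LWLR) but does not give the novelty metric itself. As a generic illustration of LWLR only, and not the authors' metric, the sketch below fits a Gaussian-weighted line to a word's per-slot counts and reads off the local slope. The function name `lwlr_fit`, the bandwidth `tau`, and the use of the slope as a rough trend signal are all assumptions made for this example.

```python
import math


def lwlr_fit(xs, ys, x0, tau=1.0):
    """Fit a line to (xs, ys) weighted towards x0.

    Returns (slope, prediction at x0). Hypothetical helper: `tau` is an
    assumed Gaussian kernel bandwidth, not a parameter from the paper.
    """
    # Gaussian kernel: points near x0 get weight close to 1, distant ones near 0.
    ws = [math.exp(-((x - x0) ** 2) / (2.0 * tau ** 2)) for x in xs]
    sw = sum(ws)
    # Weighted means of x and y.
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    # Closed-form weighted least squares for a single predictor.
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my + slope * (x0 - mx)


# Toy data (invented for illustration): counts of one word in six time slots.
slots = [0, 1, 2, 3, 4, 5]
counts = [2, 2, 3, 2, 9, 20]
recent_slope, _ = lwlr_fit(slots, counts, x0=5, tau=1.0)
early_slope, _ = lwlr_fit(slots, counts, x0=1, tau=1.0)
```

With these toy counts, the slope fitted around the latest slot is larger than the slope fitted around the early slots, which is the kind of local trend a novelty score could build on.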

History

Related Materials

  1. DOI - Is published in 10.1007/s11280-016-0390-4
  2. ISSN - Is published in 1386145X

Journal

World Wide Web

Volume

20

Issue

2

Start page

325

End page

350

Total pages

26

Publisher

Springer

Place published

United States

Language

English

Copyright

© 2016 Springer Science+Business Media New York

Former Identifier

2006069251

Esploro creation date

2020-06-22

Fedora creation date

2017-06-07
