RMIT University

Author name disambiguation for ranking and clustering PubMed data using NetClus

conference contribution
posted on 2024-10-31, 16:00 authored by Arvin Varadharajalu, Wei Liu, Wilson Wong
Ranking and clustering of publication databases are often used to discover useful information about research areas. NetClus is an iterative algorithm for clustering heterogeneous star-schema information networks that incorporates the ranking information of individual data types. The algorithm has previously been evaluated using the DBLP database. In this paper, we apply NetClus to PubMed, a free database of articles on life sciences and biomedical topics, to discover key aspects of cancer research. The absence of unique identifiers for authors in PubMed introduces additional challenges. To address this, we introduce an improved author disambiguation technique that combines affiliation string normalisation based on the vector space model with co-author networks. Our technique for disambiguating authors, which offers higher accuracy than existing techniques, significantly improves NetClus clustering results.
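The abstract does not give the details of the disambiguation method, so the following is only a minimal sketch of the general idea it names: comparing affiliation strings as term vectors and using shared co-authors as supporting evidence. The record layout, the function names, the threshold of 0.5, and the "similar affiliation or shared co-author" rule are all assumptions made for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokens(affiliation):
    """Lower-case an affiliation string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", affiliation.lower())


def cosine(a, b):
    """Cosine similarity between two affiliation strings as raw term-count vectors."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def same_author(rec_a, rec_b, threshold=0.5):
    """Hypothetical rule: identical name string, plus either similar
    affiliations or at least one shared co-author."""
    if rec_a["name"] != rec_b["name"]:
        return False
    shared = set(rec_a["coauthors"]) & set(rec_b["coauthors"])
    similar = cosine(rec_a["affiliation"], rec_b["affiliation"]) >= threshold
    return similar or bool(shared)


# Made-up records, for illustration only.
r1 = {"name": "Lee J",
      "affiliation": "School of Computing, Example University",
      "coauthors": ["Smith A"]}
r2 = {"name": "Lee J",
      "affiliation": "Example University, School of Computing",
      "coauthors": []}
r3 = {"name": "Lee J",
      "affiliation": "Department of Oncology, Sample Hospital",
      "coauthors": ["Nguyen T"]}
```

Under these assumptions, `same_author(r1, r2)` holds because the two affiliation strings contain the same terms in a different order, while `same_author(r1, r3)` does not, since the affiliations share only the word "of" and there is no common co-author. A fuller treatment would likely weight terms (for example with TF-IDF) and expand abbreviations before comparison.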

History

Start page

152

End page

161

Total pages

10

Outlet

24th Australasian Joint Conference on Artificial Intelligence (AI2011)

Editors

D. Wang and M. Reynolds

Name of conference

24th Australasian Joint Conference on Artificial Intelligence (AI2011)

Publisher

Springer

Place published

Berlin, Germany

Start date

2011-12-05

End date

2011-12-08

Language

English

Copyright

© Springer-Verlag Berlin Heidelberg 2011

Former Identifier

2006027232

Esploro creation date

2020-06-22

Fedora creation date

2012-08-30
