
Sentence length bias in TREC novelty track judgements

conference contribution
posted on 2024-10-31, 17:09 authored by Lorena Leal Bando, Falk ScholerFalk Scholer, Andrew Turpin
The Cranfield methodology for comparing document ranking systems has also been applied recently to comparing sentence ranking methods, which are used as pre-processors for summary generation methods. In particular, the TREC Novelty track data has been used to assess whether one sentence ranking system is better than another. This paper demonstrates that there is a strong bias in the Novelty track data for relevant sentences to also be longer sentences. Thus, systems that simply choose the longest sentences will often appear to perform better in terms of identifying "relevant" sentences than systems that use other methods. We demonstrate, by example, how this can lead to misleading conclusions about the comparative effectiveness of sentence ranking systems. We then demonstrate that if the Novelty track data is split into subcollections based on sentence length, comparing systems on each of the subcollections leads to conclusions that avoid the bias.
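The stratified comparison described in the abstract can be pictured with a short sketch. The Python fragment below is a minimal illustration, not the paper's evaluation code: the word-count bin thresholds, the precision@k measure, and the longest_first baseline are assumptions introduced here to show how splitting a sentence collection by length and scoring each subcollection separately removes the advantage of simply preferring long sentences.

from collections import defaultdict

def length_bin(sentence, short_max=10, medium_max=20):
    # Assign a sentence to a length bin by word count (thresholds assumed).
    n = len(sentence.split())
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"

def precision_at_k(ranked_ids, relevant_ids, k=5):
    # Fraction of the top-k ranked sentence ids that are judged relevant.
    top = ranked_ids[:k]
    return sum(1 for sid in top if sid in relevant_ids) / max(len(top), 1)

def stratified_evaluation(sentences, relevant_ids, ranker, k=5):
    # Split the collection into subcollections by sentence length, rank each
    # subcollection separately, and report precision@k per length bin.
    bins = defaultdict(dict)
    for sid, text in sentences.items():
        bins[length_bin(text)][sid] = text
    return {name: precision_at_k(ranker(subset), relevant_ids & set(subset), k)
            for name, subset in bins.items()}

def longest_first(subset):
    # A baseline that ranks sentences purely by word count.
    return sorted(subset, key=lambda sid: len(subset[sid].split()), reverse=True)

Within any one bin the sentences have roughly the same length, so a ranker like longest_first can no longer profit from the length bias, and differences between systems on each subcollection reflect something other than sentence length.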

History

Start page

55

End page

61

Total pages

7

Outlet

Proceedings of the 17th Australasian Document Computing Symposium (ADCS 2012)

Editors

Andrew Trotman

Name of conference

ADCS 2012

Publisher

ACM

Place published

New York, NY, USA

Start date

2012-12-05

End date

2012-12-06

Language

English

Copyright

© 2012 ACM

Former Identifier

2006040530

Esploro creation date

2020-06-22

Fedora creation date

2013-04-15
