RMIT University

An in-depth study on diversity evaluation: The importance of intrinsic diversity

journal contribution
posted on 2024-11-02, 04:36 authored by Hai-Tao Yu, Adam Jatowt, Roi Blanco Gonzalez, Hideo Joho, Joemon Jose
Diversified document ranking has been recognized as an effective strategy for handling ambiguous and/or underspecified queries. In this paper, we conduct an in-depth study of diversity evaluation that provides insights for assessing the performance of a diversified retrieval system. By casting the widely used diversity metrics (e.g., ERR-IA, α-nDCG and D#-nDCG) into a unified framework based on marginal utility, we analyze how these metrics capture extrinsic diversity and intrinsic diversity. Our analyses show that these metrics cannot precisely measure intrinsic diversity if a set of subtopics is merely fed into them in the traditional manner (i.e., without fine-grained relevance knowledge per subtopic). Because the redundancy of relevant documents with respect to each specific information need (i.e., subtopic) then cannot be detected and resolved, the overall diversity evaluation may not be reliable. Furthermore, a series of experiments is conducted on a gold standard collection (English and Chinese) and a set of submitted runs, using as references the intent-square metrics, which extend the diversity metrics by incorporating hierarchical subtopics. The experimental results show that, on top-ranked runs, the intent-square metrics disagree with the diversity metrics (ERR-IA and α-nDCG) used in the traditional way, and that the average precision correlation scores between the intent-square metrics and the prior diversity metrics (ERR-IA and α-nDCG) are fairly low. These results support our analyses and reveal the previously unknown importance of intrinsic diversity to the overall diversity evaluation.
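As a rough illustration of the marginal-utility view mentioned in the abstract, the sketch below computes an unnormalized α-DCG score following the commonly cited definition (each repeated subtopic hit is discounted by a factor of 1 − α). It is not taken from the paper: the function name `alpha_dcg`, the flat set-of-subtopics input, and the toy rankings are assumptions made for this example, and the normalization step of α-nDCG is omitted.

```python
import math


def alpha_dcg(ranking, alpha=0.5):
    """Unnormalized alpha-DCG for a ranked list (illustrative sketch).

    ranking: list of sets; each set holds the subtopic ids that the
             document at that rank is judged relevant to (binary labels).
    alpha:   redundancy penalty in [0, 1]; higher means harsher.
    """
    seen = {}  # subtopic id -> number of earlier documents covering it
    score = 0.0
    for rank, subtopics in enumerate(ranking, start=1):
        # Marginal utility: each subtopic's gain shrinks with prior coverage.
        gain = sum((1 - alpha) ** seen.get(s, 0) for s in subtopics)
        score += gain / math.log2(rank + 1)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
    return score


# Hypothetical toy rankings over two subtopics, "a" and "b".
redundant = [{"a"}, {"a"}, {"b"}]
diverse = [{"a"}, {"b"}, {"a"}]

print(alpha_dcg(redundant), alpha_dcg(diverse))
```

With flat binary labels like these, two documents tagged with the same subtopic are treated as interchangeable, so the score cannot tell whether they repeat or complement each other within that subtopic. This is the limitation regarding intrinsic diversity that the abstract describes.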

Related Materials

  1. DOI - Is published in 10.1016/j.ipm.2017.03.001
  2. ISSN - Is published in 03064573

Journal

Information Processing and Management

Volume

53

Issue

4

Start page

799

End page

813

Total pages

15

Publisher

Elsevier

Place published

United Kingdom

Language

English

Copyright

© 2017 Elsevier Ltd

Former Identifier

2006077234

Esploro creation date

2020-06-22

Fedora creation date

2017-08-29
