
Examining the limits of crowdsourcing for relevance assessment

journal contribution
posted on 2024-11-01, 10:49 authored by Paul Clough, Mark Sanderson, Jiayu Tang, Tim Gollins, A Warner
Evaluation is instrumental in developing and managing effective information retrieval systems and in ensuring high levels of user satisfaction. A number of publications have shown that crowdsourcing is a viable way to obtain relevance assessments. Less well understood are the limits of crowdsourcing for the assessment task, particularly for domain-specific search. We present results comparing relevance assessments gathered through crowdsourcing with those gathered from a domain expert for evaluating different search engines in a large government archive. While crowdsourced judgments rank the tested search engines in the same order as expert judgments, crowdsourced workers appear unable to distinguish between different levels of highly accurate search results in the way that expert assessors can. We examine the nature of this limitation in this experiment and discuss the viability of crowdsourcing for evaluating search in specialist settings.


Related Materials

  1. DOI - Is published in 10.1109/MIC.2012.95
  2. ISSN - Is published in 1089-7801

Journal

IEEE Internet Computing

Volume

17

Issue

4

Start page

32

End page

38

Total pages

7

Publisher

IEEE

Place published

United States

Language

English

Copyright

© 2012 IEEE

Former Identifier

2006034659

Esploro creation date

2020-06-22

Fedora creation date

2013-02-11
