On Topic Difficulty in IR Evaluation: The Effect of Systems, Corpora, and System Components
conference contribution
posted on 2024-10-31, 22:17 authored by Fabio Zampieri, Kevin Roitero, Shane Culpepper, Oren Kurland, Stefano Mizzaro
In a test collection setting, topic difficulty can be defined as the average effectiveness of a set of systems for a topic. In this paper, we study how topic difficulty is affected by: (i) the set of retrieval systems; (ii) the underlying document corpus; and (iii) the system components. By generalizing methods recently proposed for system component factor analysis, we perform a comprehensive analysis of topic difficulty and the relative effects of systems, corpora, and component interactions. Our findings show that corpora have the most significant effect on topic difficulty.
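The definition of topic difficulty given in the abstract can be illustrated with a minimal sketch. The function name `topic_difficulty`, the system and topic identifiers, and the scores below are hypothetical and not taken from the paper; the sketch assumes per-topic effectiveness scores (for example, average precision) are already available for each system.

```python
from statistics import mean


def topic_difficulty(scores):
    """Return the mean effectiveness per topic across all systems.

    `scores` maps a system name to a dict of topic id -> effectiveness.
    This input layout is an assumption made for illustration only.
    """
    # Collect every topic that at least one system was scored on.
    topics = set()
    for per_topic in scores.values():
        topics.update(per_topic)

    # Average over the systems that have a score for the topic.
    return {
        topic: mean(s[topic] for s in scores.values() if topic in s)
        for topic in sorted(topics)
    }


# Hypothetical scores for two systems on two topics.
scores = {
    "sysA": {"t1": 0.25, "t2": 0.125},
    "sysB": {"t1": 0.75, "t2": 0.375},
}
difficulty = topic_difficulty(scores)
```

Under this definition, a lower average indicates a harder topic, and changing the set of systems in `scores` changes the resulting values, which is the first of the three effects the paper studies.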
Funding
Trajectory data processing: Spatial computing meets information retrieval