RMIT University

Evaluating engaging clarification questions in information retrieval

Download (20.33 MB)
thesis
posted on 2024-11-25, 19:15 authored by Leila Tavakoli
Information-seeking systems for natural language questions often encounter grammatically complex queries presented in unpredictable ways. Users frequently need to rephrase their questions to obtain a satisfactory answer, which can be both demanding and time-consuming. One solution to this challenge is to ask a clarification question when a query is intricate or ambiguous. It is widely acknowledged that if a search system can ask clarification questions to better understand the user’s intention, the chances of retrieving a satisfactory answer are higher. While clarification plays a vital role in conversational and interactive information-seeking systems, previous studies have indicated that users do not readily engage with clarification questions despite their positive impact. To improve the performance of such systems, it is crucial to employ evaluation methods that account for user behaviour and the characteristics of engaging clarification questions. Currently, there is limited understanding of clarification questions from a user’s perspective, particularly of what makes a clarification question engaging. This understanding is crucial because a clarification question is only valuable when the user actively engages with it.

To address these knowledge gaps, we conduct a series of experiments to analyse user behaviour when interacting with clarifications on various information-seeking platforms. Our initial analysis focuses on human-generated clarification questions, to gain insights into how they are employed to disambiguate queries and better understand information needs. By identifying the most useful clarification questions, we analyse their characteristics in terms of types and patterns, comparing them with non-useful clarifications. Our analysis reveals that the most useful clarification questions exhibit consistent patterns across different topics.

Next, we expand our study to clarification questions in search engines by examining the MIMICS dataset, the only available dataset containing real search clarifications, including information about user engagement and the quality of clarification questions. This phase of the research investigates the task of identifying the most engaging clarification question among multiple clarifications generated for a given query in a search engine. Where multiple clarification questions are available, we frame this task as a learning-to-rank (LTR) problem, drawing on signals such as the query itself, the clarification questions, the candidate answers, and search engine results page (SERP) information (sketched in code below). Furthermore, we demonstrate the scarcity of query-clarification pairs in the dataset that carry both online and offline evaluations, which impedes drawing robust conclusions about the impact of online and offline evaluations on search clarification and about identifying the most engaging clarification panes from a user’s perspective.

Our experiments unveil the limitations of the MIMICS dataset for search clarification, motivating us to introduce a new search clarification dataset, MIMICS-Duo, in the subsequent phase. Building upon MIMICS, MIMICS-Duo facilitates multi-dimensional evaluation of search clarification. The dataset encompasses 306 search queries accompanied by multiple clarifications, fine-grained annotations on clarification questions (including quality and aspect labels), and offline ratings.
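To illustrate the LTR formulation above, the sketch below ranks candidate clarifications per query with LightGBM's LambdaMART-style ranker. It is a minimal sketch under assumed inputs: the placeholder features, engagement labels, and group sizes are hypothetical stand-ins for the query, clarification, candidate-answer, and SERP signals described in the abstract, not the thesis's actual feature set.

    import numpy as np
    import lightgbm as lgb

    # Hypothetical features for each (query, clarification) pair, e.g. query
    # length, clarification length, overlap with SERP snippets, answer count.
    X = np.random.rand(12, 4)            # 12 pairs, 4 placeholder features
    y = np.random.randint(0, 3, 12)      # graded engagement labels (0 = low, 2 = high)
    groups = [3, 4, 5]                   # clarifications per query, for 3 queries

    # LambdaMART-style ranker: learns to order clarifications within each query.
    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50, min_child_samples=1)
    ranker.fit(X, y, group=groups)

    # At inference time, score all clarifications generated for one query and
    # surface the highest-scoring pane as the most engaging candidate.
    scores = ranker.predict(X[:3])
    most_engaging = int(scores.argmax())

Offline quality labels of the kind MIMICS-Duo provides could be appended to X as additional features, which is the role human labelling plays in the experiments described next.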
Using the MIMICS-Duo dataset, we further explore the task of identifying the most engaging clarification question for a given query and extensively investigate the relationship between online and offline evaluations, an area that has been largely unexplored in the existing literature. In contrast to the prevailing belief that offline evaluations are inadequate for supporting online evaluations, we observe that the two align when it comes to identifying the most engaging clarification question among the multiple clarifications generated for a given query. We also investigate how query length and low uncertainty in the online evaluation affect the relationship between offline and online evaluations. In addition, we explore the impact of human labelling on the performance of large language models (LLMs) and LTR models in identifying the most engaging clarification questions from the user’s point of view; this is achieved by incorporating offline evaluations as input features. We show that LTR models do not outperform individual offline labels; GPT, however, stands out as the top performer, surpassing all LTR models and offline labels.

Finally, we explore how incorporating different modalities into search clarification can enhance user engagement with clarification questions. A multi-modal clarification approach combines multiple media types, such as text and images, to refine and enhance search results. We investigate user preferences regarding the modality of clarification and demonstrate that, in most cases, users prefer multi-modal clarifications over those using a single modality. Additionally, we explore the task of automatically generating corresponding images and show that text-to-image generation systems such as Stable Diffusion can be used to generate multi-modal clarification questions (see the sketch below).

In conclusion, this research focuses on understanding what makes a clarification question engaging from a user’s perspective, emphasising that user engagement is necessary to derive value from these questions. Overall, these findings contribute to the advancement of information-seeking systems and provide insights into user behaviour and the characteristics of engaging clarification questions.
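As a rough illustration of the text-to-image step mentioned above, the sketch below renders an image for each candidate answer of a clarification question using the open-source diffusers library. The checkpoint name, example query, and prompt template are illustrative assumptions, not the exact setup used in the thesis.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a publicly available Stable Diffusion checkpoint (illustrative choice).
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    query = "jaguar"  # hypothetical ambiguous query
    candidate_answers = ["the animal jaguar", "the Jaguar car brand"]

    # Pair each candidate answer with a generated image, turning a text-only
    # clarification pane into a multi-modal one.
    for i, answer in enumerate(candidate_answers):
        image = pipe(f"a photo of {answer}").images[0]
        image.save(f"clarification_option_{i}.png")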

History

Degree Type

Doctorate by Research

Imprint Date

2023-01-01

School name

School of Computing Technologies, RMIT University

Former Identifier

9922270808601341

Open access

  • Yes