The spread of misinformation online significantly impacts public discourse and decision-making across many domains. This thesis develops deep learning models, particularly novel transfer learning techniques, for the automatic detection of misinformation on social media platforms. Our first three studies focus on detecting rumours: unverified claims that circulate on social media. Briefly, we investigate (1) how to adapt a rumour detection system from one language to another without training data in the target language; (2) how to improve rumour detection by incorporating not only the content of rumours but also the conversation structure of crowd responses and user relations; and (3) how to develop rumour detection systems that are interpretable and reveal the factors contributing to a classification outcome. Our fourth
study looks at detecting state-sponsored trolls, malicious actors who aim to manipulate public opinion by spreading misleading or inflammatory content on social media. We explore few-shot learning techniques to develop troll detection systems that can effectively identify these users with limited training data.
Most existing rumour detection models are tailored to a single language or a specific event, raising the question of how to transfer knowledge effectively from one language or event to another. Our first study therefore examines two aspects of rumour detection: cross-domain transfer in the context of the COVID-19 infodemic, and cross-lingual transfer. First, we explore applying rumour detection systems to new topics or events: we train our rumour detection system on a set of topics and events and then apply it to the COVID-19 pandemic. Second, we propose a cross-lingual transfer learning framework that uses multilingual pretrained language models (PLMs) as the backbone to adapt knowledge from a source language to a target language. We find that rumour detection systems can be applied to novel events or domains with some success, and that including user responses is especially important.
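The cross-lingual transfer idea can be sketched in miniature. Everything below is an illustrative assumption: the shared embedding table stands in for the representation space of a multilingual PLM, and the centroid classifier stands in for a fine-tuned detection head trained on source-language data only, then applied zero-shot to the target language.

```python
# Toy sketch of zero-shot cross-lingual transfer: a classifier is trained
# only on source-language (English) examples and applied to a target
# language (German) via a shared multilingual embedding space.
# All vectors and vocabulary here are hypothetical.

SHARED_EMBEDDINGS = {
    # aligned embeddings: translation pairs share (approximately) one vector
    "hoax": (1.0, 0.0), "falschmeldung": (1.0, 0.0),
    "confirmed": (0.0, 1.0), "bestaetigt": (0.0, 1.0),
}

def embed(text):
    """Average the shared embeddings of known words (bag-of-embeddings)."""
    vecs = [SHARED_EMBEDDINGS[w] for w in text.lower().split()
            if w in SHARED_EMBEDDINGS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))

def train_centroids(examples):
    """Fit per-class centroids on source-language data only."""
    sums = {}
    for text, label in examples:
        v = embed(text)
        s, n = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = (tuple(a + b for a, b in zip(s, v)), n + 1)
    return {lab: tuple(a / n for a in s) for lab, (s, n) in sums.items()}

def classify(text, centroids):
    """Assign the class whose centroid is nearest in the shared space."""
    v = embed(text)
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(v, centroids[lab])))

# Train on English rumour labels only ...
centroids = train_centroids([("this is a hoax", "rumour"),
                             ("report confirmed", "non-rumour")])
# ... then classify a German post zero-shot through the shared space.
label = classify("eine falschmeldung", centroids)  # "rumour"
```

Because the two languages share one embedding space, no target-language training data is needed; in practice this role is played by a multilingual PLM rather than a hand-aligned lookup table.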
Current rumour detection systems typically rely on either the textual content of a post or its propagation dynamics, but how to combine these signals effectively remains unclear. In our second study, we propose a hybrid model that fuses content features, encoded with pretrained language models, with graph networks that encode the dynamic conversation structure of crowd responses and user relations.
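A minimal sketch of this kind of fusion, under stated assumptions: the crude lexical features below stand in for a PLM encoder, the one-hop mean aggregation stands in for a graph neural network layer over the reply tree, and the hand-set weights stand in for a trained classification layer.

```python
# Hypothetical hybrid fusion sketch: text features of the source post are
# concatenated with a one-hop graph aggregation over its replies, then
# scored linearly. Feature extractors and weights are illustrative only.

def text_features(post):
    # stand-in for a PLM encoder: crude lexical cues
    t = post.lower()
    return [float("fake" in t or "hoax" in t), float(len(t.split()))]

def graph_aggregate(node, replies, feats):
    """One-hop mean aggregation over a node's replies (a single GNN-style layer)."""
    neigh = replies.get(node, [])
    if not neigh:
        return [0.0] * len(feats[node])
    cols = zip(*(feats[r] for r in neigh))
    return [sum(c) / len(neigh) for c in cols]

def fused_score(source, posts, replies, weights):
    feats = {pid: text_features(text) for pid, text in posts.items()}
    fused = feats[source] + graph_aggregate(source, replies, feats)  # concatenation
    return sum(w * x for w, x in zip(weights, fused))

posts = {"p0": "breaking: miracle cure found",
         "p1": "this is fake",
         "p2": "hoax, see snopes"}
replies = {"p0": ["p1", "p2"]}   # conversation structure: p1 and p2 reply to p0
w = [1.0, 0.0, 2.0, 0.0]         # hand-set weights for illustration
score = fused_score("p0", posts, replies, w)
```

The point of the sketch is the fusion step: the source post's own content and the aggregated crowd response each contribute their own slice of the fused vector, so the scorer can weight them independently.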
While deep learning approaches have greatly improved the performance of automatic rumour detection, the interpretability of these models remains a concern: how a model arrives at a classification decision is opaque. Our third study adapts causal mediation analysis, a method grounded in causal inference, to explain the decision-making processes of rumour detection systems. Our findings indicate that causal mediation analysis can uncover the key comments that drive a model's decision, and that these align with human judgement, demonstrating a novel way to add interpretability and transparency to deep learning-based rumour detection systems.
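The intervention at the heart of this analysis can be illustrated with a toy: each comment's contribution is measured by replacing it with a neutral placeholder and recording how much the detector's rumour score changes. The scorer below is a hypothetical stand-in for a trained model, not the thesis's actual system.

```python
# Illustrative mediation-style attribution sketch (all names hypothetical):
# intervene on one comment at a time and measure the change in output.

def rumour_score(comments):
    """Stand-in detector: fraction of comments expressing doubt."""
    doubt = ("fake", "false", "doubt", "source?")
    flags = [any(d in c.lower() for d in doubt) for c in comments]
    return sum(flags) / len(flags) if flags else 0.0

def comment_effects(comments, baseline="[neutral]"):
    """Effect of comment i = score(original) - score(with comment i intervened on)."""
    full = rumour_score(comments)
    effects = []
    for i in range(len(comments)):
        intervened = comments[:i] + [baseline] + comments[i + 1:]
        effects.append(full - rumour_score(intervened))
    return effects

comments = ["source? looks fake", "i agree", "totally false"]
effects = comment_effects(comments)
# comments with a positive effect are the ones driving the rumour decision
key_comments = [c for c, e in zip(comments, effects) if e > 0]
```

Here the two sceptical comments receive positive effects while the neutral one receives zero, mirroring how the analysis surfaces the comments most responsible for a classification.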
In our last study, we focus on detecting state-sponsored trolls, key operatives in social media influence campaigns. Current troll detection models, often trained on data from specific, known campaigns, struggle to adapt to new and unseen influence campaigns. To address this problem, we propose a meta-learning troll detection framework that learns campaign-specific knowledge from minimal data and stores it in parameters that are not overwritten as the detection system is continually updated for new campaigns. Our detection system relies primarily on the textual content a troll posts as its core detection signal, but we further extend the framework to multimodal troll detection that incorporates images, broadening its capacity and effectiveness in combating misinformation.