Adapting and Enhancing AI Tools
Exploiting disinformation signals and datasets to improve AI tools
State of the art: Annotated training datasets are essential for deploying machine learning systems, but their construction is costly and time-consuming. The TITAN coach will exploit disinformation signals to engage the citizen, but the accuracy of these signals is not critical, because the citizen can provide feedback to correct inaccurate signals. Although the performance requirement is therefore low, the availability of this feedback allows TITAN to enhance existing tools and to prototype tools for new signals. Over the years, the research community has led various dataset generation initiatives for benchmarking purposes in different settings (ranging from science to news to crime and healthcare/COVID-19 misinformation), and has also assembled comprehensive data repositories. Indicative datasets cover short statements (e.g., LIAR, FEVER), posts (e.g., BuzzFeedNews, BuzzFace, PHEME), and entire articles (e.g., FakeNewsNet).
Challenge: TITAN exploits user feedback as a low-to-minimal supervision signal to create labelled datasets. These datasets can be used to adapt and enhance existing tools integrated into TITAN’s ecosystem, to support the piloting and evaluation of novel disinformation signals, and to stimulate further research in the area of disinformation, all without manual supervision beyond user feedback.
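As an illustration of how user feedback could be turned into labelled examples, the sketch below aggregates per-item feedback by majority vote and keeps only items with enough votes and enough agreement. The aggregation rule, the function name `aggregate_feedback`, the thresholds `min_votes` and `min_agreement`, and the label names are all assumptions made for this example; the proposal does not prescribe a specific aggregation method.

```python
from collections import Counter


def aggregate_feedback(votes, min_votes=3, min_agreement=0.7):
    """Turn raw user feedback into weak labels (hypothetical helper).

    votes: iterable of (item_id, label) pairs, one per user response.
    Returns {item_id: label} for items that received at least `min_votes`
    responses and whose most frequent label reaches `min_agreement`.
    Both thresholds are illustrative assumptions.
    """
    # Group the feedback labels by the item they refer to.
    by_item = {}
    for item_id, label in votes:
        by_item.setdefault(item_id, []).append(label)

    labelled = {}
    for item_id, labels in by_item.items():
        if len(labels) < min_votes:
            continue  # too little feedback to trust
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= min_agreement:
            labelled[item_id] = top_label
    return labelled


if __name__ == "__main__":
    feedback = (
        [("a", "misleading")] * 3
        + [("a", "accurate")]
        + [("b", "accurate"), ("b", "misleading")]
        + [("c", "accurate")] * 3
    )
    print(aggregate_feedback(feedback))
```

Items that fail either threshold are simply left unlabelled here, so they remain available for further feedback rather than entering the dataset with a noisy label.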
Going beyond: In TITAN we exploit the synergy between self-supervision and active learning. Active learning can achieve better performance with fewer labelled examples when the machine learning algorithm is allowed to select the data it learns from. TITAN will enhance the accuracy and application scope of existing tools by generalising them beyond their original training, relying on large-scale user feedback and streams of fact-checked content. Special attention will be given to the argument mining/argument resolution tools, to expand their coverage and accuracy, relying on both user feedback (from individual citizens or from collaborating citizens) and active learning (more specifically Active Transfer Learning, which has been successfully applied in the acquisition of argument mining corpora).
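The idea of letting the algorithm select the data it learns from can be sketched with uncertainty sampling, one common active learning query strategy. This is a minimal example under stated assumptions: it is plain entropy-based uncertainty sampling, not Active Transfer Learning, and the names `entropy` and `select_queries` and the probability values are hypothetical. The proposal does not specify which query strategy TITAN would use.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs, k):
    """Pick the k unlabelled items the model is least certain about.

    pool_probs: {item_id: [p_class0, p_class1, ...]} holding the current
    model's predicted class probabilities for each unlabelled item.
    Returns item ids ordered from most to least uncertain.
    """
    ranked = sorted(
        pool_probs,
        key=lambda item_id: entropy(pool_probs[item_id]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Illustrative probabilities only, not real model output.
    pool = {"x": [0.5, 0.5], "y": [0.9, 0.1], "z": [0.6, 0.4]}
    print(select_queries(pool, 2))
```

In a full loop, the selected items would be shown to citizens for feedback, the resulting labels added to the training set, and the model retrained before the next round of selection.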