Democratizing BERT classifier fine-tuning and more: ActiveTigger, an open-source collaborative text annotation tool for the computational social sciences
Abstract
ActiveTigger is an open-source tool for collaborative text annotation, designed for computational social scientists. Developed at CREST by the CSS@IPP group led by Etienne Ollion (https://www.css.cnrs.fr/active-tigger), it aims to facilitate the annotation of text corpora through a user-friendly collaborative interface, backed by machine learning features: training classifiers, fine-tuning language models (BERT), evaluating performance, and running predictions on large datasets. In particular, it implements active learning as an intermediary step between human annotation and machine learning. This presentation will introduce the current state of the tool, which is close to its first stable release, through a live demo, and then discuss the roadmap ahead. Since the project's inception, new questions have emerged, notably around the growing ubiquity of generative models, prompting us to rethink evolving use cases and what a robust workflow for the social sciences should look like.
About this workshop
The aim of this workshop is to promote technical and practical exchanges between researchers who use NLP methods. Participants are encouraged to go into the details of their code (R/Python), share tips, and discover new methods and models.
Schedule: Thursdays from 12:15 to 13:30, by videoconference.
To attend, please fill out the form.