Projects & News
Past and ongoing research projects, along with occasional updates.
Modelling Social Dimensions, Perspectivist Style
Two new papers on bringing a perspectivist lens to social meaning in NLP. In "PISCO: Social Dimensions from a Perspectivist Lens" we model social dimensions of language while embracing, rather than collapsing, disagreement between annotators. In "Learning Perspectivist Social Meaning via Demographic-Conditioned Fusion Embeddings" we propose demographic-conditioned fusion embeddings that let models represent how different social groups perceive the same text, instead of forcing a single "ground truth" label.
Penelope: Emotions in NLP
A multidisciplinary project bringing together philosophy, psychology and NLP to explore emotion analysis and related tasks. We surveyed the field's recent work, focusing on the frameworks used in NLP research, and showed that large language models exhibit significant gender bias in emotion attribution: they not only assign different emotions to men and women given the same events, but do so in line with existing stereotypes, overwhelmingly attributing anger to men and sadness to women.
Thinking About Harms, Offensiveness and Hate Speech
NLP research has increasingly embraced the notion of annotator subjectivity, motivated by variation in labelling. This approach treats each annotator's view as valid, which can be well suited to tasks that embed subjectivity, such as sentiment analysis. We argue this construction is inappropriate for tasks like hate speech detection, since it affords equal validity to all positions on, e.g., sexism or racism. Conflating hate and offence can invalidate findings on hate speech — future work should be situated in theory, disentangling the two.
Responding to Sexual Harassment
My thesis focused on abuse detection and mitigation in dialogue systems, particularly conversational agents with female personas. Most commercial conversational AI assistants are feminised, which can reinforce negative stereotypes of women as subservient, because these systems often produce submissive or sexualised responses to abusive prompts.
Let's Chat Ethics (podcast)
I co-host a podcast with Oriana covering all things tech and ethics, featuring guests from across the sector. Listen on Spotify.
Social Class in NLP
Since the foundational work of William Labov on the social stratification of language, linguistics has made concerted efforts to explore the links between socio-demographic characteristics and language production and perception. But while there is strong evidence that socio-demographic characteristics are reflected in language, they are infrequently used in NLP. Age and gender are reasonably well represented, but Labov's original target, socioeconomic status, is noticeably absent. We show empirically that NLP disadvantages less-privileged socioeconomic groups, and argue for closing this gap.