Can Large Language Models (or Humans) Disentangle Text?
Paper in proceedings, 2024

We investigate the potential of large language models (LLMs) to disentangle text variables—to remove the textual traces of an undesired (forbidden) variable, a task sometimes known as text distillation that is closely related to the fairness in AI and causal inference literatures. We employ a range of LLM approaches in an attempt to disentangle text by identifying and removing information about a target variable while preserving other relevant signals. We show that in the strong test of removing sentiment, the statistical association between the processed text and sentiment is still detectable to machine learning classifiers after LLM-based disentanglement. Furthermore, we find that human annotators also struggle to disentangle sentiment while preserving other semantic content. This suggests there may be limited separability between concept variables in some text contexts, highlighting limitations of methods relying on text-level transformations and also raising questions about the robustness of disentanglement methods that achieve statistical independence in representation space.
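The abstract's central test—checking whether sentiment remains statistically detectable in processed text—can be sketched with a simple probe classifier. This is not the paper's code; the texts, labels, and model choice below are illustrative placeholders, using scikit-learn's TF-IDF features and logistic regression as the detector.

```python
# Hedged sketch (not the paper's implementation): probe whether sentiment
# is still predictable from "disentangled" text with a simple classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical LLM-processed texts with their original sentiment labels.
texts = [
    "The plot moves quickly and the ending ties together.",
    "The film runs two hours and covers three decades.",
    "The pacing drags and scenes repeat without purpose.",
    "Dialogue feels stilted and the editing is choppy.",
    "Strong performances anchor an otherwise quiet story.",
    "The soundtrack overwhelms every conversation.",
] * 5  # repeated to give cross-validation enough samples
labels = [1, 1, 0, 0, 1, 0] * 5  # 1 = originally positive, 0 = negative

# If sentiment had been fully removed, cross-validated accuracy should
# sit near chance (0.5); accuracy well above chance means the forbidden
# variable still leaves detectable traces in the text.
features = TfidfVectorizer().fit_transform(texts)
scores = cross_val_score(LogisticRegression(), features, labels, cv=5)
print(f"mean probe accuracy: {scores.mean():.2f}")
```

In practice the probe would be trained on held-out processed documents, and a stronger detector (e.g., a fine-tuned transformer) gives a tighter test of residual association.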

disentanglement

computational social science

representations

natural language processing

large language models

machine learning

Authors

Nicolas Pietro Marie Audinet De Pieuchon

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Adel Daoud

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Connor T. Jerzak

The University of Texas at Austin

Moa Johansson

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Richard Johansson

Chalmers, Computer Science and Engineering (Chalmers), Data Science

Proceedings of the Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024)

57-67
979-8-89176-112-4 (ISBN)

Sixth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS 2024)
Mexico City, Mexico

Subject Categories

Language Technology (Computational Linguistics)

DOI

10.18653/v1/2024.nlpcss-1.5

More information

Created

12/3/2024