The ontological politics of synthetic data: Normalities, outliers, and intersectional hallucinations
Journal article, 2025

Synthetic data is increasingly used as a substitute for real data for ethical, legal, and logistical reasons. However, the rise of synthetic data also raises critical questions about its entanglement with the politics of classification and the reproduction of social norms and categories. This paper problematizes the use of synthetic data by examining how its production is intertwined with the maintenance of certain worldviews and classifications. We argue that synthetic data, like real data, is embedded with societal biases and power structures, leading to the reproduction of existing social inequalities. Through empirical examples, we demonstrate that synthetic data tends to highlight majority elements as the “normal” and minimize minority elements, and that the slight changes to data structures involved in creating synthetic data inevitably result in what we term “intersectional hallucinations.” These hallucinations are inherent to synthetic data and cannot be entirely eliminated without compromising the purpose of creating synthetic datasets. We contend that decisions about synthetic data involve determining which intersections are essential and which can be disregarded, a practice that imbues these decisions with norms and values. Our study underscores the need for critical engagement with the mathematical and statistical choices in synthetic data production and advocates for careful consideration of the ontological and political implications of these choices during curatorial-style production of synthetic structured data.

intersectionality

data ethics

classification

data bias

synthetic structured data

ontological politics

Authors

Francis Lee

Linköping University

Chalmers, Technology Management and Economics, Science, Technology and Society

Saghi Hajisharif

Linköping University

Ericka Johnson

Linköping University

Big Data and Society

20539517 (eISSN)

Vol. 12, Issue 2

Subject Categories (SSIF 2025)

Social and Economic Geography

DOI

10.1177/20539517251318289

More information

Latest update

4/23/2025