Interpretable Text Classification Applied to the Detection of LLM-Generated Creative Writing
Paper in proceeding, 2026

We consider the problem of distinguishing human-written creative fiction (excerpts from novels) from similar text generated by an LLM. Our results show that, while human observers perform poorly (near chance levels) on this binary classification task, a variety of machine-learning models achieve accuracy in the range 0.93-0.98 on a previously unseen test set, even when using only short samples and single-token (unigram) features. We therefore employ an inherently interpretable (linear) classifier, with a test accuracy of 0.98, to elucidate the underlying reasons for this high accuracy. In our analysis, we identify specific unigram features indicative of LLM-generated text. One of the most important is that the LLM tends to use a larger variety of synonyms, thereby skewing the probability distributions in a manner that is easy for a machine-learning classifier to detect, yet very difficult for a human observer. Four additional explanation categories were also identified, namely temporal drift, Americanisms, foreign language usage, and colloquialisms. As identification of the AI-generated text depends on a constellation of such features, the classification appears robust, and therefore not easy to circumvent by malicious actors intent on misrepresenting AI-generated text as human work.
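As a rough illustration of the general idea in the abstract (unigram features, a linear classifier, and inspection of the learned weights), the following minimal sketch may help. It is not the authors' implementation: the toy sentences, the whitespace tokeniser, the choice of logistic regression trained by stochastic gradient descent, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter

# Toy, invented snippets (NOT data from the paper). Label 1 = LLM, 0 = human.
TRAIN = [
    ("the moon shimmered and glimmered over the gleaming tapestry", 1),
    ("a symphony of whispers echoed and resonated through the tapestry", 1),
    ("her eyes glimmered as the whispers shimmered in the dusk", 1),
    ("he said he was tired and went to bed", 0),
    ("she said nothing and looked at the door", 0),
    ("they went to the door and he said goodbye", 0),
]


def unigrams(text):
    """Lowercase whitespace tokenisation into unigram counts (an assumption)."""
    return Counter(text.lower().split())


def score(weights, bias, counts):
    """Linear score: bias plus the weighted sum of unigram counts."""
    return bias + sum(weights.get(tok, 0.0) * c for tok, c in counts.items())


def train_logreg(data, epochs=200, lr=0.1):
    """Fit logistic regression on unigram counts with plain SGD."""
    weights, bias = {}, 0.0
    feats = [(unigrams(text), label) for text, label in data]
    for _ in range(epochs):
        for counts, label in feats:
            p = 1.0 / (1.0 + math.exp(-score(weights, bias, counts)))
            err = label - p  # gradient of the log-likelihood w.r.t. the score
            bias += lr * err
            for tok, c in counts.items():
                weights[tok] = weights.get(tok, 0.0) + lr * err * c
    return weights, bias


def predict_proba(weights, bias, text):
    """Estimated probability that the text belongs to class 1 (LLM)."""
    return 1.0 / (1.0 + math.exp(-score(weights, bias, unigrams(text))))


def top_features(weights, k=5):
    """Tokens with the largest positive (LLM) and negative (human) weights."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1])
    return ranked[-k:][::-1], ranked[:k]


WEIGHTS, BIAS = train_logreg(TRAIN)

if __name__ == "__main__":
    llm_like, human_like = top_features(WEIGHTS)
    print("Most LLM-indicative unigrams:", llm_like)
    print("Most human-indicative unigrams:", human_like)
```

Because the model is linear, each unigram's weight can be read directly as its contribution to the decision, which is the property that makes this kind of classifier inherently interpretable. On real data one would use a held-out test set and a proper tokeniser; the toy data here is only meant to make the mechanics visible.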

Text Classification

Creative Fiction

Interpretable AI

Plagiarism Detection

Authors

Minerva Suvanto

Vehicle Engineering and Autonomous Systems

Andrea McGlinchey

Edinburgh Napier University

Mattias Wahde

Vehicle Engineering and Autonomous Systems

Peter J. Barclay

Edinburgh Napier University

International Conference on Agents and Artificial Intelligence

2184-3589 (ISSN), 2184-433X (eISSN)

Vol. 2, pp. 1198-1209
9789897587962 (ISBN)

18th International Conference on Agents and Artificial Intelligence, ICAART 2026
Marbella, Spain

Subject Categories (SSIF 2025)

Language Processing and Computational Linguistics

DOI

10.5220/0014236000004052

More information

Last updated

2026-06-22