Human versus GPT-4 in qualitative analysis: A comparative reanalysis of patient interview data following anterior cruciate ligament injury rehabilitation
Journal article, 2026
Results: While the human analysis produced one overarching theme supported by three main categories and nine sub-categories, GPT-4's analysis resulted in four themes, six main categories, and 15 sub-categories. Both analyses captured uncertainty and the impact of knee-related symptoms. GPT-4's results showed a suspiciously equal distribution of codes across sub-categories and introduced a theme not grounded in the source data. Multiple prompts were required to produce and organize the material.
Conclusion: The analyses performed by humans and by GPT-4 showed both similarities and differences. Using GPT-4 in its present form for qualitative analysis is challenging and requires several steps. Currently, GPT-4 should not be used as the only tool in a qualitative analysis of interview data.
Qualitative research
Language processing
Rehabilitation
Authors
Ramana Piussi
Sahlgrenska universitetssjukhuset
Göteborgs universitet
Justin Schneiderman
Göteborgs universitet
Yinan Yu
Chalmers, Data- och informationsteknik, Funktionell programmering
Kristian Samuelsson
Sahlgrenska universitetssjukhuset
Göteborgs universitet
Eric Hamrin Senorski
Göteborgs universitet
Sahlgrenska universitetssjukhuset
Knee
0968-0160 (ISSN) 1873-5800 (eISSN)
Vol. 60 104388
Subject categories (SSIF 2025)
Orthopaedics
Artificial intelligence
DOI
10.1016/j.knee.2026.104388
PubMed
41707572