TY - GEN
T1 - The Consistent Lack of Variance of Psychological Factors Expressed by LLMs and Spambots
AU - Varadarajan, Vasudha
AU - Giorgi, Salvatore
AU - Mangalik, Siddharth
AU - Soni, Nikita
AU - Markowitz, David M.
AU - Schwartz, H. Andrew
N1 - Publisher Copyright: © 2025 International Conference on Computational Linguistics.
PY - 2025
Y1 - 2025
AB - In recent years, the proliferation of chatbots like ChatGPT and Claude has led to an increasing volume of AI-generated text. While the text itself is convincingly coherent and human-like, the variety of expressed human attributes may still be limited. Using theoretical individual differences, the fundamental psychological traits which distinguish people, this study reveals a distinctive characteristic of such content: AI generations exhibit remarkably limited variation in inferable psychological traits compared to human-authored texts. We present a review and study across multiple datasets spanning various domains. We find that AI-generated text consistently models the authorship of an "average" human with so little variation that, in aggregate, it is clearly distinguishable from human-written texts using unsupervised methods (i.e., without using ground truth labels). Our results show that (1) fundamental human traits can accurately distinguish human- and machine-generated text and (2) current generation capabilities fail to capture a diverse range of human traits.
UR - https://www.scopus.com/pages/publications/105000116271
M3 - Conference contribution
T3 - Proceedings - International Conference on Computational Linguistics, COLING
SP - 111
EP - 119
BT - GenAIDetect 2025 - Proceedings of the 1st Workshop on GenAI Content Detection, Proceedings of the Workshop - 31st International Conference on Computational Linguistics, COLING 2025
A2 - Alam, Firoj
A2 - Nakov, Preslav
A2 - Habash, Nizar
A2 - Gurevych, Iryna
A2 - Chowdhury, Shammur
A2 - Shelmanov, Artem
A2 - Wang, Yuxia
A2 - Artemova, Ekaterina
A2 - Kutlu, Mucahid
A2 - Mikros, George
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on GenAI Content Detection, GenAIDetect 2025
Y2 - 19 January 2025
ER -