De-Biasing Large Language Models: Fine-Tuning with In-Group Texts for Enhanced Sociocultural Representations

Friday, 11 July 2025: 12:00
Location: FSE024 (Faculty of Education Sciences (FSE))
Oral Presentation
Sukru ATSIZELTI, Koç University, Turkey
Ali HÜRRIYETOĞLU, Koç University, Turkey
Erdem YORUK, Koç University, Turkey
Fırat DURUŞAN, Koç University, Turkey
Fuat KINA, Marmara University, Turkey
Melih Can YARDI, Koç University, Turkey
Şule TAN, Bogazici University, Turkey
Large language model-based simulation studies rest on the assumption that these models reflect real societal patterns (Argyle et al., 2023). However, such studies face significant challenges: they tend to essentialize groups by attributing uniform decisions to them, to erase in-group minority perspectives, and to misrepresent the groups they simulate. One likely source of these problems is the training data: texts about a group are often written by out-group members and may therefore carry prevalent biases and prejudices (Wang et al., 2024). Our previous attempt to address this problem by incorporating in-group-written texts into the prompts produced limited results (Barkhordar & Atsızelti, 2024). This study aims to reduce bias in LLM-based simulations by fine-tuning open models, such as LLaMA 3.2, LLaMA 3.1, Phi 3.5, Gemma 2, Mistral Small, Mistral v0.3, and Zephyr, on texts written by in-group members, drawing on preference-optimization methods such as ORPO and DPO and on tooling such as Ollama for local deployment. As in-group texts, we will use our gold-standard annotated corpus of Turkish ideology tweets, and we will develop LLMs fine-tuned for specific ideologies. To evaluate the quality of the outputs generated by these models, we will use three methods: a) simulations based on a survey we designed last year, b) a semantic axis constructed from antonym word pairs, onto which model outputs are projected, and c) evaluations of the generated texts against the ideology annotation manual produced by the team.
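To make the fine-tuning step concrete, the following is a minimal sketch of training one ideology-specific adapter on in-group tweets. The base model, file name, column names, and hyperparameters are illustrative placeholders rather than the project's settled configuration, and the trl API varies somewhat across library versions.

```python
# Minimal sketch: LoRA fine-tuning of one ideology-specific model on in-group tweets.
# Assumptions: a CSV of gold-standard annotated tweets with hypothetical "text" and
# "ideology" columns; model name, label, and hyperparameters are illustrative only.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

BASE_MODEL = "meta-llama/Llama-3.1-8B"   # any of the open models named above
IDEOLOGY = "example_ideology"            # hypothetical ideology label

# Keep only tweets written by in-group members of the target ideology.
tweets = load_dataset("csv", data_files="ingroup_tweets.csv")["train"]
tweets = tweets.filter(lambda row: row["ideology"] == IDEOLOGY)

# One lightweight LoRA adapter per ideology avoids a full model copy per group.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model=BASE_MODEL,
    train_dataset=tweets,
    peft_config=lora,
    args=SFTConfig(
        output_dir=f"llama-3.1-{IDEOLOGY}",
        dataset_text_field="text",
        per_device_train_batch_size=4,
        num_train_epochs=2,
        learning_rate=2e-4,
    ),
)
trainer.train()
trainer.save_model(f"llama-3.1-{IDEOLOGY}")
```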
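For evaluation method (b), the sketch below shows one common way to build such an axis in the style of SemAxis-type approaches: embed antonym word pairs, take the mean difference vector between the two poles as the axis, and project generated texts onto it. The embedding model and the antonym pairs are hypothetical examples, not the team's actual choices.

```python
# Minimal sketch: projecting model outputs onto an antonym word-pair axis.
# Assumptions: sentence-transformers embeddings; axis words are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Antonym pairs defining the two poles of the axis (hypothetical examples).
pairs = [("progressive", "traditional"), ("secular", "religious")]

# The axis is the normalized mean difference vector between the poles.
pos = encoder.encode([p for p, _ in pairs])
neg = encoder.encode([n for _, n in pairs])
axis = (pos - neg).mean(axis=0)
axis /= np.linalg.norm(axis)

def axis_score(texts):
    """Cosine projection of each text onto the antonym axis."""
    emb = encoder.encode(texts)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return emb @ axis

# Compare base-model and ideology-tuned outputs on the same prompts;
# a shift along the axis indicates a change in how the group is represented.
print(axis_score(["Example output from the fine-tuned model."]))
```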

References

Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337-351.

Barkhordar, E., & Atsızelti, Ş. (2024). Assessing the predictive power of social media data-fed large language models on voter behavior. In WebSci Companion ’24, Stuttgart, Germany. ACM. https://doi.org/10.1145/3630744.3659831.

Wang, A., Morgenstern, J., & Dickerson, J. P. (2024). Large language models cannot replace human participants because they cannot portray identity groups. arXiv preprint arXiv:2402.01908.