Understanding Social Media. Use of Machine Learning (ML) in Qualitative Data Analysis.
The purpose of this article is to present the process of automating the coding of social media texts. Automating this process makes it possible to apply quantitative techniques to qualitative analysis: corpora of hundreds of thousands of texts can be analyzed on the basis of their meaning. Machine learning (ML) algorithms make this possible.
We present an example: a project on identifying hate speech in texts from Polish online forums. The first step is to gather the largest possible database of texts using keywords. This step was carried out with commercial text-collection tools.
The key issue is the precise conceptualization and operationalization of the individual research categories. This makes it possible to prepare specific instructions and to conduct coder training, which in turn yields higher rates of inter-coder agreement. The manually coded texts are then used as training data for automated categorization methods based on ML algorithms.
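Inter-coder agreement can be quantified in several ways. As an illustration only, the sketch below computes Cohen's kappa for two coders; the article does not state which agreement measure was used, so the choice of kappa, the function name cohen_kappa, and the toy labels are all assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same texts.

    Hypothetical helper for illustration; not the measure reported
    in the article, which does not name one.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of texts")
    n = len(labels_a)
    # Observed agreement: share of texts given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each coder's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label for every text.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy labels (invented for this sketch, not data from the project).
coder_1 = ["hate", "neutral", "neutral", "hate", "neutral", "neutral"]
coder_2 = ["hate", "neutral", "hate", "hate", "neutral", "neutral"]
kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Unlike raw percentage agreement, kappa corrects for the agreement two coders would reach by chance, which matters when one category (such as non-hate texts) dominates the corpus.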
We then describe the course of machine coding. The article also seeks to identify problems associated with the automatic coding of hate speech and to propose solutions. In summary, we point out the factors that are crucial to a research process that uses machine learning.
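The machine coding step, in which manually coded texts serve as training data, can be sketched in a few lines. The example below is a minimal multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. The article does not specify which ML algorithm was used, so the algorithm, the function names (tokenize, train_naive_bayes, predict), and the four toy texts are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic whitespace tokenizer; real forum texts would need more care.
    return text.lower().split()


def train_naive_bayes(texts, labels):
    """Count documents per class and words per class from coded texts."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-posterior for a new text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (invented for this sketch, not from the project corpus).
train_texts = [
    "they are vermin and should leave",
    "those people are vermin",
    "the match was great yesterday",
    "great weather for a walk",
]
train_labels = ["hate", "hate", "neutral", "neutral"]
model = train_naive_bayes(train_texts, train_labels)
print(predict(model, "people like them are vermin"))
print(predict(model, "great match"))
```

A word-count model of this kind is only a baseline: it cannot capture irony, context, or coded vocabulary, which are among the reasons automatic coding of hate speech is difficult.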