Paper information - Content analysis of big qualitative data

Content analysis of big qualitative data

Working with big data in science (research databanks, literature reviews) and everyday life (news aggregators) creates a need to mine, classify and store information, where information is defined as data in a processed form. Content analysis in its various forms, qualitative (manual coding), quantitative (word frequencies and co-occurrences) and mixed methods (creation of ad hoc dictionaries based on substitution), offers a tool to address this need. Interest in content analysis emerged as early as the 1970s, yet the methodology remains relatively unknown outside sociology, linguistics and communication studies. Content analysis converts qualitative data (texts, images) into a digital format (vectors and matrices), which can then be manipulated using linear algebra, multidimensional scaling and other tools from the natural sciences. The conversion into a digital format also paves the way to machine learning. Supervised machine learning looks particularly promising because it keeps the focus on the interpretation of data that is characteristic of interpretative sociology, and it is compatible with mixed methods content analysis. The existing programs for computer-assisted content analysis (QDA Miner, Atlas TI, NVivo etc.) have several limitations, one of which is a restriction on the number of users (coders). On-line platforms for content analysis make it possible to bypass this and some other limitations. The idea of creating an on-line databank for qualitative data, together with a platform for analyzing its content, is discussed. In contrast to quantitative data, qualitative research data is rarely available for secondary analysis.
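The conversion of texts into vectors and matrices mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names (`tokenize`, `term_document_matrix`, `cooccurrence`) and the three sample sentences are hypothetical, and the tokenizer is a deliberately naive assumption. The sketch counts word frequencies per document (one vector per text) and counts how many documents contain each pair of words (co-occurrences).

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens (naive assumption)."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def term_document_matrix(docs):
    """Return the sorted vocabulary and one word-frequency vector per document."""
    vocab = sorted({word for doc in docs for word in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[word] for word in vocab])
    return vocab, rows


def cooccurrence(docs):
    """Count, for each unordered word pair, the number of documents containing both."""
    pairs = Counter()
    for doc in docs:
        unique_words = sorted(set(tokenize(doc)))
        for a, b in combinations(unique_words, 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical sample texts, invented for illustration only.
docs = [
    "Coders read the interview.",
    "Coders code the interview text.",
    "The text is coded.",
]

vocab, matrix = term_document_matrix(docs)
pairs = cooccurrence(docs)

print(vocab)
for row in matrix:
    print(row)
print(pairs[("coders", "interview")])
```

Each row of `matrix` is a document expressed as a vector over the shared vocabulary, which is the form that linear algebra, multidimensional scaling or a supervised classifier would take as input. A real project would likely add stemming, stop-word removal and a coder-defined dictionary, none of which is shown here.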

Anton Oleinik