Subjects on objects in contexts: Using GICA method to quantify epistemological subjectivity

A substantial amount of subjectivity is involved in how people use language and conceptualize the world. Computational methods and formal representations of knowledge usually neglect this kind of individual variation. We have developed a novel method, Grounded Intersubjective Concept Analysis (GICA), for the analysis and visualization of individual differences in language use and conceptualization. The GICA method first uses a conceptual survey or a text mining step to elicit, from varied groups of individuals, the particular ways in which they use terms and the associated concepts. The subsequent analysis and visualization reveal potential underlying groupings of subjects, objects and contexts. One way of viewing the GICA method is to compare it with traditional word space models. In word space models, such as latent semantic analysis (LSA), statistical analysis of word-context matrices reveals latent information. A common approach is to analyze term-document matrices. The GICA method extends the basic idea of traditional term-document matrix analysis with a third dimension for different individuals. This leads to a third-order tensor of size subjects × objects × contexts. Once flattened into a matrix, these subject-object-context (SOC) tensors can be analyzed using various computational methods, including principal component analysis (PCA), singular value decomposition (SVD), independent component analysis (ICA) or any existing or future method suitable for analyzing high-dimensional data sets. To demonstrate the use of the GICA method, we present the results of two case studies. In the first, a GICA of health-related concepts is conducted. In the second, the State of the Union addresses by US presidents are analyzed.
In these case studies, we apply multidimensional scaling (MDS), the self-organizing map (SOM) and the Neighborhood Retrieval Visualizer (NeRV) as specific data analysis methods within the overall GICA method. The GICA method can be used, for instance, to support the education of heterogeneous audiences, public planning processes and participatory design, conflict resolution, environmental problem solving, interprofessional and interdisciplinary communication, product development processes, mergers of organizations, and the building of enhanced knowledge representations for the Semantic Web.
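
The flattening of a subjects × objects × contexts tensor into a matrix, followed by a standard dimensionality reduction, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the tensor holds random placeholder ratings, the choice of unfolding (one row per subject-object pair, one column per context) is an assumption, and the helper names `flatten_soc` and `pca_project` are hypothetical.

```python
import numpy as np


def flatten_soc(tensor):
    """Unfold a subjects x objects x contexts tensor into a 2-D matrix.

    Assumed layout: each row is one (subject, object) pair and each
    column is one context. Row index = subject * n_objects + object.
    """
    n_subjects, n_objects, n_contexts = tensor.shape
    return tensor.reshape(n_subjects * n_objects, n_contexts)


def pca_project(matrix, n_components=2):
    """Project the rows onto the leading principal components via SVD."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Placeholder data: 4 subjects rate 3 objects in 5 contexts on a 1..5 scale.
rng = np.random.default_rng(0)
soc = rng.integers(1, 6, size=(4, 3, 5)).astype(float)

flat = flatten_soc(soc)          # shape (12, 5)
coords = pca_project(flat, 2)    # shape (12, 2), one point per subject-object pair
```

Each row of `coords` places one subject's view of one object in a two-dimensional space, so points for the same object that lie far apart would suggest that subjects use the corresponding term differently. PCA is used here only because it is compact; MDS, the SOM or NeRV could be applied to the same flattened matrix.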