Visualizations of Textual Data
Publisher Summary: This chapter discusses the use of correspondence analysis (CA) to visualize the profiles of a series of texts, such as literary texts, documents, and responses to open questions grouped into artificial texts. It also discusses textual data and meta-information in detail. Meta-information, or meta-data, is the information concerning a data matrix that does not appear in the matrix itself, and it is particularly abundant in the case of textual data. This meta-information, which is relatively easy to formalize, is used routinely to check and clean files, to carry out consistency tests in processing survey data, and in the context of information retrieval. The chapter presents three approaches that bring out the main differences between responses and texts without any need for preprocessing or precoding: (1) visualization of proximities between words and categories through correspondence analysis, (2) selection of characteristic words, and (3) selection of modal responses. It gives a brief description of lemmatized analyses, homography, disambiguation, and the numeric coding of text. It also describes the concepts of a frequency threshold for words, the grouping of responses, correspondence analysis of the lexical table, and characteristic words and modal responses.
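As a rough illustration of the pipeline summarized above (grouping responses, applying a frequency threshold, and running a correspondence analysis of the lexical table), the following minimal sketch may help. It is not the chapter's own implementation. The toy responses, the function names `lexical_table` and `correspondence_analysis`, the whitespace tokenization, and the threshold value are all assumptions made for illustration, and no lemmatization or disambiguation is attempted. The CA itself is computed in the standard way, as a singular value decomposition of the standardized residuals of the table.

```python
from collections import Counter

import numpy as np


def lexical_table(texts_by_group, min_freq=2):
    """Build a words x groups count table, keeping words with total count >= min_freq."""
    groups = sorted(texts_by_group)
    # Naive tokenization (assumption): lowercase, split on whitespace.
    counts = {
        g: Counter(w for text in texts_by_group[g] for w in text.lower().split())
        for g in groups
    }
    totals = Counter()
    for c in counts.values():
        totals.update(c)
    words = sorted(w for w, n in totals.items() if n >= min_freq)
    table = np.array([[counts[g][w] for g in groups] for w in words], dtype=float)
    return words, groups, table


def correspondence_analysis(table):
    """Return principal coordinates of rows and columns, and the principal inertias.

    Assumes every row and every column of the table has a positive total.
    """
    P = table / table.sum()      # correspondence matrix
    r = P.sum(axis=1)            # row masses (words)
    c = P.sum(axis=0)            # column masses (groups)
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2


# Hypothetical open-ended responses, grouped by an invented age category.
responses = {
    "young": ["money and job", "job security money", "friends and job"],
    "middle": ["family health money", "health and family", "family job"],
    "old": ["health health family", "peace and health", "family peace"],
}

words, groups, table = lexical_table(responses, min_freq=2)
row_coords, col_coords, inertia = correspondence_analysis(table)

# The first two columns of row_coords and col_coords give the planar map
# on which words and groups can be plotted together.
for w, xy in zip(words, row_coords[:, :2]):
    print(f"{w:>8s}  {xy[0]: .3f}  {xy[1]: .3f}")
for g, xy in zip(groups, col_coords[:, :2]):
    print(f"{g:>8s}  {xy[0]: .3f}  {xy[1]: .3f}")
```

In this sketch the threshold removes words that occur only once across all responses, which keeps rare words from dominating the axes. A real analysis would normally use a much larger corpus and a higher threshold.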