Compatibility between Text Mining and Qualitative Research in the Perspectives of Grounded Theory, Content Analysis, and Reliability

The objective of this article is to illustrate that text mining and qualitative research are epistemologically compatible. First, like many qualitative research approaches, such as grounded theory, text mining encourages open-mindedness and discourages preconceptions. Contrary to the popular belief that text mining is a linear and fully automated procedure, the text miner may add, delete, and revise the initial categories in an iterative fashion. Second, text mining is similar to content analysis, which also aims to extract common themes and threads by counting words. Although both utilize computer algorithms, text mining is distinguished by its capability of processing natural language. Last, the criteria for sound text mining align with those of qualitative research in terms of consistency and replicability.

Key Words: Text Mining, Content Analysis, Exploratory Data Analysis, Natural Language Processing, Computational Linguistics, Grounded Theory, Reliability, and Validity
