Topic models: a novel method for modeling couple and family text data.

Couple and family researchers often collect open-ended linguistic data, either through free-response questionnaire items or through transcripts of interviews or therapy sessions. Because participants' responses are not forced into a set number of categories, text-based data can be very rich and revealing of psychological processes. At the same time, such data are highly unstructured and challenging to analyze. Within family psychology, analyzing text data typically means applying a coding system, which can quantify text data but also has several limitations, including the time needed for coding, difficulties with interrater reliability, and the need to define a priori what should be coded. The current article presents an alternative method for analyzing text data, topic models (Steyvers & Griffiths, 2006), which has not yet been applied within couple and family psychology. Topic models resemble factor analysis and cluster analysis in that they identify underlying clusters of words with semantic similarities (i.e., the "topics"). In the present article, a nontechnical introduction to topic models is provided, highlighting how these models can be used for text exploration and indexing (e.g., quickly locating text passages that share semantic meaning) and how output from topic models can be used to predict behavioral codes or other types of outcomes. Throughout the article, a collection of transcripts from a large couple-therapy trial (Christensen et al., 2004) is used as example data to highlight potential applications. Practical resources for learning more about topic models and how to apply them are discussed.
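To make the idea of "topics" as clusters of co-occurring words more concrete, the following is a minimal sketch of one common topic model, latent Dirichlet allocation, fit by collapsed Gibbs sampling in plain Python. It is an illustration only and is not the article's own implementation: the function name `fit_lda`, the hyperparameter values, and the four toy "documents" are assumptions invented for this example. A real analysis would use full transcripts and an established package.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; all defaults are assumptions.
    Returns (theta, top_words): per-document topic proportions and the
    most frequent words currently assigned to each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # total words per topic
    assignments = []

    # Start from a random topic assignment for every word token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document (each row sums to 1).
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words


# Invented toy "transcripts"; not data from the couple-therapy trial.
docs = [
    "money budget spend bills money".split(),
    "bills budget money debt spend".split(),
    "kids school bedtime homework kids".split(),
    "homework school kids bedtime chores".split(),
]
theta, top_words = fit_lda(docs)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, proportions in enumerate(theta):
    print(f"doc {d}: {[round(p, 2) for p in proportions]}")
```

The per-document proportions in `theta` are the kind of output that could serve as predictors of behavioral codes or other outcomes in a downstream regression or classifier; that step is not shown here.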

[1] Zellig S. Harris, et al. Distributional Structure, 1954.

[2] Martin F. Porter, et al. An algorithm for suffix stripping, 1997, Program.

[3] M. Knapp, et al. Couples' personal idioms: exploring intimate talk, 1981, The Journal of communication.

[4] G. Birchler. Marital Therapy: Strategies Based on Social Learning and Behavior Exchange Principles, 1981.

[5] N. Jacobson, et al. Marital Therapy, 1986.

[6] J. Gottman, et al. Marital interaction and satisfaction: a longitudinal view, 1989, Journal of consulting and clinical psychology.

[7] T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, 1997.

[8] M. F. Porter, et al. An algorithm for suffix stripping, 1997.

[9] N. Jacobson, et al. Acceptance and Change in Couple Therapy: A Therapist's Guide to Transforming Relationships, 1998.

[10] R. Heyman, et al. Observation of couple conflicts: clinical assessment applications, stubborn truths, and shaky foundations, 2001, Psychological assessment.

[11] Hinrich Schütze, et al. Book Reviews: Foundations of Statistical Natural Language Processing, 1999, CL.

[12] Michael I. Jordan, et al. Latent Dirichlet Allocation, 2001, J. Mach. Learn. Res.

[13] David C. Atkins, et al. Traditional versus integrative behavioral couple therapy for significantly and chronically distressed married couples, 2004, Journal of consulting and clinical psychology.

[14] Thomas L. Griffiths, et al. Integrating Topics and Syntax, 2004, NIPS.

[15] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2004.

[16] Patricia K. Kerig, et al. Couple observational coding systems, 2004.

[17] Mark Steyvers, et al. Finding scientific topics, 2004, Proceedings of the National Academy of Sciences of the United States of America.

[18] Lawrence Carin, et al. Sparse multinomial logistic regression: fast algorithms and generalization bounds, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] David C. Atkins, et al. Couple and individual adjustment for 2 years following a randomized clinical trial comparing traditional versus integrative behavioral couple therapy, 2006, Journal of consulting and clinical psychology.

[20] B. Tabachnick, et al. Using multivariate statistics, 5th ed., 2007.

[21] Mark Steyvers, et al. Topics in semantic representation, 2007, Psychological review.

[22] Thomas L. Griffiths, et al. Probabilistic Topic Models, 2007.

[23] Acceptance and Change, 2007.

[24] David M. Blei, et al. Supervised Topic Models, 2007, NIPS.

[25] Andrew Christensen, et al. Observed communication and associations with satisfaction during traditional and integrative behavioral couple therapy, 2008, Behavior therapy.

[26] Kurt Hornik, et al. Text Mining Infrastructure in R, 2008.

[27] Andrew McCallum, et al. Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression, 2008, UAI.

[28] Robert Tibshirani, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition, 2001, Springer Series in Statistics.

[29] Andrew McCallum, et al. Rethinking LDA: Why Priors Matter, 2009, NIPS.

[30] Ramesh Nallapati, et al. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora, 2009, EMNLP.

[31] David C. Atkins, et al. Marital status and satisfaction five years following a randomized clinical trial comparing traditional versus integrative behavioral couple therapy, 2010, Journal of consulting and clinical psychology.

[32] David M. Blei, et al. Introduction to Probabilistic Topic Models, 2010.

[33] David M. Blei, et al. Probabilistic topic models, 2012, Commun. ACM.

[34] Kurt Hornik, et al. topicmodels: An R Package for Fitting Topic Models, 2016.

[35] Timothy N. Rubin, et al. Statistical topic models for multi-label document classification, 2011, Machine Learning.

[36] Panayiotis G. Georgiou, et al. "That's Aggravating, Very Aggravating": Is It Possible to Classify Behaviors in Couple Interactions Using Automatically Derived Lexical Features?, 2011, ACII.

[37] F. Ramseyer, et al. Nonverbal synchrony in psychotherapy: coordinated body movement reflects relationship quality and outcome, 2011, Journal of consulting and clinical psychology.

[38] Athanasios Katsamanis, et al. Toward automating a human behavioral coding system for married couples' interactions using speech acoustic features, 2013, Speech Commun.