Data mining in psychological treatment research: a primer on classification and regression trees.

Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: These methods take a more inductive approach to building a model than traditional regression does, giving the data a greater role in suggesting the relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and to discover interactions between predictors without the need to anticipate and specify these in advance, making them well suited to revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered.
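
As a rough illustration of how a tree chooses its first split, the sketch below searches every predictor and candidate threshold for the binary split that most reduces Gini impurity, one common splitting criterion for classification trees. It is a minimal sketch, not the procedure used in the primer: the function names (`gini`, `best_split`), the predictor names (`baseline_severity`, `age`), and the six toy cases are all hypothetical and are not taken from the clinical trial data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (impurity reduction, predictor, threshold) for the best binary split."""
    n = len(labels)
    parent = gini(labels)
    best = None
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[name] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[name] > threshold]
            # Weighted average impurity of the two child nodes.
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if best is None or gain > best[0]:
                best = (gain, name, threshold)
    return best


# Hypothetical toy data: two predictors and a binary treatment outcome.
rows = [
    {"baseline_severity": 40, "age": 25},
    {"baseline_severity": 45, "age": 52},
    {"baseline_severity": 50, "age": 31},
    {"baseline_severity": 70, "age": 28},
    {"baseline_severity": 75, "age": 47},
    {"baseline_severity": 80, "age": 35},
]
labels = ["responder", "responder", "responder",
          "nonresponder", "nonresponder", "nonresponder"]

gain, predictor, threshold = best_split(rows, labels)
print(predictor, threshold, round(gain, 3))
```

A full tree would apply the same search recursively within each child node and then stop or prune according to some rule. That recursion is what lets interactions emerge without being specified in advance, because a later split on one predictor applies only within the subgroup defined by an earlier split on another.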
