Binary trees for dissimilarity data

Binary segmentation procedures (in particular, classification and regression trees) are extended to study the relation between dissimilarity data and a set of explanatory variables. The proposed split criterion is flexible and can be applied to a wide range of data (e.g., multiple responses of mixed types, longitudinal data, sequence data). It can also be shown to extend well-established criteria introduced in the literature on binary trees.
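The abstract does not state the split criterion itself, so the following is only a minimal sketch under an assumption: node impurity is taken to be the sum of pairwise dissimilarities within a node divided by the node size, and a split is chosen to maximize the reduction in that impurity. The function names `node_impurity` and `best_split` and the toy data are hypothetical and not from the paper. With squared Euclidean distances this impurity equals the within-node sum of squares used in regression trees, which illustrates the sense in which a dissimilarity-based criterion can extend an established one.

```python
from itertools import combinations


def node_impurity(d, members):
    """Assumed impurity: sum of pairwise dissimilarities in the node / node size."""
    n = len(members)
    if n == 0:
        return 0.0
    return sum(d[i][j] for i, j in combinations(members, 2)) / n


def best_split(d, x):
    """Scan thresholds on one numeric predictor x; return (threshold, gain)
    for the binary split with the largest impurity reduction, or None."""
    idx = list(range(len(x)))
    parent = node_impurity(d, idx)
    best = None
    # The largest observed value is skipped: it would leave the right child empty.
    for t in sorted(set(x))[:-1]:
        left = [i for i in idx if x[i] <= t]
        right = [i for i in idx if x[i] > t]
        gain = parent - node_impurity(d, left) - node_impurity(d, right)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best


# Toy data (hypothetical): one response y, one predictor x.
y = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]

# Dissimilarity matrix: squared Euclidean distance between responses.
d = [[(a - b) ** 2 for b in y] for a in y]

threshold, gain = best_split(d, x)

# With squared Euclidean distances, the impurity of the root node equals
# the usual sum of squared deviations from the mean.
mean_y = sum(y) / len(y)
within_ss = sum((v - mean_y) ** 2 for v in y)
root_impurity = node_impurity(d, list(range(len(y))))
```

Because the functions only read the matrix `d`, any other dissimilarity (for example, one defined on sequences or mixed-type responses) could be substituted without changing the search.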
