Analysing a quality-of-life survey by using a coclustering model for ordinal data and some dynamic implications

The dataset that motivated this work is a psychological survey of women affected by a breast tumour. At different stages of their treatment, patients replied to questionnaires with answers on an ordinal scale. The questions relate to aspects of their life referred to as “dimensions”. To assist psychologists in analysing the results, it is useful to highlight the structure of the dataset. Clustering achieves this by creating groups of individuals, each described by a representative of the group. From a psychological standpoint, it is also useful to observe how questions may be clustered. The simultaneous clustering of both patients and questions is called “co-clustering”. However, placing questions in the same group when they do not relate to the same dimension makes no sense from a psychological perspective. Therefore, constrained co-clustering was performed to prevent questions from different dimensions from being placed in the same column-cluster. The evolution of the co-clusters over time was then investigated. The method uses a constrained Latent Block Model that embeds a probability distribution for ordinal data. Parameter estimation relies on a stochastic EM algorithm associated with a Gibbs sampler, and the ICL-BIC criterion is used to select the number of co-clusters.
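The column-cluster constraint described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `violating_clusters`, the question identifiers, and the dimension labels are all hypothetical, chosen only to show what it means for a column partition to respect the dimensions.

```python
from collections import defaultdict


def violating_clusters(question_dimension, column_cluster):
    """Return the column-clusters that mix questions from several dimensions.

    question_dimension: dict mapping each question to its dimension label.
    column_cluster: dict mapping each question to a column-cluster index.
    Both structures are illustrative assumptions, not taken from the paper.
    """
    dims_per_cluster = defaultdict(set)
    for question, cluster in column_cluster.items():
        dims_per_cluster[cluster].add(question_dimension[question])
    # A cluster violates the constraint if it covers more than one dimension.
    return {c: sorted(d) for c, d in dims_per_cluster.items() if len(d) > 1}


# Hypothetical questions and dimensions, for illustration only.
question_dimension = {"q1": "anxiety", "q2": "anxiety", "q3": "social", "q4": "social"}

# Each column-cluster stays within one dimension: the constraint holds.
valid = {"q1": 0, "q2": 1, "q3": 2, "q4": 2}

# Column-cluster 0 mixes "anxiety" and "social" questions: the constraint fails.
invalid = {"q1": 0, "q2": 0, "q3": 0, "q4": 1}

print(violating_clusters(question_dimension, valid))
print(violating_clusters(question_dimension, invalid))
```

Note that a dimension may still be split across several column-clusters, as in `valid`; the constraint only forbids a single column-cluster from spanning more than one dimension.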
