An ensemble of ordered logistic regression and random forest for child garment size matching

A fit assessment study of child garments is carried out. An original ensemble of ordered logistic regression and random forest for garment size assignment is proposed. Two new measures for understanding data with multivariate random forest are proposed. Promising results have been achieved with our methodology, especially when compared with the current use of size charts.

Size fitting is a significant problem for online garment shops, where return rates due to size misfit are very high. We propose an ensemble of ordered logistic regression and random forest (RF), with an original definition of the weights, for solving the size matching problem, which requires classifying ordinal data. These two classifiers are good candidates for combined use because of their complementary characteristics. A multivariate response (an ordered factor and a numeric value assessing the fit) was modeled with a conditional random forest. A fit assessment study was carried out with 113 children, who were measured with a 3D body scanner to obtain their anthropometric measurements. The children tried on different garments in different sizes, and an expert assessed the fit. Promising results have been achieved with our methodology. Two new measures based on RF with multivariate responses have been introduced to gain a better understanding of the data. The first is an intervention in prediction measure, defined both locally and globally. We show that it is a good alternative to variable importance measures and that it can be used for new observations and with multivariate responses. The second tool reports the typicality of a case and makes it possible to identify archetypical observations in each class.
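The general idea of combining two classifiers over ordered classes can be sketched as a weighted average of their class probabilities. This is only an illustration under stated assumptions: the abstract does not give the paper's weight definition, so the weights here are plain placeholder numbers, and the fit labels, function names, and probability values are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical ordered fit classes; the study's actual factor levels are not given here.
FIT_LABELS = ["too small", "right fit", "too large"]


def combine_probabilities(
    p_olr: Sequence[float],
    p_rf: Sequence[float],
    w_olr: float,
    w_rf: float,
) -> List[float]:
    """Weighted average of two class-probability vectors.

    p_olr: probabilities from an ordered logistic regression (assumed given).
    p_rf:  probabilities from a random forest (assumed given).
    w_olr, w_rf: non-negative placeholder weights, NOT the paper's definition.
    """
    if len(p_olr) != len(p_rf):
        raise ValueError("both classifiers must score the same classes")
    total = w_olr + w_rf
    if w_olr < 0 or w_rf < 0 or total <= 0:
        raise ValueError("weights must be non-negative and not both zero")
    return [(w_olr * a + w_rf * b) / total for a, b in zip(p_olr, p_rf)]


def predict_label(probs: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest combined probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


# Made-up probabilities for one child and one garment size.
example_olr = [0.2, 0.5, 0.3]
example_rf = [0.1, 0.3, 0.6]

equal_weights = combine_probabilities(example_olr, example_rf, 0.5, 0.5)
olr_favoured = combine_probabilities(example_olr, example_rf, 3.0, 1.0)

if __name__ == "__main__":
    print(predict_label(equal_weights, FIT_LABELS))
    print(predict_label(olr_favoured, FIT_LABELS))
```

With equal weights the random forest's strong vote for the last class wins, while weighting the ordered logistic regression more heavily moves the decision to the middle class. This is the sense in which the choice of weights shapes the ensemble's decision.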
