Required Sample Sizes for Data-Driven Market Segmentation Analyses in Tourism

Data analysts in industry and academia make heavy use of market segmentation analysis to develop tourism knowledge and select commercially attractive target segments. Within academic research alone, approximately 5% of published articles use market segmentation. However, the validity of data-driven market segmentation analyses depends on a sample of adequate size, and no guidance exists for determining what an adequate sample size is. The present simulation study uses artificial data of known structure to analyze how the difficulty of the segmentation task affects the required sample size, as a function of the number of variables in the segmentation base. Under all simulated data conditions, a sample size of 70 times the number of variables proves to be adequate. This finding is of substantial practical importance because it provides guidance to data analysts in academia and industry who wish to conduct reliable and valid segmentation studies.
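
The rule of thumb stated above (a sample of 70 times the number of variables in the segmentation base) can be sketched as a small check. This is a minimal illustration only: the function names `min_sample_size` and `is_sample_adequate` are hypothetical and not taken from the study, and the factor of 70 is the only value drawn from the abstract.

```python
def min_sample_size(n_variables: int, factor: int = 70) -> int:
    """Return the sample size suggested by the 70-times-variables rule.

    `factor` defaults to 70, the multiplier reported in the abstract;
    the function name and signature are illustrative assumptions.
    """
    if n_variables < 1:
        raise ValueError("n_variables must be a positive integer")
    return factor * n_variables


def is_sample_adequate(n_respondents: int, n_variables: int, factor: int = 70) -> bool:
    """Check whether a sample meets the rule for a given segmentation base."""
    return n_respondents >= min_sample_size(n_variables, factor)


if __name__ == "__main__":
    # Example: a segmentation base of 10 survey items.
    print(min_sample_size(10))          # 70 * 10
    print(is_sample_adequate(500, 10))  # 500 respondents against the rule
```

For a segmentation base of 10 variables, the rule suggests at least 700 respondents, so a sample of 500 would fall short under this guideline.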
