Prediction-Oriented Model Selection in Partial Least Squares Path Modeling

Partial least squares path modeling (PLS-PM) has become popular in various disciplines for modeling structural relationships among latent variables measured by manifest variables. To fully benefit from the predictive capabilities of PLS-PM, researchers must understand the efficacy of the predictive metrics they use. In this research, we compare the performance of standard PLS-PM criteria and model selection criteria derived from information theory in selecting the best predictive model among a cohort of competing models. We use Monte Carlo simulation to study this question under various sample sizes, effect sizes, item loadings, and model setups. Specifically, we explore whether, and when, in-sample measures such as the model selection criteria can substitute for out-of-sample criteria that require a holdout sample. Such a substitution is advantageous when the overall sample is small and setting aside a holdout would cause a considerable loss of statistical and predictive power. We find that when the researcher does not have the luxury of a holdout sample, and the goal is to select correctly specified models with low prediction error, the in-sample model selection criteria, in particular the Bayesian Information Criterion (BIC) and the Geweke-Meese Criterion (GM), are useful substitutes for out-of-sample criteria. When a holdout sample is available, the best performing out-of-sample criteria include the root mean squared error (RMSE) and the mean absolute deviation (MAD). We recommend against using the standard PLS-PM criteria (R², Adjusted R², and Q²), and specifically the out-of-sample mean absolute percentage error (MAPE), for prediction-oriented model selection purposes. Finally, we illustrate the model selection criteria’s practical utility using a well-known corporate reputation model.
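To make the distinction between in-sample and out-of-sample criteria concrete, the following is a minimal sketch rather than the procedure used in this study. It assumes an ordinary least squares regression in place of a PLS-PM structural equation, and a common regression form of BIC, n·ln(SSE/n) + k·ln(n); the exact formulas used in the paper may differ. The function names `bic_ols` and `holdout_errors` and the simulated data are hypothetical and serve illustration only.

```python
import numpy as np


def bic_ols(y, X):
    """In-sample BIC of an OLS fit, using the form n*ln(SSE/n) + k*ln(n).

    Assumption: this is a common regression variant of BIC, not necessarily
    the exact expression used in the paper. X must already contain the
    intercept column; k counts all estimated coefficients.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(sse / n) + k * np.log(n)


def holdout_errors(y_true, y_pred):
    """Out-of-sample error metrics computed on a holdout sample."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAD": float(np.mean(np.abs(err))),
        # MAPE divides by the actual values, so it is undefined when an
        # actual value is zero and unstable when actual values are near zero.
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100),
    }


# Hypothetical simulated data: y depends on x1 only; x2 is irrelevant.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)

ones = np.ones(n)
candidates = {
    "x1 only (correct)": np.column_stack([ones, x1]),
    "x2 only (misspecified)": np.column_stack([ones, x2]),
    "x1 + x2 (overspecified)": np.column_stack([ones, x1, x2]),
}

# Lower BIC is better; no holdout sample is needed to compute it.
for name, X in candidates.items():
    print(f"{name}: BIC = {bic_ols(y, X):.2f}")
```

In this sketch the in-sample criterion is computed on all available observations, whereas `holdout_errors` would require predictions for observations withheld from estimation, which is the trade-off that matters when the overall sample is small.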
