Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Conjoint analysis is a popular experimental design used to measure multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, influences decision-making. Two methodological approaches currently exist for analyzing data from a conjoint experiment. The first estimates the average marginal effect of each factor while averaging over the other factors. Although this allows for straightforward design-based estimation, the results depend critically on the distribution of the other factors and on how interaction effects are aggregated. The alternative, model-based approach can compute various quantities of interest, but requires researchers to specify the model correctly, a challenging task for conjoint analysis with many factors and possible interactions. Moreover, the commonly used logistic regression has poor statistical properties once interactions are included, even with a moderate number of factors. We propose a new hypothesis testing approach based on the conditional randomization test to answer the most fundamental question of conjoint analysis: Does a factor of interest matter in any way given the other factors? Our methodology is based solely on the randomization of factors, and hence is free from modeling assumptions. Yet it allows researchers to use any test statistic, including those based on complex machine learning algorithms. As a result, we are able to combine the strengths of the existing design-based and model-based approaches. We illustrate the proposed methodology through conjoint analyses of immigration preferences and political candidate evaluation. We also extend the proposed approach to test the regularity assumptions commonly used in conjoint analysis.
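To make the logic of a randomization-based test concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simplified single-profile design in which the factor of interest `x` is drawn uniformly over its levels and independently of one other factor `z`. The function names (`crt_pvalue`, `interaction_aware_stat`), the simulated data, and the particular test statistic are all hypothetical choices made for illustration. The statistic can be swapped for any function of the data, including one built on a fitted machine learning model.

```python
import numpy as np

def crt_pvalue(x, z, y, statistic, n_levels, n_draws=500, seed=0):
    """Randomization p-value for 'x matters in some way given z'.

    Assumes (for this sketch) that x was randomized uniformly over
    {0, ..., n_levels - 1}, independently of z.
    """
    rng = np.random.default_rng(seed)
    t_obs = statistic(x, z, y)
    t_null = np.empty(n_draws)
    for b in range(n_draws):
        # Redraw x from its known randomization distribution; under the
        # null hypothesis the outcome y does not change when x changes.
        x_new = rng.integers(0, n_levels, size=x.shape[0])
        t_null[b] = statistic(x_new, z, y)
    # Adding 1 to numerator and denominator keeps the test valid in finite samples.
    return (1 + np.sum(t_null >= t_obs)) / (n_draws + 1)

def interaction_aware_stat(x, z, y):
    """Illustrative statistic: spread of mean outcomes across levels of x,
    computed within each level of z so that interactions are not averaged away."""
    total = 0.0
    for zv in np.unique(z):
        in_cell = z == zv
        means = [y[in_cell & (x == xv)].mean() for xv in np.unique(x[in_cell])]
        total += in_cell.sum() * np.var(means)
    return total

# Simulated data (hypothetical): x has opposite effects at z == 0 and z == 1,
# so its effect averaged over z is close to zero even though x clearly matters.
rng = np.random.default_rng(1)
n = 2000
x = rng.integers(0, 2, size=n)
z = rng.integers(0, 3, size=n)
prob = 0.5 + 0.2 * ((x == 1) & (z == 0)) - 0.2 * ((x == 1) & (z == 1))
y = rng.binomial(1, prob)

p_value = crt_pvalue(x, z, y, interaction_aware_stat, n_levels=2)
print(f"randomization p-value: {p_value:.3f}")
```

In this simulation the factor has opposite effects in two strata, so a comparison averaged over `z` would show little, while a statistic that looks within strata can detect it. The validity of the p-value comes only from redrawing `x` from its randomization distribution, not from the choice of statistic.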
