Regression Analysis in Small-n-Large-p Using Interactive Prior Elicitation of Pairwise Similarities

In this extended abstract we introduce a new method for eliciting experts' prior knowledge about the similarity of the roles of features in a prediction task. The key idea is to use an interactive multidimensional-scaling-type scatterplot display of the features to elicit the similarity relationships, and then to use the elicited relationships in the prior distribution of the prediction parameters. Specifically, when learning to predict a target variable with Bayesian linear regression, the feature relationships are used as a prior for the correlations of the regression coefficients. Simulation results, together with a preliminary study with real users on text data, confirm that prior elicitation of feature similarities improves prediction accuracy.
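As a rough illustration of how elicited similarities could enter the prior, the following minimal sketch places a zero-mean Gaussian prior on the regression coefficients whose covariance is built from feature positions in a 2-D display. It is not the method of this abstract: the Gaussian kernel on display distances, the names `similarity_kernel` and `posterior_mean`, and the parameters `length_scale` and `noise_var` are all assumptions made for the example. The posterior mean is computed in its dual form, which only requires solving an n-by-n system and therefore suits the small-n-large-p setting.

```python
import numpy as np


def similarity_kernel(positions, length_scale=1.0, jitter=1e-8):
    """Assumed mapping from display positions to a prior covariance.

    Features placed close together in the scatterplot receive strongly
    correlated coefficients (Gaussian kernel on pairwise distances).
    """
    diff = positions[:, None, :] - positions[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    K = np.exp(-0.5 * sq_dist / length_scale ** 2)
    # Small diagonal jitter keeps the matrix numerically positive definite.
    return K + jitter * np.eye(len(positions))


def posterior_mean(X, y, prior_cov, noise_var=1.0):
    """Posterior mean of w for y = X w + noise, with w ~ N(0, prior_cov).

    Dual form: prior_cov X^T (X prior_cov X^T + noise_var I)^{-1} y.
    """
    n = X.shape[0]
    gram = X @ prior_cov @ X.T + noise_var * np.eye(n)
    return prior_cov @ X.T @ np.linalg.solve(gram, y)


# Hypothetical usage: 5 observations, 20 features, random 2-D positions
# standing in for the positions an expert would set in the display.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))
y = rng.normal(size=5)
positions = rng.normal(size=(20, 2))
w_hat = posterior_mean(X, y, similarity_kernel(positions))
```

With an identity prior covariance the same function reduces to ordinary ridge regression, so the similarity-based covariance is the only place where the expert's input changes the estimate.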