An R Package for Probabilistic Latent Feature Analysis of Two-Way Two-Mode Frequencies

A common strategy for the analysis of object-attribute associations is to derive a low-dimensional spatial representation of objects and attributes, in which a compensatory model (e.g., principal components analysis) explains the strength of object-attribute associations. As an alternative, probabilistic latent feature models assume that objects and attributes can be represented as a set of binary latent features and that the strength of object-attribute associations can be explained as a non-compensatory (e.g., disjunctive or conjunctive) mapping of latent features. In this paper, we describe the R package plfm, which comprises functions for conducting both classical and Bayesian probabilistic latent feature analysis with disjunctive or conjunctive mapping rules. Print and summary functions are included to summarize results on parameter estimation, model selection, and the goodness of fit of the models. As an example, the functions of plfm are used to analyze product-attribute data on the perception of car models, and situation-behavior associations on the situational determinants of anger-related behavior.
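To make the distinction between the two non-compensatory mapping rules concrete, the following minimal sketch computes an object-attribute association probability under each rule. It assumes the usual formulation of probabilistic latent feature models, in which an object has latent feature f with probability sigma and an attribute is linked to feature f with probability rho. The function names, parameter names, and example values are hypothetical illustrations and are not part of the plfm package interface.

```python
from math import prod

def disjunctive(sigma_j, rho_k):
    """Association probability when one shared latent feature suffices.

    sigma_j[f]: assumed probability that the object has feature f.
    rho_k[f]:   assumed probability that the attribute is linked to feature f.
    """
    # The association fails only if no feature is shared by object and attribute.
    return 1.0 - prod(1.0 - s * r for s, r in zip(sigma_j, rho_k))

def conjunctive(sigma_j, rho_k):
    """Association probability when the object needs every feature
    that is linked to the attribute."""
    # A feature blocks the association if the attribute requires it
    # and the object lacks it.
    return prod(1.0 - (1.0 - s) * r for s, r in zip(sigma_j, rho_k))

# Hypothetical parameters for one object and one attribute, two latent features.
sigma_j = [0.9, 0.2]
rho_k = [0.8, 0.5]

p_dis = disjunctive(sigma_j, rho_k)   # approximately 0.748
p_con = conjunctive(sigma_j, rho_k)   # approximately 0.552
print(round(p_dis, 3), round(p_con, 3))
```

Under the disjunctive rule a single strongly shared feature is enough to produce a high association probability, whereas under the conjunctive rule any required feature that the object is unlikely to have lowers it. In neither case can a high value on one feature offset a low value on another in the additive way a compensatory model allows.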
