Network linear discriminant analysis

Linear discriminant analysis (LDA) is one of the most widely used classification methods. With the rapid advance of information technology, network data are becoming increasingly available. A novel method called network linear discriminant analysis (NLDA) is proposed to address the classification problem for network data. The NLDA model takes both network information and predictive variables into account. Theoretically, the misclassification rate is studied and an upper bound is derived under mild conditions. Because real networks are often sparse in structure, the asymptotic performance of NLDA is also obtained under certain sparsity assumptions. A number of simulation studies are conducted to evaluate the finite-sample performance of the proposed methodology. Lastly, a real data analysis of Sina Weibo is presented for illustration.
