Maximum Likelihood Methods for Data Mining in Datasets Represented by Graphs

Due to the boom in complex network research, large graph datasets have appeared in various fields, from the social sciences (P. Holme et al., 2004) to computer science (C.R. Myers, 2003; M. Faloutsos et al., 1999; A-L. Barabasi and R. Albert, 1999) and biology (L. Negyessy et al., 2006). There is an increasing demand for data mining methods that allow scientists to make sense of the datasets they encounter. In this paper, we present two graph models and two maximum likelihood algorithms that fit these models to pre-defined data. We also show two example applications to illustrate that these algorithms can extract interesting and meaningful properties from data represented by appropriate graphs.

[1] R. Fisher, et al. On the Mathematical Foundations of Theoretical Statistics, 1922.

[2] Albert, et al. Emergence of scaling in random networks, 1999, Science.

[3] K. Schittkowski, et al. Nonlinear Programming, 2022.

[4] J. Aldrich. R.A. Fisher and the making of maximum likelihood 1912-1922, 1997.

[5] Dimitri P. Bertsekas, et al. Nonlinear Programming, 1997.

[6] Petter Holme, et al. Structure and time evolution of an Internet dating community, 2002, Soc. Networks.

[7] M. Newman, et al. Finding community structure in networks using the eigenvectors of matrices, 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[8] T. Nepusz, et al. Likelihood-based Clustering of Directed Graphs, 2007, 2007 International Symposium on Computational Intelligence and Intelligent Informatics.

[9] Michalis Faloutsos, et al. On power-law relationships of the Internet topology, 1999, SIGCOMM '99.

[10] László Kocsis, et al. Prediction of the main cortical areas and connections involved in the tactile function of the visual cortex by network analysis, 2006, The European journal of neuroscience.

[11] Christopher R. Myers, et al. Software systems as complex networks: structure, function, and evolvability of software collaboration graphs, 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.