NMF-based sparse Cholesky decomposition technique for dimensionality reduction in large data sets

This paper addresses the problems that high data dimensionality causes in data mining. Sparse Cholesky decomposition (SCD) is combined with non-negative matrix factorization (NMF) to reduce the dimensionality of the data. The increased dimensionality is attributed mainly to non-orthogonality in the datasets. Complex conjugate values are used to eliminate the sparse matrix, and a complex gradient algorithm reduces the sparse matrix by extracting the conjugate values. SCD-NMF extracts the feature vector, and the upper triangular matrix linearly maps the feature vector obtained from the SCD. NMF is thus employed together with SCD to structure the datasets, which helps to form a well-defined data geometry. The proposed system is evaluated in terms of normalized mutual information and accuracy on several text datasets. The results indicate that SCD-NMF performs better than conventional methods in finding the instances related to a given query.
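The general idea of pairing NMF with a Cholesky factor can be illustrated with a minimal sketch. This is not the authors' SCD-NMF algorithm: the abstract does not specify its steps, so the sketch assumes standard Lee-Seung multiplicative updates for NMF, a dense Cholesky factorization of the Gram matrix of the NMF features (rather than a sparse one), and a toy document-term matrix. The function names `nmf` and `cholesky_upper`, the `ridge` parameter, and the data are all hypothetical.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate X by W @ H with W, H >= 0 (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so signs are preserved.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cholesky_upper(W, ridge=1e-8):
    """Upper-triangular R with R.T @ R = W.T @ W + ridge * I."""
    G = W.T @ W + ridge * np.eye(W.shape[1])  # small ridge keeps G positive definite
    return np.linalg.cholesky(G).T            # NumPy returns the lower factor

# Toy document-term matrix (assumed data): two groups of documents.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [4, 3, 0, 1],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
    [1, 0, 2, 3],
], dtype=float)

W, H = nmf(X, rank=2)          # W: low-dimensional document features
R = cholesky_upper(W)          # upper-triangular factor of the feature Gram matrix
Z = np.linalg.solve(R.T, W.T).T  # Z = W @ inv(R): linearly mapped features

print("Gram matrix of mapped features:\n", Z.T @ Z)
print("cluster assignment per document:", W.argmax(axis=1))
```

In this sketch the upper triangular factor acts as a linear map that makes the columns of the reduced features orthonormal, which is one plausible reading of how a Cholesky factor could counter non-orthogonality. How the paper's sparse variant and its gradient step differ from this is not stated in the abstract.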
