Paper Information - Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment

Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment

We introduce an unsupervised process to co-cluster groups of stocks and financial ratios, so that investors can gain more insight into how they are correlated. Our idea for the co-clustering is based on a graph concept called maximal quasi-bicliques, which can tolerate the erroneous and/or missing information that is common in stock and financial ratio data. Compared to previous works, our maximal quasi-bicliques require the errors to be evenly distributed, which enables us to capture more meaningful co-clusters. We develop a new algorithm that can efficiently enumerate maximal quasi-bicliques from an undirected graph. The concept of maximal quasi-bicliques is domain-independent; it can be extended to perform co-clustering on any data set that can be modeled as a graph.
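To make the "evenly distributed errors" requirement concrete, here is a minimal Python sketch. The abstract does not state the formal definition, so the sketch assumes one reading of it: a pair of vertex sets qualifies when every vertex on one side is missing edges to at most `eps` vertices on the other side. The names `build_adjacency`, `is_quasi_biclique`, and `eps`, along with the toy stock/ratio labels, are hypothetical. The sketch only checks the tolerance condition; it does not test maximality and is not the paper's enumeration algorithm.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_quasi_biclique(adj, left, right, eps):
    """Assumed condition: each vertex misses at most `eps` vertices
    on the opposite side (a per-vertex bound, not a global one)."""
    left, right = set(left), set(right)
    for u in left:
        if len(right - adj.get(u, set())) > eps:
            return False
    for v in right:
        if len(left - adj.get(v, set())) > eps:
            return False
    return True


stocks = ["s1", "s2", "s3"]
ratios = ["r1", "r2", "r3"]

# Three missing edges, spread out: every vertex misses exactly one partner.
balanced = build_adjacency([
    ("s1", "r1"), ("s1", "r2"),
    ("s2", "r2"), ("s2", "r3"),
    ("s3", "r1"), ("s3", "r3"),
])

# Three missing edges, all concentrated on s1 (s1 has no edges at all).
skewed = build_adjacency([
    ("s2", "r1"), ("s2", "r2"), ("s2", "r3"),
    ("s3", "r1"), ("s3", "r2"), ("s3", "r3"),
])

balanced_ok = is_quasi_biclique(balanced, stocks, ratios, eps=1)
skewed_ok = is_quasi_biclique(skewed, stocks, ratios, eps=1)
print(balanced_ok, skewed_ok)
```

Both toy graphs lack the same number of edges, but only the balanced one passes with `eps=1`. Under this assumed definition, a per-vertex bound rejects groups in which a single stock or ratio is barely connected to the rest, which a bound on the total number of missing edges would accept.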
