Chromatic kernel and its applications

In this paper, we study the following Chromatic Kernel (CK) problem: given an $n$-partite graph $G=(V,E)$ (called a chromatic correlation graph) with $V=V_{1}\cup \cdots \cup V_{n}$, where each partite set $V_{i}$ contains a constant number $\lambda$ of vertices, compute a subgraph $G[V_{CK}]$ of $G$ that has exactly one vertex from each partite set and, among all such subgraphs, the maximum number of edges (or the maximum total edge weight if $G$ is edge-weighted). CK is a new problem motivated by several applications, and no existing algorithm directly solves it. We first show that CK is NP-hard even if $\lambda=2$ and cannot be approximated within a factor of $16/17$ unless P = NP. We then present a random-sampling-based PTAS for dense CK. As an application, we show that CK can be used to determine the pattern of chromosome associations in the nucleus for a population of cells. We test our approach on random and real biological data; the experimental results suggest that it yields near-optimal solutions and significantly outperforms existing approaches.
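
To make the problem statement concrete, the following is a minimal sketch of an exhaustive solver for tiny CK instances. It is not the PTAS from the paper: it simply enumerates all $\lambda^{n}$ ways of picking one vertex per partite set and keeps the selection with the largest total edge weight, so it is only usable for very small $n$. The function name `chromatic_kernel_bruteforce`, the input encoding (a list of partite sets plus a dictionary of edge weights), and the toy instance are assumptions made for illustration.

```python
from itertools import product


def chromatic_kernel_bruteforce(parts, weights):
    """Exhaustively solve a tiny Chromatic Kernel instance.

    parts   -- list of partite sets, each a list of vertex labels
    weights -- dict mapping frozenset({u, v}) to an edge weight
               (use weight 1 for every edge in the unweighted case)

    Returns (best_weight, best_choice), where best_choice holds exactly
    one vertex from each partite set. Runs in O(lambda^n * n^2) time.
    """
    best_weight, best_choice = float("-inf"), None
    # Every element of the Cartesian product picks one vertex per partite set.
    for choice in product(*parts):
        total = 0
        # Sum the weights of all edges induced by the chosen vertices.
        for i in range(len(choice)):
            for j in range(i + 1, len(choice)):
                total += weights.get(frozenset((choice[i], choice[j])), 0)
        if total > best_weight:
            best_weight, best_choice = total, choice
    return best_weight, best_choice


# Hypothetical toy instance: n = 3 partite sets, lambda = 2 vertices each.
parts = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
edges = [("a1", "b1"), ("a1", "c2"), ("b1", "c2"), ("a2", "b2"), ("a2", "c1")]
weights = {frozenset(e): 1 for e in edges}  # unweighted: every edge counts 1

best_weight, best_choice = chromatic_kernel_bruteforce(parts, weights)
print(best_weight, best_choice)
```

On this toy instance the selection `("a1", "b1", "c2")` induces a triangle, which is the largest number of edges any three chosen vertices can have. The exponential enumeration also illustrates why an approximation scheme is of interest once $n$ grows.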
