Network Modeling and Pathway Inference from Incomplete Data ("PathInf")

We present PathInf, a method for inferring networks from incomplete data, motivated by the fact that massive, non-uniformly distributed missing values are a common challenge in practical problems. PathInf is a two-stage inference model. In the first stage, a maximum-likelihood data summarization model handles the missing values by transforming the observation-wise items in the data into a state matrix. In the second stage, the transition pattern (i.e., the pathway) among variables is inferred by casting it as a graph inference problem and solving it with a constrained greedy algorithm. We validated the method on simulated data, where it consistently outperformed a state-of-the-art Bayesian network method. Applying PathInf to lymph vascular metastasis data, we obtained holistic pathways of lymph node metastasis, including a novel finding: jumping metastasis between nodes that are physically apart. This finding suggests the possible presence of sentinel node groups among the lung lymph nodes, which have previously been speculated about but never found. The pathway map could also improve the current dissection examination protocol, supporting better individualized treatment planning, higher diagnostic accuracy, and reduced patient trauma.
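To make the two-stage idea concrete, the following is a minimal Python sketch of one way such a pipeline could look. It is not the PathInf implementation, and every specific in it is an assumption made for illustration: the record format (one dict per observation, with unexamined nodes simply absent), the score for an edge a -> b (the maximum-likelihood estimate of the probability that a is positive given that b is positive, computed only over records where both were observed), the two constraints (acyclicity and a cap on the number of parents), and the names `summarize`, `creates_cycle`, and `greedy_pathway`.

```python
from itertools import permutations


def summarize(records, nodes):
    """Stage 1 (assumed form): for each ordered pair (a, b), estimate
    P(a positive | b positive) by its maximum-likelihood count ratio,
    using only records in which both a and b were observed."""
    scores = {}
    for a, b in permutations(nodes, 2):
        joint = [r for r in records if a in r and b in r and r[b] == 1]
        if joint:  # pairs never observed together get no score
            scores[(a, b)] = sum(r[a] for r in joint) / len(joint)
    return scores


def creates_cycle(edges, src, dst):
    """Return True if dst already reaches src, i.e. adding src -> dst
    would close a directed cycle."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(v for (u, v) in edges if u == node)
    return False


def greedy_pathway(scores, max_parents=2, min_score=0.5):
    """Stage 2 (assumed form): take candidate edges in decreasing score
    order and keep each one that respects the in-degree cap and keeps
    the graph acyclic."""
    edges, parent_count = [], {}
    for (a, b), score in sorted(scores.items(), key=lambda kv: -kv[1]):
        if score < min_score:
            break  # remaining candidates score even lower
        if parent_count.get(b, 0) >= max_parents:
            continue
        if creates_cycle(edges, a, b):
            continue
        edges.append((a, b))
        parent_count[b] = parent_count.get(b, 0) + 1
    return edges


# Toy data: 1 = positive, 0 = negative, absent key = node not examined.
records = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1},
    {"A": 1, "B": 0, "C": 0},
    {"B": 1, "C": 1},
    {"A": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
]
scores = summarize(records, ["A", "B", "C"])
edges = greedy_pathway(scores)
print(edges)
```

Restricting each estimate to jointly observed records is what lets the first stage tolerate unevenly distributed missing values in this sketch: a record contributes to every pair it covers and is ignored for the rest. The greedy second stage trades optimality for speed, and the constraints are where domain knowledge would enter.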
