Learning Signaling Network Structures with Sparsely Distributed Data

Flow cytometric measurement of signaling protein abundances has proved particularly useful for elucidating signaling pathway structure. Because the data are collected from single cells, each experiment yields a very large number of observations, giving structure learning a statistically robust dataset. Moreover, the approach scales easily to many conditions in high throughput. However, the technology suffers from a dimensionality constraint: at the cutting edge, only about 12 protein species can be measured per cell, far too few for most signaling pathways. Because the structure learning algorithm (in practice) requires that all variables be measured simultaneously, structure learning is restricted to the number of variables that the flow cytometer can measure at once. To address this problem, we present an algorithm that enables structure learning from sparsely distributed data, extending structure learning beyond the technology's upper limit on simultaneously measurable variables. The algorithm assesses pairwise (or n-wise) dependencies, constructs a "Markov neighborhood" for each variable based on these dependencies, measures each variable in the context of its neighborhood, and performs structure learning using a constrained search.
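The neighborhood-construction and search-constraint steps can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary of pairwise dependence scores, the threshold, and the rule of keeping the strongest `max_dim - 1` partners per variable are all assumptions made for illustration. The abstract does not say how dependencies are scored or how neighborhoods are chosen, so the sketch only shows how a per-cell measurement limit could bound each neighborhood and how the neighborhoods could then restrict the candidate parent sets considered during search.

```python
from itertools import combinations


def markov_neighborhoods(dep, max_dim, threshold=0.0):
    """Build a neighborhood for each variable (hypothetical helper).

    dep:       dict mapping frozenset({a, b}) -> pairwise dependence score
               (assumed to come from pairwise measurements).
    max_dim:   assumed upper limit on simultaneously measurable variables,
               so each neighborhood holds at most max_dim - 1 partners.
    threshold: scores at or below this value are treated as independent.
    """
    variables = sorted({v for pair in dep for v in pair})
    hoods = {}
    for v in variables:
        scored = [
            (dep[frozenset((v, u))], u)
            for u in variables
            if u != v and frozenset((v, u)) in dep
        ]
        strong = [(s, u) for s, u in scored if s > threshold]
        # Strongest dependence first; ties broken alphabetically.
        strong.sort(key=lambda su: (-su[0], su[1]))
        hoods[v] = [u for _, u in strong[: max_dim - 1]]
    return hoods


def candidate_parent_sets(hoods, max_parents):
    """Restrict the search: parents of v may only come from v's neighborhood."""
    out = {}
    for v, nbrs in hoods.items():
        sets = []
        for k in range(min(max_parents, len(nbrs)) + 1):
            sets.extend(combinations(nbrs, k))
        out[v] = sets
    return out


# Toy dependence scores for four hypothetical proteins A-D.
dep = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.1,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "D")): 0.5,
    frozenset(("B", "D")): 0.05,
    frozenset(("C", "D")): 0.7,
}
hoods = markov_neighborhoods(dep, max_dim=3, threshold=0.3)
cands = candidate_parent_sets(hoods, max_parents=1)
```

In this toy setting each variable would be measured together with its two strongest partners, and a structure search would score only the parent sets listed in `cands` rather than every subset of the four variables.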
