System-Wide Peripheral Biomarker Discovery Using Information Theory

Reliable peripheral biomarkers for clinical diagnosis, patient prognosis, and biological functional studies would give access to biological information that is currently available only through invasive methods. Traditional approaches have so far considered tissue markers and biofluid markers independently. Here we introduce an information-theoretic framework for biomarker discovery that integrates biofluid and tissue information, allowing us to identify tissue information in peripheral biofluids. We treat tissue-biofluid interactions as an information channel through functional space. Using 26 proteomes from 45 different sources, we quantify the correspondence of each biofluid to specific tissues by calculating the relative entropy of proteomes mapped onto phenotype, function, and drug space. Next, we identify candidate biofluids and biomarkers responsible for functional information transfer (p < 0.01). A total of 851 unique candidate biomarker proxies were identified. These biomarkers were significant functional proxies for tissue compared to random proteins (p < 0.001). The proxy link is further enhanced by filtering the biofluid proteins to include only significant tissue-biofluid information channels, and it is further validated by gene expression. Many of the candidate biomarkers are novel and have yet to be explored. In addition to characterizing proteins and their interactions from a systemic perspective, our work can serve as a roadmap to guide biomedical investigation, from suggesting biofluids for study to constraining the search for biomarkers. This work has applications in disease screening, diagnosis, and protein function studies.
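To make the relative entropy step concrete, the following is a minimal sketch and not the authors' pipeline. It assumes each proteome has already been mapped to a list of functional annotation terms, estimates a smoothed distribution over a shared vocabulary, and computes D(tissue || biofluid) in bits. The function names, the toy annotations, the pseudocount, and the direction of the divergence are all illustrative assumptions; the abstract does not specify them.

```python
import math
from collections import Counter


def term_distribution(annotations, vocabulary, pseudocount=1.0):
    """Estimate a smoothed probability distribution over annotation terms.

    `annotations` is a list of terms, one per protein-term assignment.
    The pseudocount (an assumption of this sketch) keeps every probability
    positive so the relative entropy below is always finite.
    """
    counts = Counter(annotations)
    total = sum(counts[t] for t in vocabulary) + pseudocount * len(vocabulary)
    return {t: (counts[t] + pseudocount) / total for t in vocabulary}


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits over a shared vocabulary."""
    return sum(p[t] * math.log2(p[t] / q[t]) for t in p if p[t] > 0)


# Hypothetical toy data: functional terms observed in one tissue and two biofluids.
vocabulary = ["transport", "signaling", "metabolism", "immune"]
tissue = ["metabolism", "metabolism", "signaling", "transport"]
fluid_a = ["metabolism", "metabolism", "signaling", "immune"]
fluid_b = ["immune", "immune", "immune", "transport"]

p_tissue = term_distribution(tissue, vocabulary)
p_fluid_a = term_distribution(fluid_a, vocabulary)
p_fluid_b = term_distribution(fluid_b, vocabulary)

# Lower divergence means the biofluid's functional profile is closer to the tissue's.
d_a = relative_entropy(p_tissue, p_fluid_a)
d_b = relative_entropy(p_tissue, p_fluid_b)
print(f"D(tissue || fluid_a) = {d_a:.3f} bits")
print(f"D(tissue || fluid_b) = {d_b:.3f} bits")
```

In this toy setting the biofluid with the smaller divergence would be the better functional proxy for the tissue. Assessing significance (for example, against randomly sampled protein sets) would be a separate step that this sketch does not cover.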
