A generic finite automata based approach to implementing lymphocyte repertoire models

Artificial immune systems (AIS) inspired by lymphocyte repertoires include negative and positive selection, clonal selection, and B cell algorithms. Such AISs are used in computer science for machine learning and optimization, and in biology for modeling fundamental immunological processes. In both cases, the repertoire models required can be huge. Here, we show that when lymphocyte repertoire models based on string patterns can be compactly represented as finite automata (FA), the following tasks can be performed efficiently: negative selection, positive selection, insertion into the repertoire, deletion from it, uniform sampling from it, and counting its elements. Specifically, for r-contiguous pattern matching, all these tasks can be performed in polynomial time. Even in NP-hard cases such as Hamming distance matching, the FA representation can still yield practically important efficiency gains. We demonstrate the feasibility and flexibility of this approach by implementing T cell positive selection simulations based on human genomic data using four different pattern rules. Hence, FA-based repertoire models generalize previous efficient negative selection algorithms to several related algorithmic tasks, are easy to implement and customize, and are applicable to real-world bioinformatic problems.
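To make the idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes fixed-length strings and the usual r-contiguous rule, under which a detector matches a string if the two agree on at least r contiguous positions at the same offset. The repertoire of detectors surviving negative selection is represented implicitly as a layered automaton whose states are pairs (position, last r-1 characters). The names `Repertoire` and `self_windows` are hypothetical. Because the number of states in this naive construction grows with the alphabet size raised to the power r-1, it only illustrates counting, membership, and uniform sampling on small inputs.

```python
import random
from functools import lru_cache


def self_windows(self_strings, r):
    """For each start position i, collect the length-r windows occurring in self at i."""
    length = len(self_strings[0])
    return [{s[i:i + r] for s in self_strings} for i in range(length - r + 1)]


class Repertoire:
    """Detectors that match no self string under the r-contiguous rule (toy sketch)."""

    def __init__(self, alphabet, self_strings, r):
        self.alphabet = list(alphabet)
        self.length = len(self_strings[0])
        self.r = r
        self.forbidden = self_windows(self_strings, r)
        # Memoize the number of accepting paths from each automaton state.
        self._count = lru_cache(maxsize=None)(self._count_uncached)

    def _step(self, pos, suffix, c):
        """Follow the transition for character c; return the next suffix, or None if rejected."""
        window = suffix + c
        if len(window) == self.r and window in self.forbidden[pos - self.r + 1]:
            return None  # this window would match a self string
        return window[1:] if len(window) == self.r else window

    def _count_uncached(self, pos, suffix):
        if pos == self.length:
            return 1
        total = 0
        for c in self.alphabet:
            nxt = self._step(pos, suffix, c)
            if nxt is not None:
                total += self._count(pos + 1, nxt)
        return total

    def size(self):
        """Number of detectors in the repertoire, without enumerating them."""
        return self._count(0, "")

    def contains(self, detector):
        """Check whether a candidate detector survives negative selection."""
        if len(detector) != self.length:
            return False
        suffix = ""
        for pos, c in enumerate(detector):
            suffix = self._step(pos, suffix, c)
            if suffix is None:
                return False
        return True

    def sample(self, rng=random):
        """Draw one detector uniformly at random from the repertoire."""
        if self.size() == 0:
            raise ValueError("the repertoire is empty")
        out, suffix = [], ""
        for pos in range(self.length):
            moves = [(c, self._step(pos, suffix, c)) for c in self.alphabet]
            moves = [(c, n) for c, n in moves if n is not None]
            # Weight each move by the number of detectors reachable through it.
            weights = [self._count(pos + 1, n) for _, n in moves]
            c, suffix = rng.choices(moves, weights=weights)[0]
            out.append(c)
        return "".join(out)
```

For example, `Repertoire("ab", ["aab"], 2).size()` evaluates to 5: of the eight binary strings of length 3, the three strings `aaa`, `aab`, and `bab` share a length-2 window with `aab` at the same offset and are removed. Weighting each transition by the number of accepting continuations is what makes the sampling uniform over the repertoire.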
