Recovering probabilities for nucleotide trimming processes for T cell receptor TRA and TRG V-J junctions analyzed with IMGT tools

Background
Nucleotides are trimmed from the ends of variable (V), diversity (D) and joining (J) genes during immunoglobulin (IG) and T cell receptor (TR) rearrangements in B cells and T cells of the immune system. This trimming is followed by the random addition of nucleotides, forming the N regions (N for nucleotides) of the V-J and V-D-J junctions. These processes are crucial for creating diversity in the immune response, since the number of trimmed nucleotides and the number of added nucleotides vary in each B or T cell. The IMGT® sequence analysis tools, IMGT/V-QUEST and IMGT/JunctionAnalysis, provide a detailed and accurate analysis of the final observed junction nucleotide sequences (the tool "output"). However, as trimmed nucleotides can be replaced by identical N region nucleotides during the process, the observed "output" is a biased estimate of the "true trimming process."

Results
A probabilistic approach based on an analysis of the standardized tool "output" is proposed to infer the probability distribution of the "true trimming process" and to provide plausible biological hypotheses explaining this process. We collated a benchmark dataset of TR alpha (TRA) and TR gamma (TRG) V-J rearranged sequences and junctions analysed with IMGT/V-QUEST and IMGT/JunctionAnalysis, the nucleotide sequence analysis tools from IMGT®, the international ImMunoGeneTics information system®, http://imgt.cines.fr. The standardized description of the tool output is based on the IMGT-ONTOLOGY axioms and concepts. We propose a simple first-order model that attempts to transform the observed "output" probability distribution into an estimate closer to the "true trimming process" probability distribution. We use this estimate to test the hypothesis that Poisson processes are involved in trimming. This hypothesis was not rejected at standard confidence levels for three of the four trimming processes: TRAV, TRAJ and TRGV.

Conclusion
By using trimming of rearranged TR genes as a benchmark, we show that a probabilistic approach, applied to IMGT® standardized tool "outputs", opens the way to plausible hypotheses on the events involved in the "true trimming process" and eventually to an exact quantification of trimming itself. As high-throughput standardized immunogenetics data accumulate, similar probabilistic approaches will improve understanding of processes so far characterized only by the "output" of standardized tools.
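
To make the Poisson test mentioned in the Results more concrete, the following is a minimal sketch of a chi-square goodness-of-fit check for a Poisson distribution fitted to trimming-length counts. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `poisson_chi2`, the binning, the choice of "bins minus 2" degrees of freedom and the example counts are all hypothetical, and the bias-correcting first-order model is not reproduced here.

```python
import math

# Upper 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def poisson_pmf(k, lam):
    """Probability of observing exactly k events under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_chi2(counts):
    """Chi-square goodness of fit of a Poisson law to trimming counts.

    counts[k] is the number of junctions with k trimmed nucleotides.
    Assumption: the list runs up to the largest observed value, and the
    Poisson tail beyond it is folded into the last bin.
    """
    n = sum(counts)
    # Maximum-likelihood estimate of the Poisson mean (the sample mean).
    lam = sum(k * c for k, c in enumerate(counts)) / n
    probs = [poisson_pmf(k, lam) for k in range(len(counts) - 1)]
    probs.append(1.0 - sum(probs))  # last bin absorbs the upper tail
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    # Assumed convention: bins - 1, minus one for the estimated mean.
    dof = len(counts) - 2
    return lam, stat, dof


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the paper.
    example_counts = [12, 25, 28, 19, 10, 6]
    lam, stat, dof = poisson_chi2(example_counts)
    rejected = stat > CHI2_CRIT_05[dof]
    print(f"lambda={lam:.3f} chi2={stat:.3f} dof={dof} rejected_at_5pct={rejected}")
```

In practice the input would be the corrected estimate of the trimming distribution rather than the raw tool "output", and bins with very small expected counts would usually be pooled before the statistic is computed.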
