Prediction of the coding sequences of unidentified human genes. VII. The complete sequences of 100 new cDNA clones from brain which can code for large proteins in vitro.

As part of our ongoing project to sequence human cDNA clones that correspond to relatively long transcripts, we determined the complete sequences of 100 new cDNA clones selected for their potential to code for large proteins in vitro. The cDNA libraries used were fractions of size-fractionated human brain cDNA libraries with average insert sizes from 5.3 to 7.0 kb. Randomly sampled clones were single-pass sequenced from both ends to select those not registered in the public databases. Their protein-coding potential was then examined with an in vitro transcription/translation system, and the clones that generated proteins larger than 60 kDa were sequenced in full. Each clone gave a distinct open reading frame (ORF), and the length of the ORF agreed roughly with the molecular mass of the in vitro product as estimated from its mobility on SDS-polyacrylamide gel electrophoresis. The average size of the sequenced cDNA clones was 6.1 kb, and the average ORF corresponded to 1200 amino acid residues. By computer-assisted analysis of the sequences against DNA and protein-motif databases (GenBank and PROSITE), the functions of at least 73% of the gene products could be anticipated, and 88% of these (the products of 64 clones) were assigned to the functional categories of cell signaling/communication, nucleic acid management, and cell structure/motility. The expression profiles in a variety of tissues and the chromosomal locations of the sequenced clones were also determined. According to the expression spectra, approximately 11 genes appeared to be expressed predominantly in brain. Most of the remaining genes fell into one of two classes: expression in a limited number of tissues (31 genes) or ubiquitous expression in all but a few tissues (47 genes).
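The relationship between ORF length and the molecular mass of the in vitro product can be illustrated with a rough calculation. The sketch below is not the authors' method: it assumes a rule-of-thumb average of about 110 Da per amino acid residue, and the function names are hypothetical. Under that assumption, the average ORF of 1200 residues corresponds to a product of roughly 132 kDa, well above the 60 kDa screening threshold.

```python
# Assumption: ~110 Da per residue is a common rule-of-thumb average,
# not a value taken from the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(orf_residues: int) -> float:
    """Estimate protein mass in kDa from the number of residues in an ORF."""
    return orf_residues * AVG_RESIDUE_MASS_DA / 1000.0


def passes_size_screen(orf_residues: int, threshold_kda: float = 60.0) -> bool:
    """Return True if the estimated product is larger than the threshold.

    The 60 kDa default mirrors the cutoff stated in the abstract.
    """
    return approx_mass_kda(orf_residues) > threshold_kda


if __name__ == "__main__":
    # Average ORF length reported in the abstract.
    print(approx_mass_kda(1200))
    print(passes_size_screen(1200))
```

An estimate of this kind is only approximate, since real residue masses vary with composition and gel mobility is itself an imprecise measure of mass.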
