A machine learning approach for detecting MAP kinase in the genome of Oryza sativa L. ssp. indica

Plant development and crop yield are strongly influenced by temperature. High temperature negatively affects several stages of plant development in rice, mainly booting and flowering. Identifying candidate genes associated with the high-temperature stress response may provide knowledge for improving heat tolerance in rice. Now that the rice genome has been sequenced, a major challenge is annotating proteins and decoding their functions. MAP kinase (MAPK) proteins are involved in signaling in response to various abiotic and biotic stresses, such as temperature stress, drought, wounding and pathogen infection. MAPKs have also been implicated in the cell cycle and in developmental processes. In this study, we attempt to develop a MAP kinase prediction tool for rice, MapPred. The computational approach was developed using the Sequential Minimal Optimization (SMO) algorithm in the Weka workbench, and a sensitivity of 100% was obtained using the dipeptide method. MapPred was also tested on three other plants, namely Arabidopsis, maize and tomato, to show that the tool is more accurate on rice than on other plants, which supports the higher prediction accuracy of species-specific tools. The prediction performance of MapPred was evaluated using cross-validation, an independent data test and leave-one-out validation. Our experimental results demonstrate that the proposed algorithm based on the dipeptide method can be very effective for computationally predicting MAPK proteins in Oryza sativa subsp. indica.
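The abstract does not spell out how the dipeptide features are computed. As a minimal sketch, assuming the commonly used definition of dipeptide composition (the frequency of each of the 400 ordered amino acid pairs among all adjacent pairs in a sequence), the feature vector and the sensitivity measure could be computed as follows. The function names `dipeptide_composition` and `sensitivity` are hypothetical and are not taken from MapPred; the resulting vectors would then be passed to a classifier such as Weka's SMO.

```python
from itertools import product

# The 20 standard amino acids; any other symbol (e.g. X, B, Z) is skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order, so every sequence maps to the same feature layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein sequence.

    Assumption: each frequency is the count of that adjacent pair divided by
    the number of adjacent pairs made up of standard amino acids only.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall) = TP / (TP + FN); returns 0.0 if there are no positives."""
    positives = true_positives + false_negatives
    return true_positives / positives if positives else 0.0


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")  # toy sequence, not a real MAPK
    print(len(features))
    print(features[DIPEPTIDES.index("AC")])
```

A fixed-length vector of this kind lets sequences of different lengths be compared by the same classifier, which is one reason composition-based features are often used for protein family prediction.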
