Expediting protein structural analysis with an efficient kernel density estimation algorithm

We describe a kernel density estimation based mechanism for expediting protein structural analysis. The work is motivated by the observation that many protein structural analysis algorithms suffer from high time complexity, even though only the residues and atoms on the contour of a protein are essential for determining its function and how it interacts with other proteins. For some protein structural analysis problems, it is therefore desirable to extract the residues and atoms on the contour of the protein first, in order to expedite the analysis. The conventional approach to this task is the α-hull algorithm from computer graphics, which has O(n^2) time complexity, where n is the number of residues or atoms in the protein. We propose a kernel density estimation based expediting mechanism with an average time complexity of O(n log n). We also report an experiment that evaluates the effect of applying the proposed mechanism to a real protein structural analysis problem. The results show that a speedup of 4.8 to 10.3 times can be achieved with minimal impact on analysis accuracy.
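
The abstract does not specify the kernel, the bandwidth, or the rule used to decide which atoms lie on the contour, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a truncated Gaussian kernel, a uniform grid for neighbor lookup (a stand-in chosen for brevity, not taken from the paper), and a simple rule that keeps the lowest-density fraction of atoms as the contour. The names `local_densities`, `contour_indices`, `bandwidth`, and `keep_fraction` are hypothetical.

```python
import math
from collections import defaultdict


def local_densities(points, bandwidth):
    """Unnormalized Gaussian kernel density at each 3D point.

    Assumption: the kernel is truncated at 3 * bandwidth, so only
    nearby points contribute to each estimate.
    """
    cutoff = 3.0 * bandwidth
    # Hash points into cubic cells of side `cutoff`; every neighbor within
    # the cutoff then lies in the point's own cell or one of the 26 around it.
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(int(math.floor(c / cutoff)) for c in p)
        grid[cell].append(idx)

    densities = []
    for p in points:
        cx, cy, cz = (int(math.floor(c / cutoff)) for c in p)
        total = 0.0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                        d2 = sum((a - b) ** 2 for a, b in zip(p, points[j]))
                        if d2 <= cutoff * cutoff:
                            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        densities.append(total)
    return densities


def contour_indices(points, bandwidth, keep_fraction):
    """Indices of the lowest-density points, treated here as the contour.

    Assumption: atoms near the surface have fewer neighbors, hence a
    lower density estimate, than atoms buried in the interior.
    """
    dens = local_densities(points, bandwidth)
    order = sorted(range(len(points)), key=dens.__getitem__)  # O(n log n) sort
    k = max(1, int(round(keep_fraction * len(points))))
    return sorted(order[:k])
```

On a 5 x 5 x 5 cubic lattice of unit-spaced points, for example, calling `contour_indices(points, 1.0, 0.5)` keeps the corner points and discards the center point, since the center has the most neighbors within the kernel cutoff. A real protein would need a bandwidth and cut-off rule tuned to atomic spacing, which the abstract does not give.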
