Understanding Performance of Protein Structural Classifiers

Many bioinformatics applications use machine learning techniques to build models that predict which parts of a protein will bind to a target. Understanding the results of these protein surface binding classifiers is challenging: the individual predictions are embedded spatially on the surfaces of the molecules, yet performance must be understood across an entire corpus of molecules. In this project, we introduce a multi-scale approach for assessing the performance of these structural classifiers, providing coordinated views of both corpus-level overviews and spatially embedded results on the three-dimensional structures of proteins.
