Nuc-PLoc: a new web-server for predicting protein subnuclear localization by fusing PseAA composition and PsePSSM.

The life processes of a eukaryotic cell are guided by its nucleus. In addition to the genetic material, the cell nucleus contains many proteins located in its different compartments, called subnuclear locations. Information about their localization within the nucleus is indispensable for the in-depth study of systems biology because, in addition to helping determine their functions, it can provide insight into how, and in what kind of microenvironments, these subnuclear proteins interact with each other and with other molecules. Facing the deluge of protein sequences generated in the post-genomic age, we are challenged to develop an automated method for rapidly and effectively annotating the subnuclear locations of the numerous newly found nuclear protein sequences. In view of this, a new classifier, called Nuc-PLoc, has been developed that can be used to classify nuclear proteins into the following nine subnuclear locations: (1) chromatin, (2) heterochromatin, (3) nuclear envelope, (4) nuclear matrix, (5) nuclear pore complex, (6) nuclear speckle, (7) nucleolus, (8) nucleoplasm and (9) nuclear promyelocytic leukaemia (PML) body. Nuc-PLoc features an ensemble classifier formed by fusing the evolutionary information of a protein, represented as a pseudo position-specific scoring matrix (PsePSSM), with its pseudo-amino acid (PseAA) composition. The overall jackknife cross-validation accuracy obtained by Nuc-PLoc is significantly higher than that obtained by existing methods on the same benchmark data set using the same testing procedure. As a user-friendly web-server, Nuc-PLoc is freely accessible to the public at http://chou.med.harvard.edu/bioinf/Nuc-PLoc.
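
The jackknife (leave-one-out) test mentioned above can be illustrated with a minimal sketch: each protein is held out in turn, predicted from all the others, and the fraction of correct predictions is reported. The sketch makes several assumptions that are not taken from Nuc-PLoc itself: a plain nearest-neighbour rule stands in for the actual ensemble classifier, the two-dimensional feature vectors and their labels are hypothetical toy values and not real PseAA or PsePSSM descriptors, and the function names are illustrative only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbor_label(query, samples):
    """Return the label of the sample whose features are closest to `query`.

    Stand-in classifier for illustration; not the Nuc-PLoc ensemble.
    """
    return min(samples, key=lambda s: dist(query, s[0]))[1]


def jackknife_accuracy(dataset):
    """Leave-one-out accuracy: hold out each sample once, predict it from the rest."""
    correct = 0
    for i, (features, label) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]  # everything except sample i
        if nearest_neighbor_label(features, training) == label:
            correct += 1
    return correct / len(dataset)


# Hypothetical toy data: (feature vector, subnuclear location).
toy = [
    ((0.0, 0.1), "nucleolus"),
    ((0.1, 0.0), "nucleolus"),
    ((1.0, 0.9), "chromatin"),
    ((0.9, 1.0), "chromatin"),
    ((0.6, 0.6), "nucleolus"),  # lies closer to the chromatin samples
]

if __name__ == "__main__":
    print(f"jackknife accuracy: {jackknife_accuracy(toy):.2f}")
```

Because every sample is used for testing exactly once and no random split is involved, the jackknife yields a single reproducible accuracy for a given data set, which is what makes it suitable for comparing methods on the same benchmark.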