In the effort to understand protein structure, many computational approaches have been proposed recently. Among them, support vector machine (SVM) methods have recently been applied and have performed well compared with other machine learning schemes. However, despite their high performance, SVM approaches suffer from a lack of understandability because the SVM is a black-box model. To overcome this limitation, this study attempts to combine the SVM with an association rule based classifier, which can present a meaningful explanation of the prediction. To perform this task, a new association rule based classifier (PCPAR) was devised, based on the existing classifier CPAR, to handle sequential data. PCPAR creates patterns by merging the generated rules and then classifies the sequential data based on pattern matching. The experimental results show the following: with sequential data, the PCPAR scheme shows better performance than the CPAR method with respect to accuracy and the number of generated patterns, whether applied alone or combined with SVM. The combined scheme SVM_PCPAR generates more compact patterns than the combined scheme of SVM with a decision tree, SVM_DT, with similar performance. These patterns are easily understandable and biologically meaningful.

Index Terms: support vector machine, association rule based classifier, decision tree, CPAR, PCPAR