A Filter-Wrapper based Feature Selection for Optimized Website Quality Prediction

A website quality model essentially consists of a set of criteria used to determine whether a website reaches a given level of quality. Quality attributes are essential for predicting the quality of a website. However, a large feature set may increase model complexity and the computational time needed for prediction. In this study, an initial set of 13 quality attributes is prepared, and a two-stage feature selection based on a hybrid filter-wrapper approach is used to optimize website quality prediction performance. In the first stage, each feature is ranked using the Information Gain (IG) method. In the second stage, the Firefly Search Algorithm (FA) is applied to the features ranked highest in the first stage to select a feature subset. To evaluate the effectiveness of the feature selection method, experiments are conducted using four baseline classification algorithms on a dataset built from 700 websites. The experimental results show that the proposed model achieves high classification performance. An average reduction of 36.53% in the number of features is observed, which leads to an average reduction of 0.015 seconds in the time needed to build a classifier and an average improvement of 5.420% in classification accuracy.
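The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the cut-off `top_k`, the leave-one-out 1-NN wrapper, the per-feature penalty, and the simplified binary firefly update (copy differing bits from a brighter firefly with probability beta, then flip bits at random with probability alpha) are all assumptions chosen to keep the example self-contained.

```python
import math
import random
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    # IG = H(labels) - sum over feature values v of p(v) * H(labels | v)
    groups = {}
    for v, lab in zip(column, labels):
        groups.setdefault(v, []).append(lab)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_by_ig(X, y):
    # Stage 1 (filter): feature indices sorted by decreasing information gain.
    gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: gains[j], reverse=True), gains

def loo_1nn_accuracy(X, y, features):
    # Assumed wrapper classifier: leave-one-out 1-NN with Hamming distance.
    correct = 0
    for i, row in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: sum(row[f] != X[j][f] for f in features))
        correct += y[nearest] == y[i]
    return correct / len(X)

def fitness(mask, candidates, X, y, penalty=0.01):
    # Brightness of a firefly: accuracy minus an assumed small cost per feature.
    chosen = [f for f, bit in zip(candidates, mask) if bit]
    if not chosen:
        return -1.0
    return loo_1nn_accuracy(X, y, chosen) - penalty * len(chosen)

def firefly_select(X, y, candidates, n_fireflies=6, n_iter=15,
                   beta0=1.0, gamma=1.0, alpha=0.1, seed=0):
    # Stage 2 (wrapper): simplified binary firefly search over feature masks.
    rng = random.Random(seed)
    d = len(candidates)
    swarm = [[1] * d] + [[rng.randint(0, 1) for _ in range(d)]
                         for _ in range(n_fireflies - 1)]
    light = [fitness(m, candidates, X, y) for m in swarm]
    top = max(range(n_fireflies), key=lambda i: light[i])
    best_mask, best_score = swarm[top][:], light[top]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move toward brighter fireflies
                r = sum(a != b for a, b in zip(swarm[i], swarm[j])) / d
                beta = beta0 * math.exp(-gamma * r * r)  # attractiveness
                for k in range(d):
                    if swarm[i][k] != swarm[j][k] and rng.random() < beta:
                        swarm[i][k] = swarm[j][k]
                    if rng.random() < alpha:
                        swarm[i][k] ^= 1  # random perturbation
                light[i] = fitness(swarm[i], candidates, X, y)
                if light[i] > best_score:
                    best_mask, best_score = swarm[i][:], light[i]
    return [f for f, bit in zip(candidates, best_mask) if bit], best_score

# Hypothetical toy data: 8 samples, 4 discrete features, binary quality label.
X = [[0, 0, 1, 0], [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 1],
     [1, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking, gains = rank_by_ig(X, y)
top_k = 3  # assumed cut-off; the abstract does not state how many features pass stage 1
selected, score = firefly_select(X, y, ranking[:top_k])
print("IG ranking:", ranking)
print("selected subset:", selected, "fitness:", score)
```

In this sketch the filter stage is cheap and classifier-independent, while the wrapper stage evaluates each candidate subset with a classifier, so restricting the firefly search to the top-ranked features keeps the number of classifier evaluations small.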
