Reverse engineering web configurators

A Web configurator offers a highly interactive environment that assists users in customising sales products through the selection of configuration options. Our previous empirical study revealed that a significant number of configurators are suboptimal in reliability, efficiency, and maintainability, opening avenues for re-engineering support and methodologies. This paper presents a tool-supported reverse-engineering process to semi-automatically extract configuration-specific data from a legacy Web configurator. The extracted and structured data is stored in formal models (e.g., variability models) and can be used in a forward-engineering process to generate a customised interface backed by a reliable reasoning engine. Two major components are presented: (1) a Web Wrapper that extracts structured configuration-specific data from the unstructured or semi-structured Web pages of a configurator, and (2) a Web Crawler that explores the “configuration space” (i.e., all objects representing configuration-specific data) and simulates users' configuration actions. We describe variability data extraction patterns, which are applied on top of the Wrapper and the Crawler to extract configuration data. Experimental results on five existing Web configurators show that specifying a few variability patterns enables the identification of hundreds of options.
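To give a concrete sense of what a wrapper does, the following is a minimal sketch and not the tool described in the paper. It assumes, purely for illustration, that each option group is rendered as a `<select>` element whose `name` attribute identifies the group and whose `<option>` children are the selectable values. The names `OptionWrapper`, `extract_options`, and `SAMPLE_PAGE` are hypothetical.

```python
from html.parser import HTMLParser


class OptionWrapper(HTMLParser):
    """Collects (group, option) pairs from <select>/<option> widgets.

    Assumption: one hard-coded "pattern" (select -> option). A real
    configurator would need patterns specified per site.
    """

    def __init__(self):
        super().__init__()
        self.group = None        # name of the <select> currently open, if any
        self.in_option = False   # True while inside an <option> element
        self.options = []        # extracted (group, option text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "select":
            self.group = dict(attrs).get("name", "unnamed")
        elif tag == "option" and self.group is not None:
            self.in_option = True

    def handle_endtag(self, tag):
        if tag == "select":
            self.group = None
        elif tag == "option":
            self.in_option = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_option and text:
            self.options.append((self.group, text))


def extract_options(html):
    """Return the (group, option) pairs found in an HTML string."""
    parser = OptionWrapper()
    parser.feed(html)
    parser.close()
    return parser.options


# Hypothetical page fragment standing in for a configurator page.
SAMPLE_PAGE = """
<form>
  <select name="colour">
    <option>Red</option>
    <option>Blue</option>
  </select>
  <select name="engine">
    <option>Petrol</option>
    <option>Diesel</option>
  </select>
</form>
"""

if __name__ == "__main__":
    for group, option in extract_options(SAMPLE_PAGE):
        print(f"{group}: {option}")
```

The sketch only handles static markup. Options that appear after user actions would require the kind of exploration the abstract attributes to the Crawler.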
