Feature model extraction from large collections of informal product descriptions

Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia.

[1]  Jan Bosch,et al.  A taxonomy of variability realization techniques , 2005, Softw. Pract. Exp..

[2]  Inderjit S. Dhillon,et al.  Concept Decompositions for Large Sparse Text Data Using Clustering , 2004, Machine Learning.

[3]  Alexander Egyed,et al.  On Extracting Feature Models from Sets of Valid Feature Combinations , 2013, FASE.

[4]  Nan Niu,et al.  Concept analysis for product line requirements , 2009, AOSD '09.

[5]  Mathieu Acher,et al.  On extracting feature models from product descriptions , 2012, VaMoS.

[6]  Krzysztof Czarnecki,et al.  A survey of variability modeling in industrial practice , 2013, VaMoS.

[7]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[8]  Christoph Pohl,et al.  An Exploratory Study of Information Retrieval Techniques in Domain Analysis , 2008, 2008 12th International Software Product Line Conference.

[9]  Krzysztof Czarnecki,et al.  Reverse engineering feature models , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[10]  Jane Cleland-Huang,et al.  On-demand feature recommendations derived from mining public product descriptions , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[11]  Krzysztof Czarnecki,et al.  Cool features and tough decisions: a comparison of variability modeling approaches , 2012, VaMoS.

[12]  Thomas Leich,et al.  FeatureIDE: An extensible framework for feature-oriented software development , 2014, Sci. Comput. Program..

[13]  Robert E. Tarjan,et al.  Finding optimum branchings , 1977, Networks.

[14]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[15]  Sergio Segura,et al.  Automated analysis of feature models 20 years later: A literature review , 2010, Inf. Syst..

[16]  Ruzanna Chitchyan,et al.  A framework for constructing semantically composable feature models from natural language requirements , 2009, SPLC.

[17]  Krzysztof Czarnecki,et al.  Feature Diagrams and Logics: There and Back Again , 2007 .

[18]  Wynne Hsu,et al.  Mining association rules with multiple minimum supports , 1999, KDD '99.

[19]  Paul Grünbacher,et al.  The DOPLER meta-tool for decision-oriented variability modeling: a multiple case study , 2011, Automated Software Engineering.

[20]  Mathieu Acher,et al.  FAMILIAR: A domain-specific language for large scale management of feature models , 2013, Sci. Comput. Program..

[21]  Klaus Pohl,et al.  Software Product Line Engineering - Foundations, Principles, and Techniques , 2005 .

[22]  Paulo Borba,et al.  A theory of software product line refinement , 2010, Theor. Comput. Sci..

[23]  Paul Clements,et al.  Software product lines - practices and patterns , 2001, SEI series in software engineering.

[24]  David M. Weiss,et al.  Software Product-Line Engineering: A Family-Based Software Development Process , 1999 .

[25]  Sven Apel,et al.  An Overview of Feature-Oriented Software Development , 2009, J. Object Technol..

[26]  Andreas Classen,et al.  A text-based approach to feature modelling: Syntax and semantics of TVL , 2011, Sci. Comput. Program..

[27]  Krzysztof Czarnecki,et al.  Sample Spaces and Feature Models: There and Back Again , 2008, 2008 12th International Software Product Line Conference.

[28]  Sven Apel,et al.  Feature cohesion in software product lines: an exploratory study , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[29]  Robert L. Nord,et al.  Software Product Lines , 2004, Lecture Notes in Computer Science.

[30]  Robert E. Tarjan,et al.  Depth-First Search and Linear Graph Algorithms , 1972, SIAM J. Comput..

[31]  Krzysztof Czarnecki,et al.  Efficient synthesis of feature models , 2012, SPLC '12.

[32]  Klaus Kabitzsch,et al.  Extraction of feature models from formal contexts , 2011, SPLC '11.

[33]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[34]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[35]  Martin Verlage,et al.  The Economic Impact of Product Line Adoption and Evolution , 2002, IEEE Softw..

[36]  Pierre-Yves Schobbens,et al.  What ' s in a Feature ? A Requirements Engineering Perspective , 2008 .

[37]  Mathieu Acher,et al.  Support for reverse engineering and maintaining feature models , 2013, VaMoS.

[38]  Haiyan Zhao,et al.  An approach to constructing feature models based on requirements clustering , 2005, 13th IEEE International Conference on Requirements Engineering (RE'05).