Automated Functional Dependency Detection Between Test Cases Using Doc2Vec and Clustering

Knowing about dependencies and similarities between test cases is beneficial for prioritizing them for cost-effective test execution. This holds especially true for the time-consuming, manual execution of integration test cases written in natural language. Test case dependencies are typically derived from requirements and design artifacts. However, such artifacts are not always available, and the derivation process can be very time-consuming. In this paper, we propose, apply, and evaluate a novel approach that derives test cases' similarities and functional dependencies directly from the test specification documents written in natural language, without requiring any other data source. Our approach uses an implementation of the Doc2Vec algorithm to detect text-semantic similarities between test cases and then groups them using two clustering algorithms, HDBSCAN and FCM. The correlation between test case text-semantic similarities and their functional dependencies is evaluated in the context of an on-board train control system from Bombardier Transportation AB in Sweden. For this system, the dependencies between the test cases were previously derived and are compared to the results of our approach. The results show that, of the two evaluated clustering algorithms, HDBSCAN performs better than FCM, and better than a dummy classifier. The classification results are of reasonable quality and especially useful from an industrial point of view. Finally, applying random undersampling to correct the imbalanced data distribution yields an F1 score of up to 75% with the HDBSCAN clustering algorithm.
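
To make the evaluation idea concrete, the following is a minimal sketch of how cluster assignments could be scored against known dependencies. It is not the authors' implementation, and the abstract does not describe the exact procedure. It assumes that two test cases are predicted to be dependent when they share a cluster, that the label -1 marks noise points (the convention used by HDBSCAN), and that random undersampling drops pairs from the majority class until both classes are equally large. The labels, the ground-truth pairs, and all function names are hypothetical.

```python
import random
from itertools import combinations


def pairs_from_clusters(labels):
    """Predict a dependency for every pair of test cases in the same cluster.

    Assumption: label -1 means "noise", so such test cases are never paired.
    """
    predicted = set()
    for (i, a), (j, b) in combinations(enumerate(labels), 2):
        if a == b and a != -1:
            predicted.add((i, j))
    return predicted


def f1_score(samples):
    """Compute F1 from a list of (truth, prediction) boolean tuples."""
    tp = sum(1 for t, p in samples if t and p)
    fp = sum(1 for t, p in samples if not t and p)
    fn = sum(1 for t, p in samples if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def undersample(samples, seed=0):
    """Randomly drop majority-class samples until both classes are balanced."""
    rng = random.Random(seed)
    positives = [s for s in samples if s[0]]
    negatives = [s for s in samples if not s[0]]
    if len(positives) <= len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    return minority + rng.sample(majority, len(minority))


# Hypothetical cluster labels for six test cases (index = test case id).
labels = [0, 0, 0, 1, 1, -1]
# Hypothetical ground truth: pairs of test cases known to be dependent.
true_dependent = {(0, 1), (3, 4), (2, 5)}

predicted = pairs_from_clusters(labels)
all_pairs = list(combinations(range(len(labels)), 2))
samples = [(pair in true_dependent, pair in predicted) for pair in all_pairs]

f1_all = f1_score(samples)
balanced = undersample(samples)
f1_balanced = f1_score(balanced)
```

Because dependent pairs are usually far rarer than independent ones, scores computed on all pairs and on the balanced subset can differ noticeably, which is why the sketch computes both.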
