Towards an MDE-based approach to test entity reconciliation applications

Managing large volumes of data poses significant challenges for entity reconciliation, the problem of combining data from different sources into a unified view, because the data are increasingly unstructured, unclean, and incomplete, and need to be more closely linked, among other factors. Testing the applications that implement entity reconciliation is crucial to ensure both the correctness of the reconciliation process and the quality of the reconciled data. In this paper, we present a first approach, based on model-driven engineering (MDE), that allows the creation of test models for the integration testing of entity reconciliation applications.
