Applying MapReduce principle to high level information fusion

The InSyTo Synthesis framework relies on graph structures, graph algorithms and similarity measures to fuse soft data while managing inconsistencies. It supports non-redundant additions to an information network, as well as graph-based information queries, in several applications. The graph fusion algorithm relies on searching for a maximum common subgraph isomorphism, a computationally hard problem that becomes especially costly on large graphs. In this work, the subgraph matching algorithm is partially parallelized using the MapReduce approach and the Hadoop framework. Using Hadoop makes it possible to handle big graphs, first by avoiding loading the graphs into memory and second by distributing the computations over several processing nodes. Our experiments on the Global Terrorism Database (which contains descriptions of more than 113,000 terrorist attacks in a graph of more than 20,000,000 nodes) show that InSyTo Synthesis now scales to so-called "big data" applications.
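To illustrate the general idea of splitting part of a subgraph matching task into map and reduce steps, the sketch below computes, for each node of a small query graph, the set of data-graph nodes that could match it. It is only an illustration under stated assumptions, not the algorithm used in InSyTo Synthesis: the node labels, the names `map_node`, `reduce_candidates` and `run_job`, and the use of exact label equality as the similarity test are all hypothetical, and the job is simulated in memory rather than run on Hadoop.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy data (not from the paper): node id -> concept label.
QUERY_NODES = {"q1": "Attack", "q2": "City"}
DATA_NODES = [("d1", "Attack"), ("d2", "City"), ("d3", "Person"), ("d4", "Attack")]


def map_node(record, query_nodes):
    """Map step: emit (query_node, data_node) for each label-compatible pair.

    Each data-graph node is handled independently, so the records can be
    streamed from disk and spread over several workers.
    """
    data_id, label = record
    for query_id, query_label in query_nodes.items():
        # Assumption: exact label equality stands in for a similarity measure.
        if label == query_label:
            yield query_id, data_id


def reduce_candidates(query_id, data_ids):
    """Reduce step: collect the candidate data nodes for one query node."""
    return query_id, sorted(data_ids)


def run_job(records, query_nodes):
    """Simulate map, shuffle and reduce in a single process."""
    grouped = defaultdict(list)
    mapped = chain.from_iterable(map_node(r, query_nodes) for r in records)
    for key, value in mapped:  # shuffle: group mapper output by key
        grouped[key].append(value)
    return dict(reduce_candidates(k, v) for k, v in grouped.items())


candidates = run_job(DATA_NODES, QUERY_NODES)
print(candidates)
```

The candidate sets produced this way would still have to be combined by a sequential search that checks edges, which is one way to read "partially parallelized" in the abstract.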
