Framework of integrated big data: A review

Distilling the latent attributes of big data in depth with a single unified model remains a major challenge, because the data may be structured, semi-structured, or unstructured (SSU data). Structured data resides in fixed fields within a record or file, as in relational databases and spreadsheets. Unstructured data comes from text, pictures, audio, video, and other sources that do not fit into a relational database. Semi-structured data does not reside in a relational database but has organizational properties that make it easier to analyze, as in XML and HTML documents. In this paper, we present a literature survey and a framework, integrated big data (IBD), that explores approaches for constructing a universal IBD model covering representation, storage and management, computation, and visual analysis. First, we present a systematic framework that decomposes big data analytics into these four modules. Next, we survey in detail the approaches available for each module. The paper makes two main contributions. First, we propose a novel integrated big data framework for unified big data representation, storage, computation, and visual analysis. Second, by reviewing existing methods, we identify possible future methods for realizing the framework. Through this paper, we aim to point out a promising research direction in the unified investigation and application of big data.
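The distinction among the three kinds of SSU data can be made concrete with a minimal Python sketch. The sample values and the `wrap` helper are illustrative assumptions only; the envelope it builds is a hypothetical placeholder and not the IBD model proposed in the paper.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: every record has the same fixed fields (here, one CSV row).
structured_src = "id,name,age\n1,Ada,36\n"
structured = next(csv.DictReader(io.StringIO(structured_src)))

# Semi-structured: no relational schema, but tags give it some organization.
semi_src = "<patient id='1'><name>Ada</name><note>stable</note></patient>"
root = ET.fromstring(semi_src)
semi = {"id": root.get("id"), **{child.tag: child.text for child in root}}

# Unstructured: free text with no fields at all.
unstructured = "Patient reports mild headache after exercise."


def wrap(kind, payload):
    """Hypothetical common envelope (an assumption, not the paper's model)."""
    return {"kind": kind, "payload": payload}


records = [
    wrap("structured", structured),
    wrap("semi-structured", semi),
    wrap("unstructured", unstructured),
]
```

The envelope only tags each payload with its kind. A unified model in the sense of the abstract would also have to cover storage and management, computation, and visual analysis, which this sketch does not attempt.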
