IoT-Based Big Data Storage Systems in Cloud Computing: Perspectives and Challenges

Internet of Things (IoT) applications have emerged as an important field for both engineers and researchers, reflecting the magnitude and impact of the data-related problems that contemporary business organizations must solve, especially in cloud computing. This paper first presents a functional framework that identifies the acquisition, management, processing, and mining areas of IoT big data, and defines and describes several associated technical modules in terms of their key characteristics and capabilities. It then analyzes current research on IoT applications and identifies the challenges and opportunities associated with IoT big data research. We also report a study of critical IoT application publications and research topics, based on related academic and industry publications. Finally, some open issues and typical examples are given under the proposed IoT-related research framework.
