Big Data Computing and Communications

In this paper, we address similarity search among data supply chains, which plays an essential role in optimizing a chain and extending its value. The problem is challenging for application-oriented data supply chains because their high complexity makes similarity computation both complicated and inefficient. We propose a feature space representation model based on key points, which extracts the key features from sub-sequences of the original data supply chain and reduces the chain to a feature vector. We then formulate the similarity computation of key points based on multi-scale features. Further, we propose an improved hierarchical clustering algorithm for similarity search over data supply chains. The main idea is to separate sub-sequences into disjoint groups such that each group meets one specific clustering criterion; the cluster containing the query object is then the similarity search result. The experimental results show that the proposed approach is both effective and efficient for data supply chain retrieval.
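
The pipeline described above (key-point features, then clustering, then returning the query's cluster) can be illustrated with a minimal Python sketch. It is not the paper's method. It assumes that a "key point" is an endpoint or a strict local extremum, that four hand-picked summary features stand in for the multi-scale features, and that plain single-linkage agglomerative clustering with a distance threshold stands in for the improved hierarchical algorithm. All function names, the feature choice, and the threshold are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def key_points(seq):
    """Indices of the endpoints and strict local extrema of seq.

    Assumed definition of 'key point'; seq must have at least 2 values.
    """
    idx = [0]
    for i in range(1, len(seq) - 1):
        left, right = seq[i] - seq[i - 1], seq[i + 1] - seq[i]
        if left * right < 0:  # slope changes sign: local max or min
            idx.append(i)
    idx.append(len(seq) - 1)
    return idx


def feature_vector(seq):
    """Reduce a sub-sequence to a small feature vector (illustrative features)."""
    kp = key_points(seq)
    values = [seq[i] for i in kp]
    slopes = [abs(seq[b] - seq[a]) / (b - a) for a, b in zip(kp, kp[1:])]
    return (
        len(kp) / len(seq),          # density of key points
        sum(values) / len(values),   # mean key-point value
        max(values) - min(values),   # amplitude
        sum(slopes) / len(slopes),   # mean absolute slope between key points
    )


def hierarchical_clusters(vectors, threshold):
    """Single-linkage agglomerative clustering.

    Merging stops once the two closest clusters are farther apart than
    threshold (a hypothetical stand-in for the clustering criterion).
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        return min(dist(vectors[i], vectors[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[x].extend(clusters.pop(y))  # y > x, so index x stays valid
    return clusters


def similarity_search(subsequences, query_index, threshold):
    """Return indices of the sub-sequences that share a cluster with the query."""
    vectors = [feature_vector(s) for s in subsequences]
    for cluster in hierarchical_clusters(vectors, threshold):
        if query_index in cluster:
            return sorted(i for i in cluster if i != query_index)
    return []


# Toy data: two zigzag sub-sequences and two ramps.
subsequences = [
    [1, 3, 1, 3, 1],
    [1, 3, 1, 3, 1.2],
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4.1],
]
print(similarity_search(subsequences, 0, 0.5))  # -> [1]
```

On the toy data the two zigzags fall into one cluster and the two ramps into another, so querying with sub-sequence 0 returns sub-sequence 1. The quadratic pair search in each merge step keeps the sketch short; it would not scale to large collections.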
