Recognizing Information Feature Variation: Message Importance Transfer Measure and Its Applications in Big Data

Information transfer, which characterizes how information features vary, can have a crucial impact on big data analytics and processing. A measure of information transfer reflects changes in a system statistically, through the distributions of its variables, in the same way as the Kullback-Leibler (KL) divergence and the Rényi divergence. Furthermore, to some degree, small-probability events may carry the most important part of the total message in an information transfer of big data. It is therefore worthwhile to propose an information transfer measure that accounts for message importance from the viewpoint of small-probability events. In this paper, we present the message importance transfer measure (MITM) and analyze its performance and applications in three respects. First, we discuss the robustness of the MITM by using it to measure information distance. Then, we define a message importance transfer capacity based on the MITM and give an upper bound for the information transfer process with disturbance. Finally, we apply the MITM to queue length selection, a fundamental problem for caching operations in mobile edge computing.
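To make the comparison with the KL and Rényi divergences concrete, the following minimal sketch computes both for two discrete distributions that differ only in their small-probability events. It uses the standard textbook definitions, not the MITM itself, whose definition is not given here. The function names and the example distributions `p` and `q` are illustrative assumptions.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(P||Q) = sum_i p_i * log(p_i / q_i), in nats.

    Assumes q_i > 0 wherever p_i > 0; terms with p_i == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha (alpha > 0, alpha != 1), in nats.

    D_alpha(P||Q) = log(sum_i p_i**alpha * q_i**(1 - alpha)) / (alpha - 1)
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)


# Hypothetical example: the dominant event keeps probability 0.98, and only
# the two rare events exchange probability mass between P and Q.
p = [0.98, 0.01, 0.01]
q = [0.98, 0.019, 0.001]

print("KL(P||Q)          =", kl_divergence(p, q))
print("Renyi, alpha=0.5  =", renyi_divergence(p, q, 0.5))
print("Renyi, alpha=2    =", renyi_divergence(p, q, 2.0))
```

In this example all of the divergence comes from the rare events, since the dominant event has the same probability under both distributions. The Rényi divergence is non-decreasing in its order, so a larger `alpha` weights that rare-event change more heavily than the KL divergence does.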