Convex clustering method for compositional data modeling

Compositional data are vectors whose parts are positive and subject to a constant-sum constraint. Real-world examples include the weights of stocks in an investment portfolio and the relative concentrations of air pollutants in the environment. In this study, we developed a convex clustering approach for grouping compositional data. Convex clustering is attractive because, as a convex relaxation of hierarchical clustering, it yields a globally optimal solution. However, when applied directly to compositions, the clustering result offers little interpretability because it ignores the constant-sum constraint. We therefore cluster compositional variables in the Aitchison framework after an isometric log-ratio (ilr) transformation. The objective function combines an $L_2$-norm loss term with an $L_1$-norm regularization term and is solved efficiently using the alternating direction method of multipliers. In numerical simulations, clustering ilr-transformed data is more accurate than directly clustering untransformed compositional data. To demonstrate its practical use, the proposed method is also tested on several real-world datasets.
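The two ingredients named above, the ilr transformation and a convex clustering objective with an $L_2$ loss and an $L_1$ fusion penalty, can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the ilr coordinates use one particular Helmert-type orthonormal basis (any orthonormal basis of the clr hyperplane would be equally valid), the loss is scaled by 1/2, and the pairwise weights default to 1. The function names `closure`, `ilr_basis`, `ilr`, and `convex_clustering_objective` are hypothetical, and the sketch only evaluates the objective; it does not implement the ADMM solver.

```python
import numpy as np


def closure(x):
    """Rescale each composition (row) so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def ilr_basis(D):
    """Helmert-type orthonormal basis of the clr hyperplane, shape (D-1, D).

    Assumption: this specific basis is chosen for illustration only.
    """
    V = np.zeros((D - 1, D))
    for i in range(1, D):
        V[i - 1, :i] = 1.0 / i
        V[i - 1, i] = -1.0
        V[i - 1] *= np.sqrt(i / (i + 1.0))  # normalize the row to unit length
    return V


def ilr(x):
    """Isometric log-ratio coordinates of strictly positive compositions."""
    x = closure(x)
    log_x = np.log(x)
    clr = log_x - log_x.mean(axis=-1, keepdims=True)  # centered log-ratio
    return clr @ ilr_basis(x.shape[-1]).T


def convex_clustering_objective(Z, U, gamma, weights=None):
    """Squared L2 fit of centroids U to data Z plus an L1 fusion penalty.

    Assumption: 1/2 scaling of the loss and unit pairwise weights by default.
    """
    n = Z.shape[0]
    fit = 0.5 * np.sum((Z - U) ** 2)
    penalty = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 if weights is None else weights[i, j]
            penalty += w * np.sum(np.abs(U[i] - U[j]))
    return fit + gamma * penalty
```

Under these assumptions, a composition with D parts maps to D - 1 unconstrained real coordinates, so Euclidean tools such as the objective above can be applied to `ilr(X)` rather than to `X` itself. When all rows of `U` coincide the penalty vanishes, which corresponds to every observation being fused into a single cluster.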
