Multi-Layer Combinatorial Fusion Using Cognitive Diversity

Multiple scoring systems (MSS), including rank and score functions, have been widely used in multiple regression, intelligent biometric systems, multiple artificial neural nets, combining pattern classifiers, ensemble methods, machine learning and artificial intelligence (AI), data and information fusion, preference ranking, and deep learning. Combining MSS has produced successful results in a variety of domain applications. However, the reasons why combination succeeds remain an active area of investigation. Combinatorial fusion analysis (CFA) combines MSS using the rank-score characteristic (RSC) function and cognitive diversity (CD). The RSC function was proposed to characterize the predictive behavior of a scoring system. It was subsequently used to define the notion of "cognitive diversity", which measures the dissimilarity in the representation of information between two scoring systems. In this article, we first examine characterizations of scoring systems and the diversity between them. Then, we review combinatorial fusion analysis across a variety of domain applications, including biometric systems in cognitive neuroscience and joint decision making with visual cognitive systems. Finally, we demonstrate that multi-layer combinatorial fusion (MCF) on the Kemeny rank space is a viable machine learning and AI framework for preference ranking and reinforcement learning. This work provides a scientific foundation and technological insights for the use of combinatorial fusion in ensemble methods, data and information fusion, preference ranking, and deep reinforcement learning, with applications to a variety of domains in data science and informatics for secure and sustainable societies.
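The abstract does not state formulas, so the following is only a minimal sketch of how the RSC function and cognitive diversity are commonly formulated. It assumes that each scoring system assigns one score to every item, that scores are min-max normalized to [0, 1], that rank 1 goes to the highest score, and that cognitive diversity is the root-mean-square distance between two RSC functions. The function names `normalize`, `rsc`, and `cognitive_diversity` are hypothetical and chosen for illustration.

```python
from math import sqrt


def normalize(scores):
    """Min-max normalize scores to [0, 1] (an assumed preprocessing step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: map everything to 0.0 to avoid division by zero.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rsc(scores):
    """Rank-score characteristic: position i-1 holds the normalized score
    of the item ranked i, with rank 1 given to the highest score."""
    return sorted(normalize(scores), reverse=True)


def cognitive_diversity(scores_a, scores_b):
    """Assumed form of cognitive diversity: root-mean-square distance
    between the RSC functions of two scoring systems on the same items."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must score the same number of items")
    f_a, f_b = rsc(scores_a), rsc(scores_b)
    n = len(f_a)
    return sqrt(sum((x - y) ** 2 for x, y in zip(f_a, f_b)) / n)


# Two hypothetical scoring systems evaluated on the same three items.
system_a = [4.0, 2.0, 0.0]
system_b = [4.0, 4.0, 0.0]
print(rsc(system_a))
print(rsc(system_b))
print(cognitive_diversity(system_a, system_b))
```

Because the RSC function depends only on the sorted normalized scores and not on which item received which score, two systems that rank the items differently can still have zero cognitive diversity under this formulation; the measure compares scoring behavior rather than the rankings themselves.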
