The Similarity Between Dissimilarities

When characterizing teams of people, molecules, or general graphs, it is difficult to encode all information in a single feature vector. For such objects, dissimilarity matrices can be used instead, because they capture the interaction or similarity between the sub-elements (people, atoms, nodes). This paper compares several representations of dissimilarity matrices that encode the cluster characteristics, latent dimensionality, or outliers of these matrices. It appears that both the simple eigenvalue spectrum and the histogram of distances are already quite effective, and that they reach high classification performance in multiple instance learning (MIL) problems. Finally, an analysis of teams of people is given, illustrating the potential use of dissimilarity matrix characterization for business consultancy.
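To make the two simple representations concrete, the following is a minimal sketch of how an eigenvalue spectrum and a distance histogram could be turned into fixed-length feature vectors for dissimilarity matrices of varying size. The function names, the zero-padding to `k` eigenvalues, the ordering by magnitude, and the `bins` and `max_dist` parameters are all assumptions made for illustration; the abstract does not specify how the paper computes or normalizes these features.

```python
import numpy as np

def eigen_spectrum(D, k=5):
    """Return the k largest-magnitude eigenvalues of a dissimilarity matrix.

    Assumption: eigenvalues are taken from the symmetrized matrix itself;
    the paper may use a different (e.g. centered) matrix.
    """
    D = np.asarray(D, dtype=float)
    S = (D + D.T) / 2.0                      # enforce symmetry
    vals = np.linalg.eigvalsh(S)             # real eigenvalues of a symmetric matrix
    vals = vals[np.argsort(-np.abs(vals))]   # order by decreasing magnitude
    out = np.zeros(k)                        # zero-pad so every object gets length k
    m = min(k, vals.size)
    out[:m] = vals[:m]
    return out

def distance_histogram(D, bins=10, max_dist=1.0):
    """Return a normalized histogram of the off-diagonal dissimilarities.

    Assumption: dissimilarities lie in [0, max_dist]; values outside
    that range are ignored by np.histogram.
    """
    D = np.asarray(D, dtype=float)
    iu = np.triu_indices(D.shape[0], k=1)    # each pair counted once
    counts, _ = np.histogram(D[iu], bins=bins, range=(0.0, max_dist))
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

# Usage: two objects with a different number of sub-elements
# map to feature vectors of the same length.
rng = np.random.default_rng(0)
for n in (4, 7):
    X = rng.random((n, 2))
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    features = np.concatenate([eigen_spectrum(D, k=5),
                               distance_histogram(D, bins=10, max_dist=2.0)])
    print(n, features.shape)
```

Because both functions return vectors whose length does not depend on the size of the matrix, the resulting features can be passed to any standard classifier, which is one plausible way to apply them in a MIL setting where each bag yields one dissimilarity matrix.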