Learning Feature Weights from Positive Cases

The availability of new data sources presents both opportunities and challenges for using Case-Based Reasoning (CBR) to solve novel problems. In this paper, we describe the research challenges we faced when trying to reuse experiences of successful academic collaborations that are available online as descriptions of funded grant proposals. The goal is to recommend the characteristics of two collaborators to complement an academic seeking a multidisciplinary team, so that the three form a collaboration resembling a configuration that has previously secured funding. While seeking a suitable measure for computing similarity between cases, we were confronted with two challenges: a problem context with insufficient domain knowledge, and data that consists exclusively of successful collaborations, that is, only positive instances. We present our strategy for overcoming these challenges: a clustering-based approach to learning feature weights. Our approach identifies poorly aligned cases, i.e., cases that violate the assumption that similar problems have similar solutions. We then use these poorly aligned cases as negatives in a feedback algorithm that learns the feature weights. The result of this work is an integration of methods that extends CBR to a new context and to conditions under which it has not previously been applied.
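The idea of treating poorly aligned cases as negatives can be illustrated with a minimal sketch. This is not the algorithm of the paper: the abstract does not specify the clustering step or the feedback rule, so the sketch swaps in a nearest-neighbour alignment score and a simple Relief-style weight update. All function names, the threshold, the learning rate, and the toy data are assumptions made for illustration only.

```python
from math import fsum

def weighted_sim(a, b, weights):
    """Weighted similarity in [0, 1] for feature vectors scaled to [0, 1]."""
    return fsum(w * (1.0 - abs(x - y)) for w, x, y in zip(weights, a, b)) / fsum(weights)

def nearest(i, problems, weights, k):
    """Indices of the k cases most similar to case i on the problem side."""
    others = [j for j in range(len(problems)) if j != i]
    others.sort(key=lambda j: weighted_sim(problems[i], problems[j], weights), reverse=True)
    return others[:k]

def alignment(i, problems, solutions, weights, k=1):
    """Mean solution-side similarity between case i and its problem-side neighbours."""
    uniform = [1.0] * len(solutions[i])
    neighbours = nearest(i, problems, weights, k)
    return fsum(weighted_sim(solutions[i], solutions[j], uniform) for j in neighbours) / len(neighbours)

def poorly_aligned(problems, solutions, weights, k=1, threshold=0.5):
    """Cases whose similar problems do not have similar solutions (assumed threshold)."""
    return [i for i in range(len(problems))
            if alignment(i, problems, solutions, weights, k) < threshold]

def update_weights(problems, weights, negatives, k=1, rate=0.1):
    """One feedback pass (assumed rule): penalise features that made a
    pseudo-negative case look similar to its neighbours, reward the others."""
    new = list(weights)
    for i in negatives:
        for j in nearest(i, problems, weights, k):
            for f in range(len(new)):
                agreement = 1.0 - abs(problems[i][f] - problems[j][f])
                new[f] -= rate * (agreement - 0.5)
    new = [max(w, 1e-6) for w in new]   # keep weights positive
    total = fsum(new)
    return [w / total for w in new]     # renormalise to sum to 1

# Toy case base (hypothetical): feature 0 predicts the solution, feature 1 is misleading.
problems = [[0.0, 0.0], [0.1, 1.0], [1.0, 0.0], [0.9, 1.0]]
solutions = [[0.0], [0.0], [1.0], [1.0]]

weights = [0.5, 0.5]
negatives = poorly_aligned(problems, solutions, weights)
weights = update_weights(problems, weights, negatives)
```

In this toy setting, every case initially has a nearest neighbour with a dissimilar solution, so all cases serve as pseudo-negatives in the first pass. The update then shifts weight toward the feature that actually tracks the solution. A real application would iterate until the set of poorly aligned cases stops shrinking.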
