The Effect of Oversampling and Undersampling on Classifying Imbalanced Text Datasets

PCT No. PCT/GB85/00507; Sec. 371 Date Aug. 7, 1986; Sec. 102(e) Date Aug. 7, 1986; PCT Filed Nov. 7, 1985; PCT Pub. No. WO86/02933; PCT Pub. Date May 22, 1986.

A polyurethane reaction product of a di- or polyisocyanate and a diol or polyol having at least two hydroxyl groups capable of reacting with an isocyanate group and having the residue of at least one further hydroxyl group present in the form of a phosphorus acid ester group of formula (I), wherein n is 0 or 1, m is 2, 3, or 4, and each R independently is an alkyl group of 1 to 4 carbons, optionally with a diol or polyol which does not contain a group of formula (I), is useful in biocompatible, particularly haemocompatible, devices for use in methods of medical treatment.

[1]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[2]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[3]  Charles X. Ling,et al.  Data Mining for Direct Marketing: Problems and Solutions , 1998, KDD.

[4]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[5]  B. Schölkopf,et al.  Advances in kernel methods: support vector learning , 1999 .

[6]  Nathalie Japkowicz,et al.  Concept-Learning in the Presence of Between-Class and Within-Class Imbalances , 2001, Canadian Conference on AI.

[7]  Evangelos E. Milios,et al.  Using Unsupervised Learning to Guide Resampling in Imbalanced Data Sets , 2001, AISTATS.

[8]  Nathalie Japkowicz,et al.  The class imbalance problem: A systematic study , 2002 .

[9]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res.

[10]  R. Barandela,et al.  Strategies for learning in class imbalance problems , 2003, Pattern Recognit..

[11]  R. Srihari,et al.  Optimally Combining Positive and Negative Features for Text Categorization , 2003 .

[12]  Robert C. Holte,et al.  C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling beats Over-Sampling , 2003 .

[13]  Marko Grobelnik,et al.  Training text classifiers with SVM on very few positive examples , 2003 .

[14]  Edward Y. Chang,et al.  Class-Boundary Alignment for Imbalanced Dataset Learning , 2003 .

[15]  Shi Zhong,et al.  A Comparative Study of Generative Models for Document Clustering , 2003 .

[16]  M. Maloof.  Learning When Data Sets are Imbalanced and When Costs are Unequal and Unknown , 2003 .

[17]  Foster J. Provost,et al.  Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction , 2003, J. Artif. Intell. Res.

[18]  Taeho Jo,et al.  Class imbalances versus small disjuncts , 2004, SKDD.

[19]  M. Dolores del Castillo,et al.  A multistrategy approach for digital text categorization from imbalanced documents , 2004, SKDD.

[20]  Gustavo E. A. P. A. Batista,et al.  A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.

[21]  Rohini K. Srihari,et al.  Feature selection for text categorization on imbalanced data , 2004, SKDD.