Semi-supervised Learning
[1] Tomer Hertz,et al. Learning Distance Functions using Equivalence Relations , 2003, ICML.
[2] Haidong Wang,et al. Discovering molecular pathways from protein interaction and gene expression data , 2003, ISMB.
[3] Alexander Gammerman,et al. Learning by Transduction , 1998, UAI.
[4] Peter L. Bartlett,et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results , 2003, J. Mach. Learn. Res..
[5] Adam R. Klivans,et al. Learning intersections and thresholds of halfspaces , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..
[6] Ann B. Lee,et al. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. , 2005, Proceedings of the National Academy of Sciences of the United States of America.
[7] Raymond J. Mooney,et al. A probabilistic framework for semi-supervised clustering , 2004, KDD.
[8] Mehryar Mohri,et al. Rational Kernels , 2002, NIPS.
[9] Thomas L. Madden,et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. , 1997, Nucleic acids research.
[10] Susan T. Dumais,et al. Inductive learning algorithms and representations for text categorization , 1998, CIKM '98.
[11] G. Wahba. Smoothing noisy data with spline functions , 1975 .
[12] Nikhil Bansal,et al. Correlation Clustering , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..
[13] E. Kushilevitz,et al. Learning by distances , 1990, COLT '90.
[14] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[15] E. Nadaraya. On Estimating Regression , 1964 .
[16] Nello Cristianini,et al. Kernel-Based Data Fusion and Its Application to Protein Function Prediction in Yeast , 2003, Pacific Symposium on Biocomputing.
[17] Dale Schuurmans,et al. Metric-Based Methods for Adaptive Model Selection and Regularization , 2002, Machine Learning.
[18] Nello Cristianini,et al. Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..
[19] Thorsten Joachims,et al. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization , 1997, ICML.
[20] Maria-Florina Balcan,et al. An Augmented PAC Model for Semi-Supervised Learning , 2006, Semi-Supervised Learning.
[21] Andrew W. Moore,et al. Fast Robust Logistic Regression for Large Sparse Datasets with Binary Outputs , 2003, AISTATS.
[22] Claire Cardie,et al. Constrained K-means Clustering with Background Knowledge , 2001, ICML.
[23] S. Ganesalingam. Classification and Mixture Approaches to Clustering Via Maximum Likelihood , 1989 .
[24] Ran El-Yaniv,et al. Error Bounds for Transductive Learning via Compression and Clustering , 2003, NIPS.
[25] Maria-Florina Balcan,et al. Co-Training and Expansion: Towards Bridging Theory and Practice , 2004, NIPS.
[26] Li Liao,et al. Combining pairwise sequence similarity and support vector machines for remote protein homology detection , 2002, RECOMB '02.
[27] Avrim Blum,et al. Learning from Labeled and Unlabeled Data using Graph Mincuts , 2001, ICML.
[28] Adrian Corduneanu,et al. Distributed Information Regularization on Graphs , 2004, NIPS.
[29] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[30] Tom M. Mitchell,et al. Learning to construct knowledge bases from the World Wide Web , 2000, Artif. Intell..
[31] Inderjit S. Dhillon,et al. Clustering on the Unit Hypersphere using von Mises-Fisher Distributions , 2005, J. Mach. Learn. Res..
[32] Zoubin Ghahramani,et al. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.
[33] Matthias W. Seeger,et al. Covariance Kernels from Bayesian Generative Models , 2001, NIPS.
[34] Michael I. Jordan,et al. Multiple kernel learning, conic duality, and the SMO algorithm , 2004, ICML.
[35] G. S. Watson,et al. Smooth regression analysis , 1964 .
[36] L. Goldstein,et al. Optimal Plug-in Estimators for Nonparametric Functional Estimation , 1992 .
[37] Jerome H. Friedman,et al. On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality , 2004, Data Mining and Knowledge Discovery.
[38] Inderjit S. Dhillon,et al. Concept Decompositions for Large Sparse Text Data Using Clustering , 2004, Machine Learning.
[39] Jon M. Kleinberg,et al. Detecting a network failure , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[40] D. M. Titterington,et al. Updating a Diagnostic System using Unconfirmed Cases , 1976 .
[41] M. Kearns. Efficient noise-tolerant learning from statistical queries , 1998, JACM.
[42] Richard Bellman,et al. Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.
[43] H. J. Scudder,et al. Probability of error of some adaptive pattern-recognition machines , 1965, IEEE Trans. Inf. Theory.
[44] John C. Platt. Fast Embedding of Sparse Similarity Graphs , 2003, NIPS.
[45] Raymond J. Mooney,et al. Integrating constraints and metric learning in semi-supervised clustering , 2004, ICML.
[46] Yoram Singer,et al. Unsupervised Models for Named Entity Classification , 1999, EMNLP.
[47] Brian D. Ripley,et al. Pattern Recognition and Neural Networks , 1996 .
[48] Léon Bottou,et al. Local Learning Algorithms , 1992, Neural Computation.
[49] Yoshua Bengio,et al. Model Selection for Small Sample Regression , 2002, Machine Learning.
[50] M. Seeger. Input-dependent Regularization of Conditional Density Models , 2000 .
[51] Ayhan Demiriz,et al. Exploiting unlabeled data in ensemble methods , 2002, KDD.
[52] John Shawe-Taylor,et al. Structural Risk Minimization Over Data-Dependent Hierarchies , 1998, IEEE Trans. Inf. Theory.
[53] Andreas Stolcke,et al. Best-first Model Merging for Hidden Markov Model Induction , 1994, ArXiv.
[54] G. McLachlan. Discriminant Analysis and Statistical Pattern Recognition , 1992 .
[55] H. Zha,et al. Principal manifolds and nonlinear dimensionality reduction via tangent space alignment , 2004, SIAM J. Sci. Comput..
[56] Vittorio Castelli,et al. The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter , 1996, IEEE Trans. Inf. Theory.
[57] Nello Cristianini,et al. Convex Methods for Transduction , 2003, NIPS.
[58] Fabio Gagliardi Cozman,et al. Unlabeled Data Can Degrade Classification Performance of Generative Classifiers , 2002, FLAIRS.
[59] Thomas G. Dietterich. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.
[60] R. Shibata. An optimal selection of regression variables , 1981 .
[61] Avrim Blum,et al. The Bottleneck , 2021, Monopsony Capitalism.
[62] Christopher K. I. Williams. Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.
[63] Nello Cristianini,et al. A statistical framework for genomic data fusion , 2004, Bioinform..
[64] M. Yamasaki. Ideal boundary limit of discrete Dirichlet functions , 1986 .
[65] Alexander Gammerman,et al. Machine-Learning Applications of Algorithmic Randomness , 1999, ICML.
[66] Harris Drucker,et al. Support vector machines for spam categorization , 1999, IEEE Trans. Neural Networks.
[67] Jason Weston,et al. Mismatch String Kernels for SVM Protein Classification , 2002, NIPS.
[68] Olivier Bousquet,et al. On the Complexity of Learning the Kernel Matrix , 2002, NIPS.
[69] Geoffrey E. Hinton,et al. Self-organizing neural network that discovers surfaces in random-dot stereograms , 1992, Nature.
[70] Nello Cristianini,et al. An introduction to Support Vector Machines , 2000 .
[71] Geoffrey C. Fox,et al. Vector quantization by deterministic annealing , 1992, IEEE Trans. Inf. Theory.
[72] Tatsuya Akutsu,et al. Protein homology detection using string alignment kernels , 2004, Bioinform..
[73] David Maxwell Chickering,et al. Dependency Networks for Inference, Collaborative Filtering, and Data Visualization , 2000, J. Mach. Learn. Res..
[74] Santosh S. Vempala,et al. A random sampling based algorithm for learning the intersection of half-spaces , 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.
[75] Nicu Sebe,et al. Semisupervised learning of classifiers: theory, algorithms, and their application to human-computer interaction , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[76] Vladimir Koltchinskii,et al. Rademacher penalties and structural risk minimization , 2001, IEEE Trans. Inf. Theory.
[77] S. Rosenberg. The Laplacian on a Riemannian Manifold , 1997 .
[78] Ulrike von Luxburg,et al. Limits of Spectral Clustering , 2004, NIPS.
[79] David A. Landgrebe,et al. The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon , 1994, IEEE Trans. Geosci. Remote. Sens..
[80] Ting Chen,et al. An integrated probabilistic model for functional prediction of proteins , 2003, RECOMB '03.
[81] I. Jolliffe. Principal Component Analysis , 2002 .
[82] James R. Knight,et al. A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae , 2000, Nature.
[83] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[84] Byron Dom,et al. An Information-Theoretic External Cluster-Validity Measure , 2002, UAI.
[85] Andrew B. Kahng,et al. New spectral methods for ratio cut partitioning and clustering , 1991, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[86] N. Cristianini,et al. On Kernel-Target Alignment , 2001, NIPS.
[87] T. Takagi,et al. Assessment of prediction accuracy of protein function from protein–protein interaction data , 2001, Yeast.
[88] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[89] Mikhail Bilenko and Sugato Basu. A Comparison of Inference Techniques for Semi-supervised Clustering with Hidden Markov Random Fields , 2004 .
[90] H. White. Maximum Likelihood Estimation of Misspecified Models , 1982 .
[91] Jason Weston,et al. Vicinal Risk Minimization , 2000, NIPS.
[92] R. Berk,et al. Limiting Behavior of Posterior Distributions when the Model is Incorrect , 1966 .
[93] Andrew McCallum,et al. Employing EM and Pool-Based Active Learning for Text Classification , 1998, ICML.
[94] Bernhard Schölkopf,et al. Support vector channel selection in BCI , 2004, IEEE Transactions on Biomedical Engineering.
[95] Bernhard Schölkopf,et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.
[96] Thorsten Joachims,et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.
[97] Christopher J. C. Burges,et al. Geometric Methods for Feature Extraction and Dimensional Reduction , 2005 .
[98] Shang-Hua Teng,et al. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems , 2003, STOC '04.
[99] J. J. Rocchio,et al. Relevance feedback in information retrieval , 1971 .
[100] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.
[101] Nir Friedman,et al. The Bayesian Structural EM Algorithm , 1998, UAI.
[102] Dale Schuurmans,et al. Characterizing the generalization performance of model selection strategies , 1997, ICML.
[103] Yiming Yang,et al. A re-examination of text categorization methods , 1999, SIGIR '99.
[104] Tomer Hertz,et al. Computing Gaussian Mixture Models with EM Using Equivalence Constraints , 2003, NIPS.
[105] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[106] David D. Lewis,et al. Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.
[107] Donald Geman,et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[108] D.C. St. Clair,et al. SeMi-supervised adaptive resonance theory (SMART2) , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[109] E. Myers,et al. Basic local alignment search tool. , 1990, Journal of molecular biology.
[110] A. Agresti,et al. Categorical Data Analysis , 1991, International Encyclopedia of Statistical Science.
[111] J. Besag. On the Statistical Analysis of Dirty Pictures , 1986 .
[112] Alon Orlitsky,et al. Estimating and computing density based distance metrics , 2005, ICML.
[113] Thorsten Joachims,et al. Transductive Learning via Spectral Graph Partitioning , 2003, ICML.
[114] T Poggio,et al. Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks , 1990, Science.
[115] Terence J. O'Neill. Normal Discrimination with Unclassified Observations , 1978 .
[116] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[117] Raymond A. Board,et al. Semi-Supervised Learning , 1989, Machine Learning.
[118] Thomas Hofmann,et al. Statistical Models for Co-occurrence Data , 1998 .
[119] Shaoning Pang,et al. Transductive support vector machines and applications in bioinformatics for promoter recognition , 2003, International Conference on Neural Networks and Signal Processing, 2003. Proceedings of the 2003.
[120] Nicolas Chapados,et al. Extensions to Metric-Based Model Selection , 2003, J. Mach. Learn. Res..
[121] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[122] Karsten A. Verbeurgt. Learning DNF under the uniform distribution in quasi-polynomial time , 1990, COLT '90.
[123] H. Akaike. A new look at the statistical model identification , 1974 .
[124] D. Titterington,et al. Estimation Problems with Data from a Mixture , 1978 .
[125] Dan Klein,et al. From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering , 2002, ICML.
[126] G. McLachlan,et al. The efficiency of a linear discriminant function based on unclassified initial samples , 1978 .
[127] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[128] John Langford,et al. Cover trees for nearest neighbor , 2006, ICML.
[129] Santosh S. Venkatesh,et al. Learning from a mixture of labeled and unlabeled examples with parametric side information , 1995, COLT '95.
[130] R. Redner,et al. Mixture densities, maximum likelihood, and the EM algorithm , 1984 .
[131] Tobias Scheffer,et al. Using Transduction and Multi-view Learning to Answer Emails , 2003, PKDD.
[132] Alan L. Yuille,et al. Statistical Physics, Mixtures of Distributions, and the EM Algorithm , 1994, Neural Computation.
[133] D. Donoho,et al. Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data , 2003, Proceedings of the National Academy of Sciences of the United States of America.
[134] G. McLachlan. Iterative Reclassification Procedure for Constructing An Asymptotically Optimal Rule of Allocation in Discriminant-Analysis , 1975 .
[135] Adrian Corduneanu,et al. Continuation Methods for Mixing Heterogenous Sources , 2002, UAI.
[136] D. Haussler,et al. Hidden Markov models in computational biology. Applications to protein modeling. , 1993, Journal of molecular biology.
[137] Yaniv Ziv,et al. Revealing modular organization in the yeast transcriptional network , 2002, Nature Genetics.
[138] Naftali Tishby,et al. Distributional Clustering of English Words , 1993, ACL.
[139] A G Murzin,et al. SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.
[140] Nicolas Le Roux,et al. Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.
[141] Paul A. Viola,et al. Unsupervised improvement of visual detectors using cotraining , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[142] Noam Nisan,et al. Constant depth circuits, Fourier transform, and learnability , 1989, 30th Annual Symposium on Foundations of Computer Science.
[143] Alexander J. Smola,et al. Kernels and Regularization on Graphs , 2003, COLT.
[144] Douglas L. Brutlag,et al. Remote homology detection: a motif based approach , 2003, ISMB.
[145] Sebastian Thrun,et al. Learning to Classify Text from Labeled and Unlabeled Documents , 1998, AAAI/IAAI.
[146] Joachim M. Buhmann,et al. Learning with constrained and unlabelled data , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[147] G. McLachlan,et al. Updating a discriminant function on the basis of unclassified data , 1982 .
[148] Ke Wang,et al. Profile-based string kernels for remote homology detection and motif extraction. , 2005, Journal of bioinformatics and computational biology.
[149] P. J. Huber. The behavior of maximum likelihood estimates under nonstandard conditions , 1967 .
[150] Nicola J. Rinaldi,et al. Transcriptional Regulatory Networks in Saccharomyces cerevisiae , 2002, Science.
[151] Aleksandrs Slivkins,et al. Network failure detection and graph connectivity , 2004, SODA '04.
[152] Bernhard Schölkopf,et al. Cluster Kernels for Semi-Supervised Learning , 2002, NIPS.
[153] S. Sathiya Keerthi,et al. A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs , 2005, J. Mach. Learn. Res..
[154] Franck Davoine,et al. Expressive face recognition and synthesis , 2003, 2003 Conference on Computer Vision and Pattern Recognition Workshop.
[155] Peter G. Doyle,et al. Random Walks and Electric Networks , 1987 .
[156] B. Efron. The Efficiency of Logistic Regression Compared to Normal Discriminant Analysis , 1975 .
[157] Lawrence K. Saul,et al. Analysis and extension of spectral methods for nonlinear dimensionality reduction , 2005, ICML.
[158] J. Friedman. Additive logistic regression: A statistical view of boosting , 2000 .
[159] Pedro M. Domingos,et al. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss , 1997, Machine Learning.
[160] Alexander J. Smola,et al. Fast Kernels for String and Tree Matching , 2002, NIPS.
[161] Michael Gribskov,et al. Use of Receiver Operating Characteristic (ROC) Analysis to Evaluate Sequence Matching , 1996, Comput. Chem..
[162] Thorsten Joachims,et al. Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.
[163] Athanasios Papoulis,et al. Probability, Random Variables and Stochastic Processes , 1965 .
[164] Christina S. Leslie,et al. Fast Kernels for Inexact String Matching , 2003, COLT.
[165] Vladimir Vapnik,et al. Estimation of Dependences Based on Empirical Data , 1982 .
[166] Nello Cristianini,et al. Kernel methods for exploratory data analysis: a demonstration on text data , 2004 .
[167] Lawrence Carin,et al. Semi-Supervised Classification , 2004, Encyclopedia of Database Systems.
[168] Shailesh V. Date,et al. A Probabilistic Functional Network of Yeast Genes , 2004, Science.
[169] Nicolas Le Roux,et al. Efficient Non-Parametric Function Induction in Semi-Supervised Learning , 2004, AISTATS.
[170] M. F. Porter,et al. An algorithm for suffix stripping , 1997 .
[171] Charles R. Johnson,et al. Matrix analysis , 1985, Statistical Inference for Engineers and Data Scientists.
[172] Matthias Hein,et al. Intrinsic dimensionality estimation of submanifolds in Rd , 2005, ICML.
[173] G. McLachlan,et al. The EM algorithm and extensions , 1996 .
[174] B. Efron. Computers and the Theory of Statistics: Thinking the Unthinkable , 1979 .
[175] Adrian Corduneanu,et al. On Information Regularization , 2002, UAI.
[176] Matthias Hein,et al. Measure Based Regularization , 2003, NIPS.
[177] Prasad Tadepalli,et al. Active Learning with Committees for Text Categorization , 1997, AAAI/IAAI.
[178] Ayhan Demiriz,et al. Semi-Supervised Support Vector Machines , 1998, NIPS.
[179] M. Degroot. Optimal Statistical Decisions , 1970 .
[180] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[181] Alan M. Frieze,et al. A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions , 1996, Algorithmica.
[182] Kilian Q. Weinberger,et al. Nonlinear Dimensionality Reduction by Semidefinite Programming and Kernel Matrix Factorization , 2005, AISTATS.
[183] Rayid Ghani,et al. Analyzing the effectiveness and applicability of co-training , 2000, CIKM '00.
[184] B. Schwikowski,et al. A network of protein–protein interactions in yeast , 2000, Nature Biotechnology.
[185] Leslie Lamport,et al. How to Write a Proof , 1995 .
[186] Tijl De Bie,et al. Eigenproblems in Pattern Recognition , 2005 .
[187] Mikhail Belkin,et al. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.
[188] Rong Jin,et al. Learning with Multiple Labels , 2002, NIPS.
[189] Y. Abu-Mostafa. Machines that Learn from Hints , 1995 .
[190] Jianhua Lin,et al. Divergence measures based on the Shannon entropy , 1991, IEEE Trans. Inf. Theory.
[191] Yoshua Bengio,et al. Greedy Spectral Embedding , 2005, AISTATS.
[192] G. Celeux,et al. A Classification EM algorithm for clustering and two stochastic versions , 1992 .
[193] D. Pe’er,et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data , 2003, Nature Genetics.
[194] B. Snel,et al. Comparative assessment of large-scale data sets of protein–protein interactions , 2002, Nature.
[195] Ron Kohavi,et al. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection , 1995, IJCAI.
[196] Avrim Blum,et al. Learning an intersection of k halfspaces over a uniform distribution , 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.
[197] Byoung-Tak Zhang,et al. Large Scale Unstructured Document Classification Using Unlabeled Data and Syntactic Information , 2003, PAKDD.
[198] Lawrence K. Saul,et al. Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifolds , 2003, J. Mach. Learn. Res..
[199] David Haussler,et al. A Discriminative Framework for Detecting Remote Protein Homologies , 2000, J. Comput. Biol..
[200] Tommi S. Jaakkola,et al. Partially labeled classification with Markov random walks , 2001, NIPS.
[201] Lei Wang,et al. Bootstrapping SVM active learning by incorporating unlabelled images for image retrieval , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..
[202] Kilian Q. Weinberger,et al. Unsupervised Learning of Image Manifolds by Semidefinite Programming , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[203] R. Tibshirani,et al. Generalized additive models for medical research , 1986, Statistical methods in medical research.
[204] Richard E. Blahut,et al. Computation of channel capacity and rate-distortion functions , 1972, IEEE Trans. Inf. Theory.
[205] John Langford,et al. PAC-MDL Bounds , 2003, COLT.
[206] Anders Krogh,et al. Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.
[207] Dan Roth,et al. On the Hardness of Approximate Reasoning , 1993, IJCAI.
[208] Andrew W. Moore,et al. 'N-Body' Problems in Statistical Learning , 2000, NIPS.
[209] Balázs Kégl,et al. Boosting on Manifolds: Adaptive Regularization of Base Classifiers , 2004, NIPS.
[210] David Yarowsky,et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods , 1995, ACL.
[211] Bernhard Schölkopf,et al. A kernel view of the dimensionality reduction of manifolds , 2004, ICML.
[212] Michael Ruogu Zhang,et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.
[213] Arindam Banerjee,et al. Active Semi-Supervision for Pairwise Constrained Clustering , 2004, SDM.
[214] Alexander Gammerman,et al. Transduction with Confidence and Credibility , 1999, IJCAI.
[215] D. Hosmer. A Comparison of Iterative Maximum Likelihood Estimates of the Parameters of a Mixture of Two Normal Distributions Under Three Different Types of Sample , 1973 .
[216] Umesh V. Vazirani,et al. An Introduction to Computational Learning Theory , 1994 .
[217] Cullen Schaffer,et al. A Conservation Law for Generalization Performance , 1994, ICML.
[218] Dan Roth,et al. Understanding Probabilistic Classifiers , 2001, ECML.
[219] D. W. Scott,et al. Nonparametric Estimation of Probability Densities and Regression Curves , 1988 .
[220] Ben Taskar,et al. Discriminative Probabilistic Models for Relational Data , 2002, UAI.
[221] Van Rijsbergen,et al. A theoretical basis for the use of co-occurrence data in information retrieval , 1977 .
[222] Shahar Mendelson,et al. Random Subclass Bounds , 2003, COLT.
[223] William Stafford Noble,et al. Learning kernels from biological networks by maximizing entropy , 2004, ISMB/ECCB.
[224] Geoffrey E. Hinton,et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[225] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[226] B. Alberts,et al. An Introduction to the Molecular Biology of the Cell , 1998 .
[227] J. Heinonen,et al. Nonlinear Potential Theory of Degenerate Elliptic Equations , 1993 .
[228] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[229] Vladimir Vapnik,et al. On the uniform convergence of relative frequencies of events to their probabilities , 1971 .
[230] Boaz Leskes,et al. The Value of Agreement, a New Boosting Algorithm , 2005, COLT.
[231] Sanjoy Dasgupta,et al. PAC Generalization Bounds for Co-training , 2001, NIPS.
[232] Geoffrey C. Fox,et al. A deterministic annealing approach to clustering , 1990, Pattern Recognit. Lett..
[233] Bernhard Schölkopf,et al. Feature selection and transduction for prediction of molecular bioactivity for drug design , 2003, Bioinform..
[234] Daphne Koller,et al. Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..
[235] G J Barton,et al. Evaluation and improvement of multiple sequence methods for protein secondary structure prediction , 1999, Proteins.
[236] Risi Kondor,et al. Diffusion kernels on graphs and other discrete structures , 2002, ICML 2002.
[237] Karen Sparck Jones. A statistical interpretation of term specificity and its application in retrieval , 1972 .
[238] Rayid Ghani,et al. Combining Labeled and Unlabeled Data for MultiClass Text Categorization , 2002, ICML.
[239] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[240] Santosh S. Vempala,et al. Optimal outlier removal in high-dimensional spaces , 2004, J. Comput. Syst. Sci..
[241] Naonori Ueda,et al. Deterministic Annealing Variant of the EM Algorithm , 1994, NIPS.
[242] Yoshua Bengio,et al. Semi-supervised Learning by Entropy Minimization , 2004, CAP.
[243] Yoshihiro Yamanishi,et al. Supervised Graph Inference , 2004, NIPS.
[244] G. McLachlan,et al. Small sample results for a linear discriminant function estimated from a mixture of normal populations , 1979 .
[245] S T Roweis,et al. Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.
[246] Pascal Vincent,et al. Non-Local Manifold Parzen Windows , 2005, NIPS.
[247] Miguel F. Anjos,et al. New Convex Relaxations for the Maximum Cut and VLSI Layout Problems , 2001 .
[248] Alessandro Vespignani,et al. Global protein function prediction from protein-protein interaction networks , 2003, Nature Biotechnology.
[249] Dan Klein,et al. Spectral Learning , 2003, IJCAI.
[250] J. Anderson. Multivariate logistic compounds , 1979 .
[251] Nicolas Le Roux,et al. The Curse of Highly Variable Functions for Local Kernel Machines , 2005, NIPS.
[252] Vikas Sindhwani,et al. On Manifold Regularization , 2005, AISTATS.
[253] Leslie G. Valiant,et al. A general lower bound on the number of examples needed for learning , 1988, COLT '88.
[254] Rayid Ghani,et al. Combining labeled and unlabeled data for text classification with a large number of categories , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[255] Joachim M. Buhmann,et al. Clustering with the Connectivity Kernel , 2003, NIPS.
[256] Eric B. Baum,et al. Polynomial time algorithms for learning neural nets , 1990, Annual Conference Computational Learning Theory.
[257] J. Berger. Statistical Decision Theory and Bayesian Analysis , 1988 .
[258] Gene H. Golub,et al. Matrix computations , 1983 .
[259] H. O. Hartley,et al. Classification and Estimation in Analysis of Variance Problems , 1968 .
[260] Ashok K. Agrawala,et al. Learning with a probabilistic teacher , 1970, IEEE Trans. Inf. Theory.
[261] A. N. Tikhonov,et al. Solutions of ill-posed problems , 1977 .
[262] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.
[263] Weiru Liu,et al. Learning belief networks from data: an information theory based approach , 1997, CIKM '97.
[264] Tom Minka,et al. A family of algorithms for approximate Bayesian inference , 2001 .
[265] Michael I. Jordan,et al. On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.
[266] Philip M. Long,et al. Performance guarantees for hierarchical clustering , 2002, J. Comput. Syst. Sci..
[267] G. McLachlan. Estimating the Linear Discriminant Function from Initial Samples Containing a Small Number of Unclassified Observations , 1977 .
[268] Yishay Mansour,et al. An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering , 1997, UAI.
[269] L. Csató. Gaussian processes: iterative sparse approximations , 2002 .
[270] Mikhail Belkin,et al. Regularization and Semi-supervised Learning on Large Graphs , 2004, COLT.
[271] David J. Miller,et al. A Mixture of Experts Classifier with Learning Based on Both Labelled and Unlabelled Data , 1996, NIPS.
[272] Nir Friedman,et al. Bayesian Network Classifiers , 1997, Machine Learning.
[273] Jason Weston,et al. Multi-class protein fold recognition using adaptive codes , 2005, ICML.
[274] Cullen Schaffer. Overfitting avoidance as bias , 2004, Machine Learning.
[275] Raymond J. Mooney,et al. Adaptive duplicate detection using learnable string similarity measures , 2003, KDD '03.
[276] Tommi S. Jaakkola,et al. Information Regularization with Partially Labeled Data , 2002, NIPS.
[277] Jon Louis Bentley,et al. An Algorithm for Finding Best Matches in Logarithmic Expected Time , 1977, TOMS.
[278] Olivier Chapelle,et al. Model Selection for Support Vector Machines , 1999, NIPS.
[279] Andrew McCallum,et al. Semi-Supervised Clustering with User Feedback , 2003 .
[280] Peter Sollich. Probabilistic interpretations and Bayesian methods for support vector machines , 1999 .
[281] Nicolas Le Roux,et al. Learning Eigenfunctions Links Spectral Embedding and Kernel PCA , 2004, Neural Computation.
[282] Éva Tardos,et al. Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).
[283] Nicu Sebe,et al. Learning Bayesian network classifiers for facial expression recognition using both labeled and unlabeled data , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..
[284] Michael I. Jordan,et al. Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.
[285] S. Boucheron,et al. A sharp concentration inequality with applications , 1999, Random Struct. Algorithms.
[286] Jos F. Sturm,et al. A Matlab toolbox for optimization over symmetric cones , 1999 .
[287] Alexander Zien,et al. Semi-Supervised Classification by Low Density Separation , 2005, AISTATS.
[288] S. Boucheron,et al. Theory of classification : a survey of some recent advances , 2005 .
[289] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion) , 1977 .
[290] Jitendra Malik,et al. Normalized Cuts and Image Segmentation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[291] Jean-Philippe Vert,et al. Graph-Driven Feature Extraction From Microarray Data Using Diffusion Kernels and Kernel CCA , 2002, NIPS.
[292] Russell Greiner,et al. Model Selection Criteria for Learning Belief Nets: An Empirical Comparison , 2000, ICML.
[293] Michael I. Jordan,et al. Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..
[294] Fabio Gagliardi Cozman,et al. Semi-Supervised Learning of Mixture Models , 2003, ICML.
[295] T. Cover,et al. The relative value of labeled and unlabeled samples in pattern recognition , 1993, IEEE International Symposium on Information Theory.
[296] Yoshua Bengio,et al. Non-Local Manifold Tangent Learning , 2004, NIPS.
[297] Thomas Hofmann,et al. Semi-supervised Learning on Directed Graphs , 2004, NIPS.
[298] Claire Cardie,et al. Limitations of Co-Training for Natural Language Learning from Large Datasets , 2001, EMNLP.
[299] Massih-Reza Amini,et al. Semi Supervised Logistic Regression , 2002, ECAI.
[300] Joshua B. Tenenbaum,et al. Global Versus Local Methods in Nonlinear Dimensionality Reduction , 2002, NIPS.
[301] Bernhard Schölkopf,et al. Learning from labeled and unlabeled data on a directed graph , 2005, ICML.
[302] James A. Sethian,et al. Level Set Methods and Fast Marching Methods , 1999 .
[303] Bernhard Schölkopf,et al. Dynamic Alignment Kernels , 2000 .
[304] David Haussler,et al. Learnability and the Vapnik-Chervonenkis dimension , 1989, JACM.
[305] Johan A. K. Suykens,et al. Learning from General Label Constraints , 2004, SSPR/SPR.
[306] R. Mooney,et al. Impact of Similarity Measures on Web-page Clustering , 2000 .
[307] Sebastian Thrun,et al. Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.
[308] Stephen P. Boyd,et al. Semidefinite Programming , 1996, SIAM Rev..
[309] Takeo Kanade,et al. Comprehensive database for facial expression analysis , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition.
[310] David Haussler,et al. Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.
[311] Ulrike von Luxburg,et al. Distance-Based Classification with Lipschitz Functions , 2004, J. Mach. Learn. Res..
[312] T. Landauer,et al. Indexing by Latent Semantic Analysis , 1990 .
[313] Klaus Obermayer,et al. Bayesian Transduction , 1999, NIPS.
[314] Jason Weston,et al. Transductive Inference for Estimating Values of Functions , 1999, NIPS.
[315] Yair Weiss,et al. Segmentation using eigenvectors: a unifying view , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[316] K. Bennett,et al. Optimization Approaches to Semi-Supervised Learning , 2001 .
[317] C. J. Stone,et al. Optimal Rates of Convergence for Nonparametric Estimators , 1980 .
[318] David W. Opitz,et al. Generating Accurate and Diverse Members of a Neural-Network Ensemble , 1995, NIPS.
[319] Inderjit S. Dhillon,et al. Information theoretic clustering of sparse cooccurrence data , 2003, Third IEEE International Conference on Data Mining.
[320] J. M. Hammersley,et al. Markov fields on finite graphs and lattices , 1971 .
[321] Mikhail Belkin,et al. Beyond the point cloud: from transductive to semi-supervised learning , 2005, ICML.
[322] Mikhail Belkin,et al. Semi-Supervised Learning on Riemannian Manifolds , 2004, Machine Learning.
[323] Tom M. Mitchell,et al. Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.
[324] David B. Shmoys,et al. A Best Possible Heuristic for the k-Center Problem , 1985, Math. Oper. Res..
[325] Inderjit S. Dhillon,et al. Clustering with Bregman Divergences , 2005, J. Mach. Learn. Res..
[326] Matthew Brand,et al. Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction , 1999, Neural Computation.
[327] Nello Cristianini,et al. Classification using String Kernels , 2000 .
[328] John D. Lafferty,et al. Semi-supervised learning using randomized mincuts , 2004, ICML.
[329] Arindam Banerjee,et al. Semi-supervised Clustering by Seeding , 2002, ICML.
[330] Jean-Michel Renders,et al. Combining Labelled and Unlabelled Data: A Case Study on Fisher Kernels and Transductive Inference for Biological Entity Recognition , 2002, CoNLL.
[331] Mikhail Belkin,et al. Maximum Margin Semi-Supervised Learning for Structured Variables , 2005, NIPS.
[332] Robert E. Schapire,et al. Efficient distribution-free learning of probabilistic concepts , 1990, 31st Annual Symposium on Foundations of Computer Science.
[333] Nello Cristianini,et al. Efficiently Learning the Metric with Side-Information , 2003, ALT.
[334] Dean P. Foster,et al. The risk inflation criterion for multiple regression , 1994 .
[335] Wray L. Buntine. Operations for Learning with Graphical Models , 1994, J. Artif. Intell. Res..
[336] N. E. Day. Estimating the components of a mixture of normal distributions , 1969 .
[337] Nello Cristianini,et al. Spectral Kernel Methods for Clustering , 2001, NIPS.
[338] O. Mangasarian,et al. Semi-Supervised Support Vector Machines for Unlabeled Data Classification , 2001 .
[339] Stephen P. Boyd,et al. The Fastest Mixing Markov Process on a Graph and a Connection to a Maximum Variance Unfolding Problem , 2006, SIAM Rev..
[340] N. Linial,et al. ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space , 1999, Proteins.
[341] Yoram Singer,et al. Log-Linear Models for Label Ranking , 2003, NIPS.
[342] Christopher K. I. Williams. On a Connection between Kernel PCA and Metric Multidimensional Scaling , 2004, Machine Learning.
[343] Tony Jebara,et al. Probability Product Kernels , 2004, J. Mach. Learn. Res..