Structured machine learning problems in natural language processing
Michael Collins, MIT CSAIL/EECS

Many problems in natural language processing involve mapping strings to structured objects such as parse trees, underlying state sequences, or segmentations. This leads to an interesting class of learning problems: how to induce classification functions where the output "labels" have meaningful internal structure, and where the number of possible labels may grow exponentially with the size of the input strings. Probabilistic grammars (for example, hidden Markov models or probabilistic context-free grammars) are one common approach to this type of problem. In this talk I will describe recent work on alternatives to HMMs and PCFGs, based on generalizations of binary classification algorithms such as boosting, the perceptron algorithm, or large-margin (SVM) methods. (A schematic sketch of this perceptron generalization appears below, following the next abstract.)

Statistical Models for Social Networks
Mark Handcock, University of Washington

This talk is an overview of social network analysis from the perspective of a statistician. The main focus is on the conceptual and methodological contributions of the social network community going back over eighty years. The field is, and has been, broadly multidisciplinary, with significant contributions from the social, natural and mathematical sciences. This has led to a plethora of terminology and network conceptualizations commensurate with the varied objectives of network analysis. Because a primary focus of the social sciences has been the representation of social relations with the objective of understanding social structure, social scientists have been central to this development. We review statistical exponential family models that recognize the complex dependencies within relational data structures. We consider three issues: the specification of realistic models, the algorithmic difficulties of the inferential methods, and the assessment of the degree to which the graph structure produced by the models matches that of the data. Insight can be gained by considering model degeneracy and inferential degeneracy for commonly used estimators.
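The exponential family models Handcock reviews are commonly written in the following form; the notation is the standard one for exponential-family random graph models and is not taken from the abstract itself:

\[
P_\theta(Y = y) \;=\; \frac{\exp\{\theta^\top g(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) \;=\; \sum_{y'} \exp\{\theta^\top g(y')\},
\]

where y is the observed graph, g(y) is a vector of network statistics (edge counts, triangle counts, and the like), \theta is a vector of natural parameters, and \kappa(\theta) normalizes over all graphs on the same node set. The model degeneracy mentioned at the end of the abstract refers to parameter settings for which this distribution places almost all of its mass on a few extreme graphs, such as the empty or complete graph.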
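Collins's abstract above describes generalizing binary classifiers such as the perceptron to structured outputs. A minimal sketch of the structured perceptron follows, assuming a caller-supplied feature map phi(x, y) and a decoder returning the highest-scoring structure; both names are hypothetical placeholders, standing in for problem-specific components such as Viterbi decoding for tagging or CKY for parsing.

import numpy as np

def structured_perceptron(train, phi, decode, dim, epochs=5):
    # train  : list of (x, y_gold) pairs, y_gold a structure (e.g. a tag sequence)
    # phi    : assumed feature map, phi(x, y) -> np.ndarray of length dim
    # decode : assumed decoder, decode(x, w) -> argmax over y of w . phi(x, y)
    w = np.zeros(dim)
    for _ in range(epochs):
        for x, y_gold in train:
            y_hat = decode(x, w)              # best structure under current weights
            if y_hat != y_gold:               # mistake-driven additive update
                w += phi(x, y_gold) - phi(x, y_hat)
    return w

The update is the familiar binary perceptron update with an argmax over structures standing in for the sign of a score, which is why efficient decoding (dynamic programming over the structure) is the key requirement.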
Probabilistic Entity-Relationship Models, PRMs, and Plate Models
David Heckerman, Microsoft Research

We introduce a graphical language for relational data called the probabilistic entity-relationship (PER) model. The model is an extension of the entity-relationship model, a common model for the abstract representation of database structure. We concentrate on the directed version of this model, the directed acyclic probabilistic entity-relationship (DAPER) model. The DAPER model is closely related to the plate model and the probabilistic relational model (PRM), existing models for relational data. The DAPER model is more expressive than either existing model, and also helps to demonstrate their similarity. In addition to describing the new language, we discuss important facets of modeling relational data, including the use of restricted relationships, self relationships, and probabilistic relationships. This is joint work with Christopher Meek and Daphne Koller.

Pictorial Structure Models for Visual Recognition
Dan Huttenlocher, Cornell University

There has been considerable recent work in object recognition on representations that combine both local visual appearance and global spatial constraints. Several such approaches are based on statistical characterizations of the spatial relations between local image patches. In this talk I will give an overview of one such approach, called pictorial structures, which uses spatial relations between pairs of parts. I will focus on the recent development of highly efficient techniques both for learning certain forms of pictorial structure models from examples and for detecting objects using these models. (The standard pairwise energy behind such models is written out after the abstracts below.)

Relations, generalizations and the reference-class problem: A logic programming / Bayesian perspective
David Poole, Dept. of Computer Science, University of British Columbia

Logic programs provide a rich language to specify the interdependence between relations. There has been much success with inductive logic programming finding relationships from data, and there has also been considerable success with Bayesian learning. However, there is a large conceptual gap: inductive logic programming does not have any statistics. This talk will explore how to get statistics from data, a problem known as the reference-class problem, and the combination of logic programming and hierarchical Bayesian models as a solution to it. This is joint work with Michael Chiang.

Feature Definition and Discovery in Probabilistic Relational Models
Eric Altendorf (eric@cleverset.com) and Bruce D’Ambrosio (dambrosi@cleverset.com), CleverSet, Inc., 673 Jackson Avenue, Corvallis OR 97330
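For reference, the pairwise pictorial structure models in Huttenlocher's abstract are usually formulated as an energy over part locations; the notation below is the standard one for this family of models rather than anything specific to the talk:

\[
L^* \;=\; \arg\min_{L=(l_1,\dots,l_n)} \left( \sum_{i=1}^{n} m_i(l_i) \;+\; \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \right),
\]

where l_i is the image location of part i, m_i(l_i) is the cost of placing part i at l_i given its local appearance, and d_ij penalizes deviation of the pair (l_i, l_j) from its preferred spatial relation. When the set of modeled part pairs E forms a tree, the minimization decomposes and can be carried out by dynamic programming, which underlies the efficient detection techniques the abstract refers to.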