Adversarial Sequence Tagging

Producing sequence taggings that minimize Hamming loss is a challenging but important task. Directly minimizing this loss over a training sample is generally an NP-hard problem. Instead, existing sequence tagging methods minimize a convex surrogate that upper bounds the Hamming loss. Unfortunately, this often leads either to inconsistent predictors (e.g., max-margin methods) or to predictions that are mismatched with the Hamming loss (e.g., conditional random fields). We present adversarial sequence tagging, a consistent structured prediction framework for minimizing Hamming loss by pessimistically viewing uncertainty. Our approach pessimistically approximates the training data, yielding an adversarial game between the sequence tag predictor and the sequence labeler. We demonstrate the benefits of the approach on activity recognition and information extraction/segmentation tasks.
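
To make the notion of a prediction being mismatched with the Hamming loss concrete, the following is a minimal sketch, not the method of the paper. It assumes a hypothetical toy distribution over length-2 tag sequences (the names `toy_distribution`, `hamming_loss`, and `expected_hamming_loss` are illustrative only) and shows that the single most probable sequence need not be the one with the lowest expected Hamming loss.

```python
from itertools import product


def hamming_loss(predicted, actual):
    """Count the positions at which two equal-length tag sequences differ."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    return sum(p != a for p, a in zip(predicted, actual))


def expected_hamming_loss(predicted, distribution):
    """Expected Hamming loss of a prediction under a distribution over sequences."""
    return sum(prob * hamming_loss(predicted, seq)
               for seq, prob in distribution.items())


# Hypothetical toy distribution over tag sequences of length 2 (an assumption
# made purely for illustration; it does not come from the paper).
toy_distribution = {
    ("A", "A"): 0.4,
    ("B", "A"): 0.3,
    ("B", "B"): 0.3,
}

labels = ("A", "B")
candidates = list(product(labels, repeat=2))

# Most probable whole sequence (what joint MAP decoding would return).
joint_mode = max(toy_distribution, key=toy_distribution.get)

# Sequence minimizing expected Hamming loss, found by brute force.
hamming_optimal = min(
    candidates, key=lambda s: expected_hamming_loss(s, toy_distribution)
)

print("joint mode:", joint_mode,
      "expected loss:", expected_hamming_loss(joint_mode, toy_distribution))
print("Hamming-optimal:", hamming_optimal,
      "expected loss:", expected_hamming_loss(hamming_optimal, toy_distribution))
```

In this toy case the most probable sequence is (A, A), yet (B, A) has a lower expected Hamming loss because it matches the most likely tag at each position separately. The brute-force search is feasible only for tiny examples like this one.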
