Learning representations for improved target identification, scene classification, and information fusion

Object representation is fundamental to Automated Target Recognition (ATR). Many ATR approaches choose a fixed basis, such as a wavelet or Fourier basis, to represent the target. Recent advances in image and signal processing have shown that object recognition can be improved if, rather than assuming a basis, a database of training examples is used to learn a representation. We discuss learning representations using nonparametric Bayesian topic models and demonstrate how to integrate information from other sources to improve ATR. We apply the method to the integration of electro-optical (EO) and infrared (IR) information for vehicle target identification and show that the learned representation of the joint EO and IR information improves target identification by 4%. Furthermore, we demonstrate that text and imagery data can be integrated to direct the representation toward mission-specific tasks, improving performance by 8%. Finally, we illustrate how integrating graphical models into representation learning improves performance by 2%.
