Towards Miss Universe automatic prediction: The evening gown competition

Can we predict the winner of Miss Universe after watching how they stride down the catwalk during the evening gown competition? Fashion gurus say they can! In our work, we study this question from the perspective of computer vision. In particular, we want to understand whether existing computer vision approaches can be used to automatically extract the qualities exhibited by the Miss Universe winners during their catwalk. This study can pave the way towards new vision-based applications for the fashion industry. To this end, we propose a novel video dataset, called the Miss Universe dataset, comprising 10 years of the evening gown competition selected between 1996–2010. We further propose two ranking-related problems: (1) Miss Universe Listwise Ranking and (2) Miss Universe Pairwise Ranking. In addition, we also develop an approach that simultaneously addresses the two proposed problems. To describe the videos we employ the recently proposed Stacked Fisher Vectors in conjunction with robust local spatio-temporal features. From our evaluation we found that although the addressed problems are extremely challenging, the proposed system is able to rank the winner in the top 3 best predicted scores for 5 out of 10 Miss Universe competitions.

[1]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[2]  Cordelia Schmid,et al.  P-CNN: Pose-Based CNN Features for Action Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[3]  Elsayed E. Hemayed,et al.  Human action recognition using trajectory-based representation , 2015 .

[4]  Nan Sang,et al.  Human action recognition by fast dense trajectories , 2013, MM '13.

[5]  Nassir Navab,et al.  Extended Co-occurrence HOG with Dense Trajectories for Fine-Grained Activity Recognition , 2014, ACCV.

[6]  Tao Qin,et al.  LETOR: A benchmark collection for research on learning to rank for information retrieval , 2010, Information Retrieval.

[7]  S. Sathiya Keerthi,et al.  Efficient algorithms for ranking with SVMs , 2010, Information Retrieval.

[8]  Thorsten Joachims,et al.  Optimizing search engines using clickthrough data , 2002, KDD.

[9]  Chih-Jen Lin,et al.  Large-Scale Linear RankSVM , 2014, Neural Computation.

[10]  Tal Hassner,et al.  Motion Interchange Patterns for Action Recognition in Unconstrained Videos , 2012, ECCV.

[11]  Thomas Mensink,et al.  Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.

[12]  Tie-Yan Liu,et al.  Learning to rank: from pairwise approach to listwise approach , 2007, ICML '07.

[13]  Bernt Schiele,et al.  A database for fine grained activity detection of cooking activities , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[15]  Yu Qiao,et al.  Action Recognition with Stacked Fisher Vectors , 2014, ECCV.

[16]  Wassim M. Haddad,et al.  Gait Assessment for Multiple Sclerosis Patients Using Microsoft Kinect , 2015, ArXiv.

[17]  Bernt Schiele,et al.  Fine-Grained Activity Recognition with Holistic and Pose Based Features , 2014, GCPR.

[18]  Cordelia Schmid,et al.  Action Recognition with Improved Trajectories , 2013, 2013 IEEE International Conference on Computer Vision.

[19]  Mubarak Shah,et al.  Human Action Recognition in Videos Using Kinematic Features and Multiple Instance Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Jean Meunier,et al.  Gait Analysis from Video: Camcorders vs. Kinect , 2014, ICIAR.

[21]  Thorsten Joachims,et al.  Training linear SVMs in linear time , 2006, KDD '06.

[22]  Janusz Konrad,et al.  Action Recognition From Video Using Feature Covariance Matrices , 2013, IEEE Transactions on Image Processing.

[23]  Tie-Yan Liu,et al.  Ranking Measures and Loss Functions in Learning to Rank , 2009, NIPS.

[24]  Brian C. Lovell,et al.  Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients , 2014, MLSDA'14.

[25]  James M. Keller,et al.  Toward a Passive Low-Cost In-Home Gait Assessment System for Older Adults , 2013, IEEE Journal of Biomedical and Health Informatics.

[26]  Nuria Oliver,et al.  Data Mining Methods for Recommender Systems , 2015, Recommender Systems Handbook.

[27]  Xavier Amatriain,et al.  Data Mining Methods for Recommender Systems , 2011, Recommender Systems Handbook.

[28]  Conrad Sanderson,et al.  Armadillo: a template-based C++ library for linear algebra , 2016, J. Open Source Softw..

[29]  Alexander C. Berg,et al.  Hipster Wars: Discovering Elements of Fashion Styles , 2014, ECCV.

[30]  Brian C. Lovell,et al.  Joint Recognition and Segmentation of Actions via Probabilistic Integration of Spatio-Temporal Fisher Vectors , 2016, PAKDD Workshops.

[31]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[32]  Antonio Torralba,et al.  Assessing the Quality of Actions , 2014, ECCV.