Content-Aware Multi-task Neural Networks for User Gender Inference Based on Social Media Images

Estimating demographic attributes such as the gender and age of social media users from the images they post is a challenging problem because these attributes are not directly shown in the images. Prior approaches to this problem fall roughly into two types: one uses concept detection to detect pre-defined visual concepts, which then serve as metadata for estimating demographic attributes; the other directly uses content features, such as the Fisher Vector [19], extracted from the images. In this paper, we consider how to combine these two approaches. We propose a Multi-task Bilinear Model that integrates the detected concepts with the content features. In the proposed method, the concept detector and the feature extractor can be learned jointly in an end-to-end fashion. We evaluated the proposed method on the task of estimating user gender from Twitter images and found that it outperformed the baseline methods.
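As a rough illustration of how detected concepts and content features might be combined bilinearly under a multi-task objective, the following is a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the parameter names (`W_concept`, `W_feat`, `W_bilinear`), the single linear layers standing in for the concept detector and feature extractor, and the loss weight are all assumptions made for illustration. Only the forward pass and the combined loss are shown; gradient-based end-to-end training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes: image descriptor, number of concepts, content feature dim.
D_IMG, N_CONCEPTS, D_FEAT = 32, 5, 8

# Hypothetical parameters; in a real system both branches would be CNNs.
params = {
    "W_concept": rng.normal(scale=0.1, size=(D_IMG, N_CONCEPTS)),
    "W_feat": rng.normal(scale=0.1, size=(D_IMG, D_FEAT)),
    "W_bilinear": rng.normal(scale=0.1, size=(N_CONCEPTS, D_FEAT)),
    "b": 0.0,
}

def forward(x, params):
    # Concept branch: distribution over pre-defined visual concepts.
    concepts = softmax(x @ params["W_concept"])        # (batch, N_CONCEPTS)
    # Content branch: non-negative content features.
    feats = np.maximum(x @ params["W_feat"], 0.0)      # (batch, D_FEAT)
    # Bilinear interaction: score_i = concepts_i^T W feats_i + b.
    score = np.einsum("bc,cf,bf->b", concepts, params["W_bilinear"], feats)
    return concepts, sigmoid(score + params["b"])

def multitask_loss(concepts, p_gender, concept_labels, gender_labels, weight=0.5):
    # Gender cross-entropy plus a weighted concept cross-entropy (assumed weighting).
    eps = 1e-12
    idx = np.arange(len(concept_labels))
    concept_ce = -np.log(concepts[idx, concept_labels] + eps).mean()
    gender_ce = -(gender_labels * np.log(p_gender + eps)
                  + (1 - gender_labels) * np.log(1 - p_gender + eps)).mean()
    return gender_ce + weight * concept_ce

# Toy usage with random inputs and labels.
x = rng.normal(size=(4, D_IMG))
concept_labels = rng.integers(0, N_CONCEPTS, size=4)
gender_labels = rng.integers(0, 2, size=4).astype(float)
concepts, p_gender = forward(x, params)
loss = multitask_loss(concepts, p_gender, concept_labels, gender_labels)
```

Because both branches feed a single loss, a gradient step on that loss would update the concept detector and the feature extractor together, which is the sense of joint learning illustrated here.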

[1] Ahmed M. Elgammal, et al. Convolutional Models for Joint Object Categorization and Pose Estimation, 2015, ArXiv.

[2] Markus Koch, et al. Linking visual concept detection with viewer demographics, 2012, ICMR '12.

[3] Benjamin Van Durme, et al. Using Conceptual Class Attributes to Characterize Social Media Users, 2013, ACL.

[4] John D. Burger, et al. Discriminating Gender on Twitter, 2011, EMNLP.

[5] Zhengyou Zhang, et al. Improving multiview face detection with multi-task deep convolutional neural networks, 2014, IEEE Winter Conference on Applications of Computer Vision.

[6] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[7] Joshua B. Tenenbaum, et al. Separating Style and Content with Bilinear Models, 2000, Neural Computation.

[8] Faiyaz Al Zamal, et al. Using Social Media to Infer Gender Composition of Commuter Populations, 2012, Proceedings of the International AAAI Conference on Web and Social Media.

[9] Teruo Higashino, et al. Twitter user profiling based on text and community mining for market analysis, 2013, Knowl. Based Syst.

[10] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[11] Kohei Yamamoto, et al. Content-Based Viewer Estimation Using Image Features for Recommendation of Video Clips, 2014.

[12] Amaia Salvador, et al. Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction, 2015, ASM@ACM Multimedia.

[13] Xiaojun Ma, et al. Gender estimation for SNS user profiling using automatic image annotation, 2014, 2014 IEEE International Conference on Multimedia and Expo Workshops (ICMEW).

[14] Subhransu Maji, et al. Bilinear CNN Models for Fine-Grained Visual Recognition, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[15] Thomas Mensink, et al. Improving the Fisher Kernel for Large-Scale Image Classification, 2010, ECCV.

[16] Trevor Darrell, et al. Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets, 2015, ArXiv.

[17] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.

[18] Antoni B. Chan, et al. Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[19] Tomoki Taniguchi, et al. A Weighted Combination of Text and Image Classifiers for User Gender Inference, 2015, VL@EMNLP.