Learning to Appreciate the Aesthetic Effects of Clothing

How do people describe clothing? Words such as "formal" or "casual" are usually used. However, recent work mostly focuses on accurately recognizing or extracting visual features (e.g., sleeve length, color distribution, and clothing pattern) from clothing images. How can we bridge the gap between visual features and aesthetic words? In this paper, we formulate this task as a novel three-level framework: visual features (VF) - image-scale space (ISS) - aesthetic words space (AWS). Using the image-scale space from the art field as an intermediate layer, we first propose a Stacked Denoising Autoencoder Guided by Correlative Labels (SDAEGCL) to map the visual features to the image-scale space. Then, according to the semantic distances computed by WordNet::Similarity, we map the aesthetic words most often used in online clothing shops to the image-scale space as well. In experiments on upper-body menswear images downloaded from several global online clothing shops, the proposed three-level framework captures the subtle relationship between visual features and aesthetic words better than several baselines. To demonstrate that the framework and its implementation methods are broadly applicable, we finally present some analyses of menswear fashion trends over the last 10 years.
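
The last step of the framework, reading aesthetic words off a point in the image-scale space, can be sketched as a nearest-neighbour lookup. This is a minimal illustration and not the paper's implementation: it assumes a two-dimensional image-scale space and Euclidean distance, and the word coordinates and the helper `nearest_words` are hypothetical, invented only for this example.

```python
import math

# Hypothetical 2-D image-scale coordinates for a few aesthetic words.
# The values are made up for illustration and are not taken from the paper.
WORD_COORDS = {
    "formal": (0.6, 0.7),
    "casual": (-0.4, -0.5),
    "sporty": (-0.7, 0.3),
}


def nearest_words(point, word_coords, k=1):
    """Return the k words whose coordinates lie closest to `point`."""
    ranked = sorted(word_coords, key=lambda w: math.dist(point, word_coords[w]))
    return ranked[:k]


# `predicted` stands in for the image-scale coordinate that a trained
# visual-feature model would output for one clothing image.
predicted = (0.5, 0.6)
print(nearest_words(predicted, WORD_COORDS, k=2))
```

In this sketch the visual-feature model and the word placement are decoupled: any regressor that outputs a coordinate in the shared space can reuse the same lookup.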
