Learning Multi-Instance Deep Ranking and Regression Network for Visual House Appraisal

This paper presents a weakly supervised regression model for the visual house appraisal problem, which aims to predict the value of a house from its photos and textual descriptions (e.g., number of bedrooms). The key idea of our approach is a multi-layer neural network, called the multi-instance Deep Ranking and Regression (MiDRR) net, which jointly solves two coupled tasks, ranking and regression, in the multiple-instance setting. The network is trained on weakly supervised data, which do not require intensive human annotation. We also design a set of human heuristics that improve the learned deep features by imposing constraints on the solution space, e.g., a house with three bedrooms often has a higher value than one with only two bedrooms. While these constraints are specific to the studied problem, the developed formulation can be easily generalized to other regression applications. For evaluation, we collect a comprehensive house image benchmark that includes 900,000 photos from 30,000 houses recently traded in the USA, and apply the proposed MiDRR net to predict house values. Extensive comparative evaluations demonstrate that adding imagery data and human heuristics can significantly boost system performance, and that the proposed MiDRR net clearly outperforms the alternative methods.
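As a rough illustration of how ranking and regression might be coupled in a multiple-instance setting, the following minimal sketch treats each house as a bag of photo feature vectors and combines three terms: a squared-error regression loss, a pairwise ranking loss on sale prices, and a soft heuristic constraint on bedroom counts. It is not the MiDRR formulation. The linear scorer standing in for the deep network, the mean pooling over photos, the hinge form of the ranking and constraint terms, the weights `alpha` and `beta`, and all function names are assumptions made for illustration only.

```python
def bag_prediction(bag, w):
    """Score each photo with a linear model, then mean-pool over the bag.

    Assumption: a linear scorer stands in for the deep network, and
    mean pooling stands in for whatever aggregation the network uses.
    """
    scores = [sum(x * wi for x, wi in zip(photo, w)) for photo in bag]
    return sum(scores) / len(scores)


def regression_loss(preds, targets):
    """Mean squared error between predicted and observed house values."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def pairwise_hinge(preds, keys, margin=1.0):
    """Average hinge penalty over ordered pairs (i, j) with keys[i] > keys[j].

    A pair is penalized when preds[i] fails to exceed preds[j] by `margin`.
    Returns 0.0 when no ordered pair exists.
    """
    total, count = 0.0, 0
    for i in range(len(preds)):
        for j in range(len(preds)):
            if keys[i] > keys[j]:
                total += max(0.0, margin - (preds[i] - preds[j]))
                count += 1
    return total / count if count else 0.0


def joint_loss(bags, prices, bedrooms, w, alpha=1.0, beta=0.1):
    """Regression + price ranking + bedroom heuristic (hypothetical weighting)."""
    preds = [bag_prediction(bag, w) for bag in bags]
    return (regression_loss(preds, prices)
            + alpha * pairwise_hinge(preds, prices)      # ranking task
            + beta * pairwise_hinge(preds, bedrooms))    # soft heuristic


# Toy usage: two houses, each a bag of one-dimensional photo features.
bags = [[[1.0], [3.0]], [[5.0]]]
prices = [2.0, 5.0]
bedrooms = [2, 3]
print(joint_loss(bags, prices, bedrooms, w=[1.0]))
```

The bedroom term is kept as a soft penalty with a small weight because the heuristic holds "often" rather than always; a hard constraint would wrongly reject valid counterexamples.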