What Image Features Boost Housing Market Predictions?

The attractiveness of a property is one of the most interesting, yet challenging, categories to model. Image characteristics can describe certain attributes of a property and make it possible to examine the influence of visual factors on the price or timeframe of the listing. In this paper, we propose a set of techniques for extracting visual features in a numerical form that modern predictive algorithms can use efficiently. We discuss Shannon's entropy, center-of-gravity calculation, image segmentation, and Convolutional Neural Networks. After comparing these techniques as applied to a set of property-related images (indoor, outdoor, and satellite), we conclude the following: (i) entropy is the most efficient single-digit visual measure for housing price prediction; (ii) image segmentation is the most important visual feature for the prediction of housing lifespan; and (iii) deep image features can be used to quantify interior characteristics and contribute to captivation modeling. The set of 40 image features selected here carries a significant amount of predictive power and outperforms some of the strongest metadata predictors. The techniques presented in this paper are not meant to replace a human expert in the real-estate appraisal process; rather, we conclude that they can efficiently describe visible characteristics, thus introducing perceived attractiveness as a quantitative measure into the predictive modeling of housing.
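As a rough illustration of the two simplest measures named above, the sketch below computes a histogram-based Shannon entropy and an intensity-weighted center of gravity for a grayscale image. It assumes an 8-bit single-channel NumPy array as input; the function names, the normalization of the centroid to [0, 1], and the use of the intensity histogram are illustrative choices, not necessarily the definitions used in the paper.

```python
import numpy as np


def shannon_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (in bits) of the intensity histogram of an 8-bit image."""
    # Count how often each of the 256 possible intensity values occurs.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing (0 * log 0 is taken as 0)
    return float(-(p * np.log2(p)).sum())


def center_of_gravity(gray: np.ndarray) -> tuple[float, float]:
    """Intensity-weighted centroid as (row, col), each normalized to [0, 1]."""
    g = gray.astype(float)
    h, w = g.shape
    total = g.sum()
    if total == 0:
        # Assumption: an all-black image is treated as centered.
        return 0.5, 0.5
    row = (g.sum(axis=1) * np.arange(h)).sum() / total
    col = (g.sum(axis=0) * np.arange(w)).sum() / total
    return row / max(h - 1, 1), col / max(w - 1, 1)


if __name__ == "__main__":
    # Hypothetical stand-in for a listing photo: random 8-bit noise.
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    print(shannon_entropy(image), center_of_gravity(image))
```

Each function returns a small, fixed number of values per image, which is what makes such measures easy to append to tabular listing metadata before fitting a predictive model.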
