Construction of Training Data for Price Prediction of a Real Estate from Internet Ads

The paper presents a model for constructing a data set for predicting the price of real estate (houses and flats) from standard Internet ads. In addition to the standard features that appear in an ad (area, number of bedrooms, etc.), the price prediction model takes into account the attractiveness of the property's location and the presence of additional interior facilities (e.g., refrigerator, dishwasher, stove). The proposed training set construction model uses OpenStreetMap's Overpass API to determine the attractiveness of a property's location, and a convolutional neural network to detect interior facilities in real estate photos.
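
As a rough illustration of how the three feature sources described above might be merged into one training row, the following minimal sketch builds an Overpass QL query for amenities around a property and combines the counts with ad features and detected facilities. It is not the paper's implementation: the amenity list, the facility label set, the 500 m radius, the field names (`area_m2`, `bedrooms`, `price`), and all function names are assumptions made for illustration. The HTTP call to an Overpass endpoint and the convolutional neural network itself are left out; the sketch only assumes that the former yields the `elements` list of an Overpass JSON response and the latter a set of detected labels.

```python
from collections import Counter

# Hypothetical choices, not taken from the paper.
AMENITIES = ("school", "pharmacy", "restaurant")
FACILITIES = ("refrigerator", "dishwasher", "stove")


def build_overpass_query(lat, lon, radius_m=500, amenities=AMENITIES):
    """Return an Overpass QL query for amenity nodes within radius_m of (lat, lon)."""
    clauses = "\n".join(
        f'  node["amenity"="{a}"](around:{radius_m},{lat},{lon});'
        for a in amenities
    )
    return f"[out:json][timeout:25];\n(\n{clauses}\n);\nout body;"


def count_amenities(elements, amenities=AMENITIES):
    """Count elements of an Overpass JSON response by their 'amenity' tag."""
    counts = Counter(e.get("tags", {}).get("amenity") for e in elements)
    return {a: counts.get(a, 0) for a in amenities}


def build_training_row(ad, elements, detected_labels):
    """Merge ad features, nearby-amenity counts, and detected facilities."""
    row = {"area_m2": ad["area_m2"], "bedrooms": ad["bedrooms"]}
    for amenity, n in count_amenities(elements).items():
        row[f"nearby_{amenity}"] = n
    for facility in FACILITIES:
        # 1 if the (assumed) photo classifier reported this facility, else 0.
        row[f"has_{facility}"] = int(facility in detected_labels)
    row["price"] = ad["price"]  # prediction target
    return row
```

In this sketch location attractiveness is approximated by raw amenity counts; a weighted score or distance-based measure would fit the same row structure.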