Ensemble Learning Based Rental Apartment Price Prediction Model by Categorical Features Factoring

Apartment rental prices are influenced by a variety of factors. The aim of this study is to analyze the features of an apartment and to predict its rental price based on multiple factors. To reach this goal, an ensemble learning based prediction model is developed. We use a dataset from bProperty.com that includes the rental price and various features of apartments in the city of Dhaka, Bangladesh. The results report the prediction accuracy for apartment rent and also indicate how the different kinds of categorical values affect the machine learning models. A further purpose of the study is to identify the factors that most influence apartment rental prices in Dhaka. To support the prediction, we adopt Advanced Regression Techniques (ART) and compare models across different apartment features to establish an acceptable model. The following algorithms are selected as base predictors: advanced linear regression, a neural network, random forest, support vector machine (SVM), and a decision tree regressor. The stacked ensemble is built from the following algorithms: AdaBoost regressor, gradient boosting regressor, and XGBoost. In addition, ridge regression, lasso regression, and elastic net regression are used to combine the advanced regression techniques. Tree-based algorithms build decision trees from categorical 'YES'/'NO' values; ensemble methods boost the learning and prediction accuracy; the support vector machine extends the model to both classification and regression; and, lastly, advanced linear regression predicts the rental price from the different feature values.
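The stacking setup the abstract describes — several base regressors whose predictions are combined by a regularized linear model, with categorical apartment features one-hot encoded — can be sketched with scikit-learn. This is a minimal illustration, not the authors' actual pipeline: the feature names, synthetic data, and hyperparameters below are placeholder assumptions, and the neural network and XGBoost components are omitted for brevity.

```python
# Hypothetical sketch of a stacking ensemble for rent prediction.
# The data is synthetic; "area" and "neighbourhood" stand in for the
# real bProperty.com features, which are not listed in the abstract.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 200
area = rng.uniform(500, 2500, n)                       # sq. ft.
neighbourhood = rng.choice(["Gulshan", "Mirpur", "Uttara"], n)
base_rent = {"Gulshan": 30000, "Mirpur": 12000, "Uttara": 18000}
y = (area * 15
     + np.vectorize(base_rent.get)(neighbourhood)
     + rng.normal(0, 2000, n))                         # monthly rent (BDT)
df = pd.DataFrame({"area": area, "neighbourhood": neighbourhood})

# One-hot encode the categorical column into indicator (yes/no) columns;
# the numeric column passes through unchanged.
pre = ColumnTransformer(
    [("cat", OneHotEncoder(), ["neighbourhood"])],
    remainder="passthrough",
)

# Base predictors from the abstract, combined by a ridge meta-learner.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("svr", SVR()),
        ("tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("ada", AdaBoostRegressor(random_state=0)),
    ],
    final_estimator=RidgeCV(),
)
model = Pipeline([("pre", pre), ("stack", stack)])
model.fit(df, y)
print(round(model.score(df, y), 3))  # training R^2 of the stacked model
```

`StackingRegressor` trains the meta-learner on out-of-fold predictions of the base models, which is what keeps the combiner from simply memorizing the strongest base predictor's training fit.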
