Least median of squares regression and minimum volume ellipsoid estimator for outliers detection in housing appraisal

In the real estate sector, regression analysis is the most widely used method for interpretative and predictive purposes. However, outliers in the estimation sample can lead to ordinary least squares regression models that do not represent the market phenomenon under investigation, and consequently to unreliable appraisals. This research discusses the identification and removal of outliers. The outliers identified by least median of squares (LMS) regression and by the minimum volume ellipsoid (MVE) estimator are compared to test whether the two methods flag the same observations or different ones. A complete diagnosis of the initial estimation sample is then carried out by combining the robust residuals obtained with LMS and the robust distances obtained with MVE. The data are classified into regular observations, vertical outliers, good leverage points and bad leverage points, and the cases to delete and those to keep in the sample are identified.
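The four-way classification described above can be illustrated with a minimal sketch. It assumes the standardized LMS residuals and the MVE robust distances have already been computed elsewhere. The function name, the residual cutoff of 2.5, and the distance cutoff passed by the caller (commonly the square root of the 0.975 chi-square quantile with as many degrees of freedom as there are explanatory variables) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def classify_observations(std_residuals, robust_distances,
                          distance_cutoff, residual_cutoff=2.5):
    """Label each observation from its robust residual and robust distance.

    Hypothetical helper: the cutoffs are conventional choices assumed for
    illustration, not thresholds reported in the paper.
    """
    r = np.abs(np.asarray(std_residuals, dtype=float))
    d = np.asarray(robust_distances, dtype=float)

    large_residual = r > residual_cutoff   # far from the fitted regression
    large_distance = d > distance_cutoff   # far from the bulk of the regressors

    labels = np.full(r.shape, "regular", dtype=object)
    labels[large_residual & ~large_distance] = "vertical outlier"
    labels[~large_residual & large_distance] = "good leverage"
    labels[large_residual & large_distance] = "bad leverage"
    return labels

# Toy values for four observations; 2.716 is roughly sqrt(chi2_{2, 0.975}),
# i.e. the assumed cutoff for two explanatory variables.
labels = classify_observations(
    std_residuals=[0.3, 3.1, -0.8, -4.2],
    robust_distances=[1.0, 1.2, 3.5, 4.0],
    distance_cutoff=2.716,
)
print(list(labels))
```

Under this scheme a good leverage point is unusual in its characteristics but still consistent with the fitted price relationship, whereas a bad leverage point is unusual in both respects.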
