A similarity-based approach to prediction

Abstract Assume we are asked to predict a real-valued variable $y_t$ based on certain characteristics $x_t = (x_t^1, \ldots, x_t^d)$ and on a database consisting of $(x_i^1, \ldots, x_i^d, y_i)$ for $i = 1, \ldots, n$. Analogical reasoning suggests combining past observations of $x$ and $y$ with the current values of $x$ to generate an assessment of $y$ by similarity-weighted averaging. Specifically, the predicted value of $y$, denoted $y_t^s$, is the weighted average of all previously observed values $y_i$, where the weight of $y_i$, for every $i = 1, \ldots, n$, is the similarity between the vector $(x_t^1, \ldots, x_t^d)$ associated with $y_t$ and the previously observed vector $(x_i^1, \ldots, x_i^d)$. The "empirical similarity" approach proposes estimating the similarity function from past data. We discuss this approach as a statistical method of prediction, study its relationship to the statistical literature, and extend it to the estimation of probabilities and of density functions.
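The similarity-weighted average described in the abstract can be illustrated with a small sketch. The abstract does not fix a functional form for the similarity, so the exponential form `exp_similarity`, the weight vector `weights`, the toy `database`, and the leave-one-out least-squares criterion used to pick the weights are all assumptions made for illustration. In particular, the grid search is a simple stand-in for "estimating the similarity function from past data" and is not claimed to be the authors' estimator.

```python
import math

def exp_similarity(x, z, weights):
    """Hypothetical similarity: exp(-weighted squared distance). An assumption."""
    dist = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))
    return math.exp(-dist)

def similarity_weighted_average(x_t, database, weights):
    """Predict y_t as the similarity-weighted average of the observed y_i.

    database is a list of (x_i, y_i) pairs, each x_i a sequence of d numbers.
    """
    sims = [exp_similarity(x_i, x_t, weights) for x_i, _ in database]
    total = sum(sims)
    if total == 0.0:
        raise ValueError("all similarities underflowed to zero")
    return sum(s * y_i for s, (_, y_i) in zip(sims, database)) / total

def loo_squared_error(database, weights):
    """Sum of squared leave-one-out prediction errors for given weights."""
    error = 0.0
    for i, (x_i, y_i) in enumerate(database):
        others = database[:i] + database[i + 1:]
        error += (y_i - similarity_weighted_average(x_i, others, weights)) ** 2
    return error

# Toy data (made up for illustration): two characteristics, one outcome.
database = [((1.0, 2.0), 10.0), ((2.0, 1.0), 20.0), ((5.0, 5.0), 50.0)]

# Prediction with fixed, assumed weights.
prediction = similarity_weighted_average((1.5, 1.5), database, (1.0, 1.0))

# Crude stand-in for estimating the similarity from past data:
# choose the common weight on a small grid that minimizes the LOO error.
grid = [0.01, 0.1, 0.5, 1.0]
best_w = min(grid, key=lambda w: loo_squared_error(database, (w, w)))
```

Here the query point is equally similar to the first two observations and far from the third, so the prediction lies essentially midway between the first two outcomes. Because the weights are normalized similarities, the prediction always stays within the range of the observed `y_i`.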
