LMS and Backpropagation are Minimax Filters

An important problem that arises in many applications is the following adaptive problem: given a sequence of n × 1 input column vectors \(\{h_i\}\) and a corresponding sequence of desired scalar responses \(\{d_i\}\), find an estimate of an n × 1 column vector of weights w such that the sum of squared errors, \(\sum_{i=0}^{N} |d_i - h_i^T w|^2\), is minimized. The pairs \(\{h_i, d_i\}\) are most often presented sequentially, and one is therefore required to find an adaptive scheme that recursively updates the estimate of w. The least-mean-squares (LMS) algorithm was originally conceived as an approximate solution to this adaptive problem. It recursively updates the estimate of the weight vector along the direction of the instantaneous gradient of the sum of squared errors [1]. The introduction of the LMS adaptive filter in 1960 was a significant development for a broad range of engineering applications, since the LMS adaptive linear-estimation procedure requires essentially no advance knowledge of the signal statistics. LMS, however, has long been thought to be only an approximate minimizing solution to the above squared-error criterion, and a rigorous minimization criterion has been missing.
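As an illustration of the recursive update described above, the following is a minimal sketch of the standard LMS recursion, \(w_i = w_{i-1} + \mu\, h_i (d_i - h_i^T w_{i-1})\), which steps against the instantaneous gradient of \(|d_i - h_i^T w|^2\). The step size `mu`, the zero initial estimate, the function name `lms`, and the noise-free toy data are assumptions made for this example and are not taken from the text.

```python
def lms(inputs, desired, mu, w0=None):
    """Run the LMS recursion over the pairs (h_i, d_i); return the final estimate of w.

    inputs  : list of n-element lists (the vectors h_i)
    desired : list of scalars (the responses d_i)
    mu      : step size (assumed constant and small enough for stability)
    w0      : optional initial estimate; defaults to the zero vector (an assumption)
    """
    n = len(inputs[0])
    w = list(w0) if w0 is not None else [0.0] * n
    for h, d in zip(inputs, desired):
        # A priori error computed with the current estimate: e_i = d_i - h_i^T w
        e = d - sum(hj * wj for hj, wj in zip(h, w))
        # Move against the instantaneous gradient of |d_i - h_i^T w|^2
        w = [wj + mu * hj * e for wj, hj in zip(w, h)]
    return w


# Toy usage with hypothetical, noise-free data generated from a known weight vector.
true_w = [2.0, -1.0]
patterns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
inputs = patterns * 200
desired = [sum(hj * wj for hj, wj in zip(h, true_w)) for h in inputs]
w_hat = lms(inputs, desired, mu=0.1)
```

Each pair is processed once, in order, and only the current estimate is stored, which is what makes the scheme suitable for sequentially presented data. No signal statistics enter the update; only the step size has to be chosen.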