Agent selection for regression on attribute distributed data

This paper introduces a modeling framework for multivariate regression in which agents observe attribute-distributed data and are coordinated by a fusion center. Under this model, a prototype algorithm resembling L2 boosting can effectively minimize the training error, but it suffers from overtraining and slow convergence. A thorough comparison among the agents can speed up convergence of the training error and eliminate irrelevant variables, but it places a high demand on data transmission. This paper proposes an intelligent agent selection algorithm, based on heuristic functions, that speeds up convergence at a low data-transmission cost. The new algorithm yields an ensemble estimator with lower generalization error while requiring less communication, as verified by simulations on artificial and real data sets.
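To make the trade-off between convergence speed and data transmission concrete, the following is a minimal sketch of the two baselines described above: an L2-boosting-style loop in which agents take turns fitting the current residual, and a variant in which every agent fits the residual and the fusion center keeps the best fit. It is an illustration under stated assumptions, not the proposed method. The heuristic selection functions are not specified here and are not reproduced. The names `fit_line`, `agent_fit`, and `boost`, the single-attribute linear weak learner, the shrinkage factor, and the rule of counting one transmission per agent fit reported to the fusion center are all hypothetical choices made for this example.

```python
import random


def fit_line(x, r):
    """Least-squares fit r ~ a + b*x on one attribute; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    mr = sum(r) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0.0:
        return mr, 0.0  # constant attribute: fall back to the mean
    b = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r)) / sxx
    return mr - b * mx, b


def agent_fit(columns, residual):
    """An agent fits the residual using the best of its own attributes.

    Returns (predictions, squared_error). Assumed weak learner: a linear
    fit on a single locally observed attribute.
    """
    best = None
    for x in columns:
        a, b = fit_line(x, residual)
        pred = [a + b * xi for xi in x]
        err = sum((ri - pi) ** 2 for ri, pi in zip(residual, pred))
        if best is None or err < best[1]:
            best = (pred, err)
    return best


def boost(agents, y, rounds, select="round_robin", shrinkage=0.5):
    """Residual-refitting ensemble coordinated by a fusion center.

    agents: list of agents, each a list of attribute columns.
    select: "round_robin" (one agent per round) or "exhaustive"
            (all agents fit, the center keeps the best).
    Returns (training MSE per round, number of agent-to-center messages).
    """
    n = len(y)
    estimate = [0.0] * n
    history = []
    transmissions = 0  # assumption: one message per reported agent fit
    for t in range(rounds):
        residual = [yi - fi for yi, fi in zip(y, estimate)]
        if select == "round_robin":
            pred, _ = agent_fit(agents[t % len(agents)], residual)
            transmissions += 1
        else:
            fits = [agent_fit(cols, residual) for cols in agents]
            pred, _ = min(fits, key=lambda f: f[1])
            transmissions += len(agents)
        estimate = [fi + shrinkage * pi for fi, pi in zip(estimate, pred)]
        history.append(sum((yi - fi) ** 2 for yi, fi in zip(y, estimate)) / n)
    return history, transmissions


# Synthetic example: three agents, one attribute each; the third is irrelevant.
rng = random.Random(0)
n = 200
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * a - 1.0 * b + 0.1 * rng.gauss(0, 1) for a, b in zip(x0, x1)]
agents = [[x0], [x1], [x2]]

hist_rr, tx_rr = boost(agents, y, rounds=30, select="round_robin")
hist_ex, tx_ex = boost(agents, y, rounds=30, select="exhaustive")
print("round robin:", hist_rr[-1], "training MSE,", tx_rr, "transmissions")
print("exhaustive: ", hist_ex[-1], "training MSE,", tx_ex, "transmissions")
```

In this sketch the exhaustive variant sends as many messages per round as there are agents, whereas the round-robin variant sends one but may spend rounds on an agent whose attributes are irrelevant. An agent selection rule aims to sit between these two extremes.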