Kernel Regularization and Dimension Reduction

It is often possible to use expert knowledge or other sources of information to obtain dissimilarity measures for pairs of objects, which serve as pseudo-distances between the objects. When dissimilarity information is available as the data, there are two different types of problems of interest. The first is to estimate a full position configuration for all objects in a low dimensional space while respecting the dissimilarity information. This is usually done for the purposes of visualizing the data and/or conducting further statistical analysis, such as clustering or classification. Multidimensional Scaling (MDS), which is still an active research area, has traditionally been used to tackle this problem. In the second type of problem, the high dimensional data points are assumed to lie on a low dimensional manifold, and the goal is to unfold the manifold in order to recover the underlying intrinsic low dimensional structure. We provide a novel, unified framework called Kernel Regularization to optimally solve both types of problems. Advanced optimization techniques are utilized to obtain the global solutions accurately and efficiently. The proposed method can naturally accommodate dissimilarity information with possibly crude, noisy, incomplete, inconsistent and weighted observations. Various favorable operating characteristics and properties of the method are illustrated using both simulated and real data sets.

1 Dissimilarity Information and Regularized Kernel Estimate

Given a set of $N$ objects, suppose we have obtained a measure of dissimilarity, $d_{ij}$, for certain object pairs $(i, j)$. We introduce the class of Regularized Kernel Estimates (RKEs), which we define as solutions to optimization problems of the following form: