Clusters and factors: neural algorithms for a novel representation of huge and highly multidimensional data sets

A two-level representation is proposed for huge, high-dimensional data sets: 1) a global, synthetic map of the topics derived from the data, and 2) a set of local axes, one per topic, ranking both the descriptors and the described objects. Two algorithms are presented for deriving these axes. Axial k-means yields strict clusters, each characterized by an "axoid", the first component of a simplified "spherical" factor analysis applied to that cluster. Local components analysis yields fuzzy, overlapping clusters that are derived from the local maxima of a "partial inertia" landscape and that constitute an absolute optimum. Several properties of these methods are presented and argued for: a graded, progressive type of representation connected to human categorization schemes; distributional equivalence in the space of the objects; stable local representations; and computational efficiency.
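
To make the axial k-means idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the update rules, so several choices here are assumptions: rows are scaled to unit length (one reading of "spherical"), each object is assigned to the axis on which its projection is largest, and each axis is re-estimated as the leading right singular vector of its cluster's rows (one reading of "first component" of a factor analysis applied to the cluster). The names `axial_kmeans`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np

def axial_kmeans(X, k, n_iter=20, seed=0):
    """Illustrative sketch of an axial k-means style clustering (assumed rules)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Assumption: objects are projected onto the unit hypersphere.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.where(norms == 0, 1.0, norms)

    # Initialise the k axes from k distinct, randomly chosen objects.
    axes = U[rng.choice(len(U), size=k, replace=False)].copy()

    for _ in range(n_iter):
        # Strict assignment: each object joins the axis with the largest projection.
        labels = np.argmax(U @ axes.T, axis=1)
        for j in range(k):
            members = U[labels == j]
            if len(members) == 0:
                continue  # keep the previous axis for an empty cluster
            # Assumption: the cluster axis is the leading right singular vector.
            _, _, vt = np.linalg.svd(members, full_matrices=False)
            axis = vt[0]
            # Orient the axis so that its members project positively on it.
            if (members @ axis).sum() < 0:
                axis = -axis
            axes[j] = axis

    # Final projections and assignments against the final axes.
    proj = U @ axes.T
    labels = np.argmax(proj, axis=1)
    return labels, axes, proj
```

In this sketch, the components of each row of `axes` give a ranking of the descriptors for one topic, and the corresponding column of `proj` gives a ranking of the objects on that topic, which mirrors the "local axes" described above. Local components analysis, with its fuzzy, overlapping clusters, is not covered by this sketch.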