Discrimination through the Regularized Nearest Cluster Method

This paper has three parts. The first briefly reviews the discrimination techniques used for large arrays of sparse qualitative data. The second presents the “Regularized Nearest Cluster Method”, an efficient and versatile discrimination technique well suited to this kind of data, and compares it with other existing methods likely to be used in similar contexts. The third briefly discusses the relevance of these methods to textual data analysis.