Data Clustering Technique Using Data Envelopment Analysis

Data Envelopment Analysis (DEA), a multifactor productivity measurement tool, is generally used to assess the relative efficiency of homogeneous units and to set benchmarks for inefficient units. When a large data set is analyzed, there are typically a few efficient samples and a vast number of inefficient units. It is generally observed that some efficient samples are not referenced by any inefficient unit, while others are referenced by several. This makes it difficult to classify units correctly. To overcome this flaw, we propose r-DEA (recursive DEA). In every step of r-DEA, we exclude the efficient units that are referenced by inefficient units and apply DEA again to the remaining units. The procedure is repeated until all efficient units are referenced. As a result, all samples are classified into clusters, and each cluster is characterized by some typical efficient samples.
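The recursive procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which DEA model is used, so the sketch assumes the input-oriented CCR envelopment model solved with SciPy's `linprog`. The function names `ccr_input` and `r_dea`, the tolerance `tol`, and the toy data are hypothetical. The extra guard that stops when no efficient unit is referenced is also an addition, included only to guarantee termination.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_input(X, Y, o):
    """Input-oriented CCR envelopment LP for unit o (assumed model).

    X has shape (n, m) for inputs, Y has shape (n, s) for outputs.
    Returns (theta, lambdas); units with lambda > 0 form o's reference set.
    """
    n, m = X.shape
    s = Y.shape[1]
    c = np.r_[1.0, np.zeros(n)]        # variables [theta, lam_1..lam_n]; minimise theta
    A_in = np.c_[-X[o], X.T]           # sum_j lam_j * x_ij - theta * x_io <= 0
    A_out = np.c_[np.zeros(s), -Y.T]   # -sum_j lam_j * y_rj <= -y_ro
    res = linprog(c,
                  A_ub=np.r_[A_in, A_out],
                  b_ub=np.r_[np.zeros(m), -Y[o]],
                  bounds=[(0, None)] * (n + 1),
                  method="highs")
    return res.x[0], res.x[1:]


def r_dea(X, Y, tol=1e-6):
    """Return, for each pass, the efficient units referenced by inefficient units."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    remaining = list(range(len(X)))    # original indices of units still in play
    passes = []
    while remaining:
        results = [ccr_input(X[remaining], Y[remaining], k)
                   for k in range(len(remaining))]
        efficient = {k for k, (theta, _) in enumerate(results)
                     if theta >= 1 - tol}
        referenced = set()
        for k, (_, lam) in enumerate(results):
            if k not in efficient:     # only inefficient units contribute references
                referenced.update(int(j) for j in np.flatnonzero(lam > tol))
        referenced &= efficient
        if not referenced:             # added guard: nothing left to exclude
            break
        passes.append(sorted(remaining[k] for k in referenced))
        if referenced == efficient:    # stopping rule: all efficient units referenced
            break
        remaining = [u for k, u in enumerate(remaining) if k not in referenced]
    return passes


# Toy data (hypothetical): two inputs, one unit output, units A, B, C, D.
X = [[1, 4], [2, 2], [4, 1], [3, 3]]
Y = [[1], [1], [1], [1]]
passes = r_dea(X, Y)
print(passes)
```

In this toy data set, units A, B and C are all efficient in the first pass, but the only inefficient unit, D, references B alone. Excluding B and re-running DEA makes D reference A and C, at which point every efficient unit is referenced and the loop stops. How the inefficient units are then assigned to clusters is not specified in the abstract and is left out of the sketch.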