Towards a syntactic signature for domain models: proposed descriptive metrics for visualizing the entity fan-out frequency distribution

The main objective of this paper is to find a minimal set of measures that allow the immediate, intuitive characterization and visualization of the syntactic structure of models covering a particular application domain. The measures are validated against a test bed of twenty-two generic enterprise models. Traditional system engineering metrics proved of little use in characterizing or differentiating the models. Instead, the frequency distribution of entity fan-outs was found to provide a distinct signature for each model. Although the characteristics of these distributions are immediately apparent visually, traditional descriptive statistics for frequency distributions fail to capture their essential shape, because the distributions are extremely skewed and contain extreme outliers. This paper proposes four summary statistics to describe the fan-out distributions, thus providing a compact and intuitive signature for a model: one categorical classification ('waves' and 'slides') and three numeric metrics, namely the harmonic mean fan-out (to replace the arithmetic mean), a smoothness/bumpiness value, and a shape curvature coefficient.
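
To illustrate why the harmonic mean may suit such skewed data better than the arithmetic mean, the following minimal Python sketch tabulates a fan-out frequency distribution and compares the two means. The fan-out values are invented for illustration and are not taken from the test bed. The sketch assumes that an entity's fan-out is the number of relationships leading out of it and that entities with zero fan-out are excluded, since the harmonic mean is undefined for them. The name `fan_out_distribution` is hypothetical.

```python
from collections import Counter
from statistics import harmonic_mean, mean

# Hypothetical fan-out values, one per entity (illustrative data only).
# The single value of 24 plays the role of an extreme outlier.
fan_outs = [1, 1, 1, 1, 2, 2, 3, 4, 24]


def fan_out_distribution(values):
    """Map each fan-out value to the number of entities that have it."""
    return dict(sorted(Counter(values).items()))


distribution = fan_out_distribution(fan_outs)
arithmetic = mean(fan_outs)        # pulled upwards by the outlier
harmonic = harmonic_mean(fan_outs)  # dominated by the many small values

print("fan-out -> entity count:", distribution)
print(f"arithmetic mean fan-out: {arithmetic:.2f}")
print(f"harmonic mean fan-out:   {harmonic:.2f}")
```

In this invented example a single outlier lifts the arithmetic mean above four, although all but one of the entities have a fan-out of four or less, whereas the harmonic mean stays close to the typical small values. The smoothness/bumpiness value and the shape curvature coefficient are not sketched here because their definitions are not given in this section.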