Clustering Analysis of Brain Protein Expression Levels in Trisomic and Control Mice

In this paper, we describe a clustering analysis of the expression levels of 77 distinct brain proteins in trisomic and control mice. Hierarchical clustering based on Euclidean distance yields clusters that partially coincide with the experimental treatment groups of mice, as shown in the resulting dendrograms. Normalization decreases both the within- and between-cluster sums of squares, as well as the ratio of between- to within-cluster sum of squares. The estimated optimal number of clusters ranges from 1 to 4, depending on the method used: the gap statistic or the direct methods of the silhouette width and the elbow in the total within-cluster sum of squares. Principal components analysis shows separation of the groups generated by k-means clustering. When these groups are plotted against the first two principal components, the clusters are more distinct after z-score normalization of the protein expression levels than without normalization.
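
To make the sum-of-squares quantities above concrete, the following minimal Python sketch z-scores each column of a small data set and computes the within- and between-cluster sums of squares for a fixed cluster assignment. It is an illustration only, not the analysis pipeline used in this paper: the data values, the cluster labels, and the helper names `zscore_columns` and `sum_of_squares` are all made up for the example, and the sketch assumes that no column is constant (otherwise the standard deviation is zero).

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale each column to mean 0 and standard deviation 1 (population SD)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]  # assumes no constant column (sd > 0)
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sum_of_squares(rows, labels):
    """Return (within, between) cluster sums of squares for the given labels."""
    grand = [mean(c) for c in zip(*rows)]  # overall centroid
    within = between = 0.0
    for k in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == k]
        centroid = [mean(c) for c in zip(*members)]
        # Squared distances of cluster members to their own centroid.
        within += sum((x - c) ** 2 for r in members for x, c in zip(r, centroid))
        # Squared distance of the centroid to the overall centroid, weighted by size.
        between += len(members) * sum((c - g) ** 2 for c, g in zip(centroid, grand))
    return within, between


# Toy data (hypothetical): 4 samples x 2 "proteins" on very different scales.
raw = [[1.0, 100.0], [2.0, 110.0], [8.0, 300.0], [9.0, 290.0]]
labels = [0, 0, 1, 1]  # hypothetical cluster assignment

for name, data in [("raw", raw), ("z-scored", zscore_columns(raw))]:
    wss, bss = sum_of_squares(data, labels)
    print(f"{name}: within={wss:.3f} between={bss:.3f} ratio={bss / wss:.3f}")
```

Because z-scoring with the population standard deviation fixes the total sum of squares at the number of samples times the number of variables, the within- and between-cluster terms are bounded after normalization regardless of the original measurement scales. How their ratio changes depends on the data and the cluster assignment.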