ESTIMATING THE NUMBER OF CANCER SUBTYPES FROM WHOLE-GENOME EXPRESSION DATA VIA A PENALIZED PROBABILISTIC PRINCIPAL COMPONENT ANALYSIS∗
By