Testing for associations between systolic blood pressure and single-nucleotide polymorphism profiles obtained from sparse principal component analysis

Background: Hypertension is a prevalent condition linked to major cardiovascular conditions and multiple other comorbidities. Genetic information can offer a deeper understanding of susceptibility and of the underlying disease mechanisms. The Genetic Analysis Workshop 18 (GAW18) provides abundant genotype data for identifying genetic associations with hypertension status and with the underlying trait of systolic blood pressure (SBP). The high-dimensional nature of these data motivates dimension reduction techniques that remove excess noise and synthesize genetic information for complex, polygenic traits. Methods: For both measured and simulated phenotype data from GAW18, we use sparse principal component analysis to obtain sparse genetic profiles that represent the underlying data structure. We then test for associations between the resulting sparse principal components (PCs) and SBP, a major indicator of hypertension, and follow up by examining the genetic structure of the sparse PCs to gain insight into new patterns. Results: After adjusting for multiple testing, 27 of 122 PCs were significantly associated with measured SBP, offering a large number of components to investigate. Among the top 3 PCs, we identified linked genetic regions that may act in unison in their association with SBP. Simulated data gave similar results. Conclusions: Sparse PCs offer a new data-driven approach to structuring genotype data and understanding the genetic mechanisms behind complex, polygenic traits such as hypertension.
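
The workflow described under Methods (derive a sparse PC from genotypes, then test its association with SBP under a multiple-testing adjustment) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's actual procedure: the soft-thresholded power iteration is one simple way to obtain a sparse loading vector, the Fisher z approximation stands in for whatever association test was used, Bonferroni stands in for the unspecified multiple-testing adjustment, and the function names, the penalty `lam`, and the toy genotype and SBP values are all hypothetical.

```python
import math
from statistics import NormalDist


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; anything within [-lam, lam] becomes 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def center_columns(X):
    """Subtract each column mean so every SNP column has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def sparse_pc(X, lam, n_iter=50):
    """Return (loadings, scores) for one sparse PC of the n x p matrix X."""
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    v = [1.0 / math.sqrt(p)] * p  # dense starting direction
    for _ in range(n_iter):
        u = [sum(x * vj for x, vj in zip(row, v)) for row in Xc]  # u = Xc v
        norm_u = math.sqrt(sum(ui * ui for ui in u))
        if norm_u == 0.0:
            raise ValueError("no variance along the current direction")
        u = [ui / norm_u for ui in u]
        # w = Xc^T u, then shrink small loadings to exactly zero
        w = [soft_threshold(sum(Xc[i][j] * u[i] for i in range(n)), lam)
             for j in range(p)]
        norm_w = math.sqrt(sum(wj * wj for wj in w))
        if norm_w == 0.0:
            raise ValueError("lam is too large: every loading was zeroed")
        v = [wj / norm_w for wj in w]
    scores = [sum(x * vj for x, vj in zip(row, v)) for row in Xc]
    return v, scores


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def fisher_z_pvalue(r, n):
    """Approximate two-sided p-value for zero correlation (n > 3, |r| < 1)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni adjustment."""
    return alpha / n_tests


# Toy data invented for illustration: 6 subjects x 4 SNPs coded 0/1/2.
# SNPs 0 and 1 are in perfect linkage; SNPs 2 and 3 are unrelated to them.
genotypes = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]
sbp = [118, 122, 125, 131, 138, 141]  # hypothetical SBP values in mmHg

loadings, scores = sparse_pc(genotypes, lam=0.5)
r = pearson_r(scores, sbp)
p_value = fisher_z_pvalue(r, len(sbp))
print("loadings:", loadings)
print("r:", r, "p:", p_value)
print("Bonferroni level for 122 PCs:", bonferroni_threshold(0.05, 122))
```

In this toy case the penalty zeroes the loadings of the two unlinked SNPs, so the component reads directly as a profile of the linked pair, which is the interpretability benefit that sparse PCs are meant to provide. With real data the penalty would need tuning, later components would require deflation, and a regression model with covariates would usually replace the simple correlation test.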