Detecting footprints of selection has attracted considerable interest in population genetics in recent years, in both human and animal populations. In this work, we present two methodological improvements that increase the accuracy of selection signature detection. First, we show how Principal Components Analysis (PCA) and Between-Groups Analysis (BGA), two computationally efficient methods for exploring large SNP data sets and characterizing population genetic structure, can provide SNP typological values that are related to F-statistics. Second, we propose using the fused Lasso approach to identify significant footprints of selection while taking into account the spatial organization of the SNPs along the chromosomes. Previously proposed methods are often based on empirical smoothing approaches, and until now no clear recommendation has been available for choosing the significance threshold for SNP detection. A simulation study, both under a neutral model and under selection, was performed to evaluate the proposed method in terms of detection power and false-positive rate. To illustrate the approach, we analyzed human haplotypes sampled from three HapMap populations and bovine data obtained from an 800K SNP chip.
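The two ingredients named above can be illustrated with a minimal sketch. It is not the estimator or the solver used in this work. It assumes Wright's classical definition of F_ST (between-population variance of allele frequencies divided by p̄(1 − p̄)) as the per-SNP statistic, and the standard fused Lasso signal approximator objective as the quantity to be minimized along a chromosome. The function names `wright_fst` and `fused_lasso_objective`, and the parameters `lam1` and `lam2`, are hypothetical and chosen only for this example.

```python
import numpy as np


def wright_fst(freqs, weights=None):
    """Per-SNP F_ST under Wright's classical definition (illustrative only).

    freqs   : array of shape (n_populations, n_snps) holding allele frequencies.
    weights : optional population weights; equal weights are assumed by default.
    """
    freqs = np.asarray(freqs, dtype=float)
    n_pop = freqs.shape[0]
    if weights is None:
        w = np.full(n_pop, 1.0 / n_pop)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    p_bar = w @ freqs                        # weighted mean frequency per SNP
    var_between = w @ (freqs - p_bar) ** 2   # between-population variance per SNP
    denom = p_bar * (1.0 - p_bar)
    # Monomorphic SNPs (denom == 0) carry no information; return 0 for them.
    return np.divide(var_between, denom, out=np.zeros_like(denom), where=denom > 0)


def fused_lasso_objective(beta, y, lam1, lam2):
    """Fused Lasso signal approximator objective for SNPs ordered along a chromosome.

    0.5 * sum (y_i - beta_i)^2 + lam1 * sum |beta_i| + lam2 * sum |beta_i - beta_{i-1}|
    """
    beta = np.asarray(beta, dtype=float)
    y = np.asarray(y, dtype=float)
    fit = 0.5 * np.sum((y - beta) ** 2)
    sparsity = lam1 * np.sum(np.abs(beta))
    smoothness = lam2 * np.sum(np.abs(np.diff(beta)))
    return fit + sparsity + smoothness
```

In this sketch, the fusion penalty weighted by `lam2` is what encodes the spatial organization of the SNPs: it favors piecewise-constant profiles, so that neighbouring SNPs with elevated statistics are grouped into a single candidate region instead of being thresholded one by one.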