We computed in-sample dosage-based LD matrices and LD scores for each of the six ancestry groups in UKBB. LD matrices are available in Hail's BlockMatrix format on Amazon AWS (see details here). LD scores are available in LDSC-compatible flat files (`.l2.ldscore.gz` and `.M_5_50`) here. For large-scale analysis, you can also find a full LD score Hail Table (not restricted to the HapMap3 variants) on Amazon AWS (see details here).
- The dosage-based genotype matrix $X$ was column-wise mean-centered and normalized.
- We applied the same variant QC filters used for the Pan-UKB GWAS (INFO > 0.8, MAC > 20 in each population; see details here).
- For covariate correction, the residuals from regressing the normalized genotype matrix $X$ on the covariates were obtained via $X_{adj} = M_c X$, where $M_c = I - C(C^T C)^{-1} C^T$ is the residual-maker matrix and $C$ is the matrix of covariates.
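- We used the same covariates used for the Pan-UKB GWAS, namely $\text{age}$, $\text{sex}$, $\text{age} \times \text{sex}$, $\text{age}^2$, $\text{age}^2 \times \text{sex}$, and the first 10 PCs of the genotype matrix (see details here).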
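- We then computed the LD matrix $R$ via $R = \frac{X_{adj}^T X_{adj}}{n}$ with a radius of 10 Mb, where $X_{adj}$ is the covariate-adjusted genotype matrix and $n$ is the number of samples. Each element $\hat{r}_{jk}$ of $R$ represents the Pearson correlation coefficient of genotypes between variants $j$ and $k$.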
- For the X chromosome, we computed a single LD matrix jointly using both males and females, where male genotypes are coded 0/1 and female genotypes are coded 0/1/2.
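- To account for the upward bias of the standard estimator of the squared Pearson correlation coefficient, we applied a bias adjustment for $\hat{r}^2$ using $\tilde{r}^2 = \frac{n-1}{n-2}\hat{r}^2 - \frac{1}{n-2}$, where $n$ is the number of samples.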
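- LD scores for variant $j$ were subsequently computed via $l_j = \sum_k \tilde{r}_{jk}^2$ with a radius of 1 Mb, where $\tilde{r}_{jk}^2$ is the bias-adjusted squared correlation between variants $j$ and $k$.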
- For LDSC-compatible flat files, we only exported LD scores of high-quality HapMap 3 variants that are 1) in autosomes, 2) not in the MHC region, 3) biallelic SNPs, 4) with INFO > 0.9, and 5) MAF > 1% in UKB and gnomAD genome/exome (if available).
- We note that, since we applied the covariate adjustment above, these LD scores are equivalent to the covariate-adjusted LD scores described in Luo, Y. & Li, X. et al., 2020.
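
The steps listed above can be illustrated on a small dense matrix. The following is a minimal NumPy sketch, not the Hail pipeline used to produce the released files. It assumes the LD matrix is the cross-product of the covariate-adjusted genotypes divided by the sample size, and that the bias adjustment is `(n - 1) / (n - 2) * r2 - 1 / (n - 2)`. All function names are hypothetical, and re-scaling the residualized columns to unit variance (so that the entries of the matrix are correlations) is an assumption of this sketch.

```python
import numpy as np


def standardize(X):
    """Column-wise mean-center and scale to unit variance."""
    X = X - X.mean(axis=0)
    return X / X.std(axis=0)


def residualize(X, C):
    """Return M_c X, the residuals of regressing each column of X on C.

    Solved by least squares, which avoids forming the n x n matrix
    M_c = I - C (C^T C)^{-1} C^T explicitly.
    """
    beta, *_ = np.linalg.lstsq(C, X, rcond=None)
    return X - C @ beta


def window_mask(pos, radius):
    """Boolean m x m mask of variant pairs within `radius` base pairs."""
    return np.abs(pos[:, None] - pos[None, :]) <= radius


def ld_matrix(X_adj, pos, radius):
    """R = X_adj^T X_adj / n, with entries outside the window set to zero."""
    n = X_adj.shape[0]
    R = X_adj.T @ X_adj / n
    return np.where(window_mask(pos, radius), R, 0.0)


def adjusted_r2(R, n):
    """Bias-adjusted squared correlation (assumed form, see lead-in)."""
    return (n - 1) / (n - 2) * R**2 - 1 / (n - 2)


def ld_scores(R, pos, n, radius):
    """Sum of adjusted r^2 over variants within `radius` of each variant.

    `radius` should not exceed the radius used to build R, since entries
    of R outside that window were zeroed rather than computed.
    """
    return np.where(window_mask(pos, radius), adjusted_r2(R, n), 0.0).sum(axis=1)


# Toy usage with simulated dosages (illustrative values only).
rng = np.random.default_rng(0)
n, m = 500, 6
dosages = rng.binomial(2, 0.3, size=(n, m)).astype(float)
age = rng.uniform(40, 70, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
# Intercept plus a reduced covariate set; PCs are omitted in this toy example.
C = np.column_stack([np.ones(n), age, sex, age * sex, age**2, age**2 * sex])
pos = np.array([1_000, 250_000, 900_000, 1_500_000, 8_000_000, 30_000_000])

X_adj = standardize(residualize(standardize(dosages), C))
R = ld_matrix(X_adj, pos, radius=10_000_000)
scores = ld_scores(R, pos, n, radius=1_000_000)
print(scores)
```

A dense matrix is only practical for a handful of variants. At biobank scale the same operations would be done block-wise on a banded matrix, which is why the released matrices are stored as a Hail BlockMatrix.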