Perform colocalization analysis of trans-eQTL loci and GWAS trait loci.
A coloc region is defined as a flanking region (100kb) centered at a trans-eQTL signal.
Rscript coloc/1_qtl_reg.R
Let’s take a look at the defined coloc regions,
file_qtl_reg <- '/project2/xuanyao/llw/coloc/ukbb_coloc_blood_traits/data/qtlColocReg.txt.gz'
qtl_reg <- fread(file_qtl_reg)
distinct(qtl_reg, Region) %>% str()
## Classes 'data.table' and 'data.frame': 2364 obs. of 1 variable:
## $ Region: chr "module1:4:6697673" "module1:3:917444" "module1:11:96178769" "module1:2:225599258" ...
There are 2364 defined regions in total. Each region is defined as the target module and lead SNP with smallest p-value in that region.
I considered a few GWAS traits, including blood related traits,
file_pheno_manifest <- '/project2/xuanyao/llw/coloc/ukbb_coloc_blood_traits/ukbb_blood_traits.csv'
pheno_manifest <- fread(file_pheno_manifest)
knitr::kable(pheno_manifest)
| GWAS Group | GWAS ID | GWAS Trait | Trait Abbreviation |
|---|---|---|---|
| Platelets | 30080 | Platelet count | PLT |
| Platelets | 30090 | Platelet crit | PLTC |
| Platelets | 30100 | Mean platelet volume | MPV |
| Platelets | 30110 | Platelet distribution width | PDW |
| Red blood cells | 30010 | Red blood cell count | RBC |
| Red blood cells | 30020 | Haemoglobin concentration | HGB |
| Red blood cells | 30030 | Haematocrit percentage | HCT |
| Red blood cells | 30040 | Mean corpuscular volume | MCV |
| Red blood cells | 30050 | Mean corpuscular haemoglobin | MCH |
| Red blood cells | 30060 | Mean corpuscular haemoglobin concentration | MCHC |
| Red blood cells | 30070 | Red blood cell distribution width | RBC_W |
| Red blood cells | 30270 | Mean sphered cell volume | MSCV |
| Red blood cells | 30240 | Reticulocyte percentage | RET_P |
| Red blood cells | 30250 | Reticulocyte count | RET |
| Red blood cells | 30260 | Mean reticulocyte volume | MRV |
| Red blood cells | 30280 | Immature reticulocyte fraction | IRF |
| Red blood cells | 30290 | High light scatter reticulocyte percentage | HLSR_P |
| Red blood cells | 30300 | High light scatter reticulocyte count | HLSR |
| White blood cells | 30000 | White blood cell count | WBC |
| White blood cells | 30120 | Lymphocyte count | LYMPH |
| White blood cells | 30130 | Monocyte count | MONO |
| White blood cells | 30140 | Neutrophill count | NEUT |
| White blood cells | 30150 | Eosinophill count | EO |
| White blood cells | 30160 | Basophill count | BASO |
| White blood cells | 30180 | Lymphocyte percentage | LYMPH_P |
| White blood cells | 30190 | Monocyte percentage | MONO_P |
| White blood cells | 30200 | Neutrophill percentage | NEUT_P |
| White blood cells | 30210 | Eosinophill percentage | EO_P |
| White blood cells | 30220 | Basophill percentage | BASO_P |
autoimmune diseases,
ls /project2/xuanyao/llw/coloc/immune_traits | grep -v 'all'
## pmid24390342_RA_GWASmeta_European
## pmid26192919_CD
## pmid26192919_IBD
## pmid26192919_UC
## pmid26502338_sle
## pmid28067908_cd
## pmid28067908_ibd
## pmid28067908_uc
## pmid29083406_Allergy
## pmid29892013_AE
## pmid30929738_ASTHMA
## pmid31604244_MS
and also a few more other traits,
file_pheno_manifest <- "/project2/xuanyao/llw/coloc/ukbb_coloc_more_traits/phenotype_manifest_sub.tsv"
pheno_manifest <- fread(file_pheno_manifest)
knitr::kable(select(pheno_manifest, phenocode_uniq, trait))
| phenocode_uniq | trait |
|---|---|
| 21001 | Body-mass-index-(BMI) |
| 23104 | Body-mass-index-(BMI) |
| 30690 | Cholesterol |
| 30760 | HDL-cholesterol |
| 6152_9 | Hayfever,-allergic-rhinitis-or-eczema |
| 401 | Hypertension |
| 3761 | Age-hay-fever,-rhinitis-or-eczema-diagnosed |
| 20126_0 | No-Bipolar-or-Depression |
| 6152_8 | Asthma |
| 495 | Asthma |
| 411.4 | Coronary-atherosclerosis |
| 250.2 | Type-2-diabetes |
| 172 | Skin-cancer |
| 278 | Overweight,-obesity-and-other-hyperalimentation |
| 174 | Breast-cancer |
| 300 | Anxiety-disorders |
| 185 | Cancer-of-prostate |
| 714 | Rheumatoid-arthritis-and-other-inflammatory-polyarthropathies |
| 714.1 | Rheumatoid-arthritis |
| 555 | Inflammatory-bowel-disease-and-other-gastroenteritis-and-colitis |
| 555.2 | Ulcerative-colitis |
| 332 | Parkinson’s-disease |
| 335 | Multiple-sclerosis |
| 290.11 | Alzheimer’s-disease |
| 695.42 | Systemic-lupus-erythematosus |
| 50 | Standing-height |
Lastly, based on the previously defined trans-eQTL loci, GWAS loci at the same location are defined.
Rscript 3_1_gwas_reg_ukbb.R
Rscript 3_2_gwas_reg_immu.R
For a coloc loci of a pair of (trans-eQTL loci, GWAS loci), I performed coloc using coloc package.
Rscript 4_1_coloc_ukbb.R
Rscript 4_2_coloc_immu.R
Let’s take a look at what the coloc results look like. Take trait 30010 (Red blood cell count) for an example.
file_res_coloc <- '/project2/xuanyao/llw/coloc/ukbb_coloc_blood_traits/data/pheno30010.resColoc.txt.gz'
res_coloc <- fread(file_res_coloc)
knitr::kable(head(res_coloc))
| Phenocode | trait | Region | Pval | nsnps | PP.H0.abf | PP.H1.abf | PP.H2.abf | PP.H3.abf | PP.H4.abf |
|---|---|---|---|---|---|---|---|---|---|
| 30010 | RBC | module4:1:248039451 | 0e+00 | 376 | 0 | 0e+00 | 0.0000001 | 0.00e+00 | 0.9999999 |
| 30010 | RBC | module25:3:101011537 | 0e+00 | 123 | 0 | 5e-07 | 0.0000000 | 1.00e-06 | 0.9999985 |
| 30010 | RBC | module66:12:54685880 | 0e+00 | 76 | 0 | 0e+00 | 0.0000657 | 2.00e-07 | 0.9999341 |
| 30010 | RBC | module127:12:54685880 | 0e+00 | 76 | 0 | 0e+00 | 0.0000723 | 7.00e-07 | 0.9999271 |
| 30010 | RBC | module15:7:50426097 | 0e+00 | 135 | 0 | 0e+00 | 0.0000000 | 9.35e-05 | 0.9999065 |
| 30010 | RBC | module48:12:54685880 | 2e-07 | 76 | 0 | 0e+00 | 0.0007935 | 2.80e-06 | 0.9992038 |
For example, for the first coloc region module4:1:248039451, it indicates module 4 has shared coloc signal with trait RBC near loci 1:248039451 (PP4>0.99).
Take the coloc region module4:1:248039451 as an example, to look at the associations near coloc regions.
library(locuscomparer)
figTitle <- "Region: module4:1:248039451; #SNPs: 376; PP.H4.abf: 1"
file_gwasVis <- "/project2/xuanyao/llw/coloc/ukbb_coloc_blood_traits/colocVisData/module4:1:248039451.gwas.reg.coloc.txt"
file_qtlVis <- "/project2/xuanyao/llw/coloc/ukbb_coloc_blood_traits/colocVisData/module4:1:248039451.qtl.reg.coloc.txt"
base_plt <- locuscompare(in_fn1 = file_gwasVis, in_fn2 = file_qtlVis,
title1 = 'GWAS', title2 = 'QTL',
population = "EUR", genome = 'hg19')
base_plt +
labs(title = figTitle) +
theme(plot.title = element_text(hjust = 0.5, size = 12))
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-conda-linux-gnu (64-bit)
## Running under: Scientific Linux 7.4 (Nitrogen)
##
## Matrix products: default
## BLAS/LAPACK: /scratch/midway2/liliw1/conda_env/rstudio-server/lib/libopenblasp-r0.3.18.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] locuscomparer_1.0.0 data.table_1.14.2 forcats_0.5.1
## [4] stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
## [7] readr_2.1.2 tidyr_1.2.0 tibble_3.1.7
## [10] ggplot2_3.3.6 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] ggrepel_0.9.1 Rcpp_1.0.8.3 lubridate_1.8.0 assertthat_0.2.1
## [5] digest_0.6.29 utf8_1.2.2 R6_2.5.1 cellranger_1.1.0
## [9] backports_1.4.1 reprex_2.0.1 evaluate_0.14 httr_1.4.2
## [13] highr_0.9 pillar_1.7.0 rlang_1.0.3 readxl_1.3.1
## [17] rstudioapi_0.13 jquerylib_0.1.4 R.utils_2.11.0 R.oo_1.24.0
## [21] rmarkdown_2.11 labeling_0.4.2 RMySQL_0.10.23 bit_4.0.4
## [25] munsell_0.5.0 broom_0.7.12 compiler_4.1.2 modelr_0.1.8
## [29] xfun_0.29 pkgconfig_2.0.3 htmltools_0.5.2 tidyselect_1.1.1
## [33] fansi_1.0.3 crayon_1.5.1 tzdb_0.2.0 dbplyr_2.1.1
## [37] withr_2.5.0 R.methodsS3_1.8.1 grid_4.1.2 jsonlite_1.7.3
## [41] gtable_0.3.0 lifecycle_1.0.1 DBI_1.1.3 magrittr_2.0.3
## [45] scales_1.2.0 cli_3.3.0 stringi_1.7.6 farver_2.1.1
## [49] fs_1.5.2 xml2_1.3.3 bslib_0.3.1 ellipsis_0.3.2
## [53] generics_0.1.2 vctrs_0.4.1 cowplot_1.1.1 tools_4.1.2
## [57] bit64_4.0.5 glue_1.6.2 hms_1.1.1 fastmap_1.1.0
## [61] yaml_2.2.2 colorspace_2.0-3 rvest_1.0.2 knitr_1.37
## [65] haven_2.4.3 sass_0.4.0