Pipeline from PLINK to ADABF GxE polygenic analysis
##########################################################################################
# If you use this code to analyze data, please cite the following paper:
# Lin W-Y*, Huang C-C, Liu Y-L, Tsai S-J, Kuo P-H (2018). Polygenic approaches to detect gene-environment interactions when external information is unavailable. Briefings in Bioinformatics, in press.
# For any questions or comments, please contact: Wan-Yu Lin, linwy@ntu.edu.tw,
# Institute of Epidemiology and Preventive Medicine,
# National Taiwan University College of Public Health
# Thank you.
##########################################################################################
Suppose we have the PLINK binary files "TWBGWAS.bim", "TWBGWAS.bed", and "TWBGWAS.fam", and that the phenotype ("DIASTOLIC"), environmental factor ("Smoking"), and covariates ("SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7") are stored in the file "YECov".
Please note that the row ordering of "YECov" must be consistent with that of "TWBGWAS.fam", because the R commands in the final part of this pipeline combine genotypes and phenotypes by row position.
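The row-order requirement can be checked before running the pipeline. The following is a minimal shell sketch, not part of the original pipeline. It assumes that "YECov" has one header row and that its first two columns are FID and IID, as PLINK expects for a file passed to --pheno with --pheno-name. The function name check_row_order is hypothetical.

```shell
#!/bin/sh
# check_row_order FAM PHENO   (hypothetical helper, not part of ADABF or PLINK)
# Compares the FID/IID pairs (columns 1-2) of a PLINK .fam file with those of a
# phenotype file that has one header row. Assumes the phenotype file starts
# with FID and IID columns. Returns non-zero on the first mismatch.
check_row_order() {
    fam=$1
    pheno=$2
    awk '
        NR == FNR { id[FNR] = $1 " " $2; n_fam = FNR; next }  # first file: .fam
        FNR == 1  { next }                                     # skip phenotype header
        {
            row = FNR - 1
            n_pheno = row
            if (id[row] != $1 " " $2) {
                printf "Mismatch at row %d: fam=\"%s\" pheno=\"%s %s\"\n", row, id[row], $1, $2
                bad = 1
                exit
            }
        }
        END {
            if (bad) exit 1
            if (n_pheno != n_fam) {
                printf "Row counts differ: fam=%d pheno=%d\n", n_fam, n_pheno
                exit 1
            }
            print "Row order is consistent"
        }
    ' "$fam" "$pheno"
}
```

With the file names used here, the call would be: check_row_order TWBGWAS.fam YECov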
Step 1, the pruning step
plink --bfile TWBGWAS --chr 1-22 --indep 50 5 2 --noweb
plink --bfile TWBGWAS --extract plink.prune.in --make-bed --out prunedata --noweb
Step 2, the screening step
plink --bfile prunedata --no-pheno --linear --pheno YECov \
  --pheno-name DIASTOLIC --covar YECov \
  --covar-name SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7 \
  --ci 0.95 --hide-covar --adjust --out DIAscreening --noweb
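The screening command writes "DIAscreening.assoc.linear", and the pipeline then keeps the SNPs with P < 0.05. For readers who prefer to do this selection outside R, the following is a shell sketch of the same filter. It is an illustration rather than the author's code: the function name select_snps is hypothetical, and it assumes the output file has a header row containing columns named SNP and P, which is how this pipeline's R commands refer to them.

```shell
#!/bin/sh
# select_snps ASSOC [THRESHOLD]   (hypothetical helper, not part of PLINK)
# Prints the SNP IDs whose P value is below THRESHOLD (default 0.05).
# The SNP and P columns are located by name in the header row, and rows with
# a missing P value ("NA") are skipped.
select_snps() {
    assoc=$1
    threshold=${2:-0.05}
    awk -v thr="$threshold" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "SNP") s = i
                if ($i == "P")   p = i
            }
            if (!s || !p) {
                print "SNP or P column not found" > "/dev/stderr"
                exit 2
            }
            next
        }
        $p != "NA" && ($p + 0) < (thr + 0) { print $s }
    ' "$assoc"
}
```

With the file names used here, the call would be: select_snps DIAscreening.assoc.linear 0.05 > DIAscreening
The resulting file "DIAscreening" can then be passed to plink --extract.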
The following commands are run in R:
# Keep the SNPs with P < 0.05 in the screening regression
DIAscreening <- read.table("DIAscreening.assoc.linear", header=TRUE)
write.table(DIAscreening$SNP[which(DIAscreening$P<0.05)], "DIAscreening",
            row.names=FALSE, col.names=FALSE, quote=FALSE, na="-9", append=FALSE)
# Extract the selected SNPs and recode genotypes as allele counts
system("plink --bfile prunedata --extract DIAscreening --make-bed --out DIAscreening --noweb")
system("plink --bfile DIAscreening --recodeA --out DIAscreening --noweb")
# Drop the six leading columns of the .raw file (FID, IID, PAT, MAT, SEX, PHENOTYPE)
SNP <- read.table("DIAscreening.raw", header=TRUE)[,-c(1:6)]
YECov <- read.table("YECov", header=TRUE, na.strings="-9")
Covariate <- cbind(YECov$AGE, YECov$SEX, YECov$BMI, YECov$PC1, YECov$PC2,
                   YECov$PC3, YECov$PC4, YECov$PC5, YECov$PC6, YECov$PC7)
# Load the ADABF functions and run the GxE polygenic analysis
source("ADABFGEPoly.R")
ADABFGE(Y=YECov$DIASTOLIC, Copy=SNP, E=YECov$Smoking, Y.Type="C", E.Type="D",
        Cov=Covariate, Sig=0.05, FDR.level=0.20, Precision.P=1)
Thanks for your interest.