Pipeline from PLINK to ADABF GxE polygenic analysis

##########################################################################################
# If you use this code to analyze data, please cite the following paper:
# Lin W-Y*, Huang C-C, Liu Y-L, Tsai S-J, Kuo P-H (2018). Polygenic approaches to detect gene-environment interactions when external information is unavailable. Briefings in Bioinformatics, in press.
# For questions or comments, please contact: Wan-Yu Lin, linwy@ntu.edu.tw, Institute of Epidemiology and Preventive Medicine, National Taiwan University College of Public Health
# Thank you.
##########################################################################################

Suppose we have the PLINK binary files "TWBGWAS.bim", "TWBGWAS.bed", and "TWBGWAS.fam", and that the phenotype ("DIASTOLIC"), environmental factor ("Smoking"), and covariates ("SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7") are stored in a file named "YECov".
Please note that the row ordering of "YECov" must be consistent with that of "TWBGWAS.fam", because the R commands below pair the genotype rows with the rows of "YECov" by position.
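
The ordering requirement above can be checked before running the pipeline. The following is a minimal shell sketch, assuming "YECov" begins with a header line and has FID and IID as its first two columns (the layout PLINK expects for --pheno and --covar files). The function name check_row_order is hypothetical and is not part of PLINK or ADABF.

```shell
# check_row_order FAM PHENO
# Exits with status 0 when the FID/IID pairs in PHENO (after its header line)
# appear in the same order as in FAM; otherwise reports the problem and
# exits with status 1.
# Assumption: PHENO has a header line and FID, IID in columns 1 and 2.
check_row_order() {
    awk '
        # First file (.fam, no header): remember the ID pair of each row.
        NR == FNR { id[FNR] = $1 " " $2; n = FNR; next }
        # Second file: skip the header line.
        FNR == 1 { next }
        {
            m = FNR - 1
            if (id[m] != $1 " " $2) {
                printf "mismatch at row %d: fam has %s, YECov has %s\n", m, id[m], $1 " " $2
                bad = 1
                exit
            }
        }
        END {
            if (bad) exit 1
            if (m != n) {
                printf "row counts differ: fam has %d, YECov has %d\n", n, m
                exit 1
            }
            print "row order consistent"
        }
    ' "$1" "$2"
}
```

With the files named as above, the check would be run as: check_row_order TWBGWAS.fam YECov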

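Step 1, the pruning step

Prune autosomal SNPs that are in linkage disequilibrium (window of 50 SNPs, shifted 5 SNPs at a time, variance inflation factor threshold of 2), then keep only the retained SNPs listed in "plink.prune.in".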

plink --bfile TWBGWAS --chr 1-22 --indep 50 5 2 --noweb

plink --bfile TWBGWAS --extract plink.prune.in --make-bed --out prunedata --noweb

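Step 2, the screening step

Regress the phenotype on each pruned SNP, adjusting for the covariates, and write the results to "DIAscreening.assoc.linear".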

plink --bfile prunedata --no-pheno --linear --pheno YECov --pheno-name DIASTOLIC --covar YECov --covar-name SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7 --ci 0.95 --hide-covar --adjust --out DIAscreening --noweb
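
The screening output "DIAscreening.assoc.linear" has one line per SNP, with its P value in the column headed "P". The R commands that follow select the SNPs with P < 0.05 using read.table and write.table. As a rough command-line sketch of the same selection, the hypothetical helper filter_snps below looks up the "SNP" and "P" columns by name rather than by position and skips SNPs whose P value is NA.

```shell
# filter_snps FILE [THRESHOLD]
# Prints the IDs of SNPs whose P value is below THRESHOLD (default 0.05).
# Assumption: FILE has a header line containing the column names SNP and P.
filter_snps() {
    awk -v thr="${2:-0.05}" '
        NR == 1 {
            # Locate the SNP and P columns from the header line.
            for (i = 1; i <= NF; i++) {
                if ($i == "SNP") snp = i
                if ($i == "P")   p = i
            }
            if (!snp || !p) exit 2   # required column not found
            next
        }
        # Skip missing P values; compare the rest numerically.
        $p != "NA" && ($p + 0) < (thr + 0) { print $snp }
    ' "$1"
}
```

For example, filter_snps DIAscreening.assoc.linear > DIAscreening would write one SNP ID per line, the format that the --extract option reads.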

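The remaining commands are run in R. They keep the SNPs with P < 0.05 from the screening step, recode them as allele counts, and run the ADABF GxE analysis: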

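# Read the screening results from Step 2.
DIAscreening <- read.table("DIAscreening.assoc.linear", header=TRUE)

# Write the IDs of SNPs with P < 0.05 to the file "DIAscreening" (which() drops SNPs whose P is NA).
write.table(DIAscreening$SNP[which(DIAscreening$P<0.05)], 'DIAscreening', row.names=FALSE, col.names=FALSE, quote=FALSE, na='-9', append=FALSE)

# Keep only the screened SNPs.
system("plink --bfile prunedata --extract DIAscreening --make-bed --out DIAscreening --noweb")

# Recode the genotypes as allele counts in "DIAscreening.raw".
system("plink --bfile DIAscreening --recodeA --out DIAscreening --noweb")

# Drop the six leading columns (FID, IID, PAT, MAT, SEX, PHENOTYPE) so that only the SNP columns remain.
SNP <- read.table("DIAscreening.raw", header=TRUE)[,-c(1:6)]

# Read the phenotype, environmental factor, and covariates; -9 marks a missing value.
YECov <- read.table("YECov", header=TRUE, na.strings="-9")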

Covariate <- cbind(YECov$AGE, YECov$SEX, YECov$BMI, YECov$PC1, YECov$PC2, YECov$PC3, YECov$PC4, YECov$PC5, YECov$PC6, YECov$PC7)

source("ADABFGEPoly.R")

ADABFGE(Y=YECov$DIASTOLIC, Copy=SNP, E=YECov$Smoking, Y.Type="C", E.Type="D", Cov=Covariate, Sig=0.05, FDR.level=0.20, Precision.P=1)

Thanks for your interest.
