Pipeline from PLINK to ADABF GxE polygenic analysis

##########################################################################################
# If you use this code to analyze data, please cite the following paper: 
# Lin W-Y*, Huang C-C, Liu Y-L, Tsai S-J, Kuo P-H (2018). Polygenic approaches to detect gene-environment interactions when external information is unavailable. Briefings in Bioinformatics, in press.
# Any questions or comments, please contact: Wan-Yu Lin, linwy@ntu.edu.tw, Institute of Epidemiology and Preventive Medicine, National Taiwan University College of Public Health
# Thank you.
##########################################################################################

Suppose we have the PLINK binary files "TWBGWAS.bim", "TWBGWAS.bed", and "TWBGWAS.fam", and that the phenotype ("DIASTOLIC"), environmental factor ("Smoking"), and covariates ("SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7") are stored in the file "YECov".
Please note that the row ordering of "YECov" must be consistent with that of "TWBGWAS.fam".
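
The row-ordering requirement can be checked before running the pipeline. The following is a minimal sketch, not part of the original pipeline. It assumes "YECov" has a header row and carries FID and IID in its first two columns (the layout PLINK expects for --pheno files). The function name check_row_order is hypothetical.

```shell
# Hypothetical helper (not part of the original pipeline).
# Assumes the phenotype file has a header row and FID, IID in columns 1-2.
# Returns 0 if the IDs appear in the same order as in the .fam file, 1 otherwise.
check_row_order() {
  fam="$1"
  pheno="$2"
  awk '
    # First file (.fam, no header): remember "FID IID" for each row.
    NR == FNR { id[FNR] = $1 " " $2; n = FNR; next }
    # Second file (phenotype file): skip the header row.
    FNR == 1  { next }
    {
      m = FNR - 1
      if (id[m] != $1 " " $2) {
        printf "Mismatch at row %d: %s %s\n", m, $1, $2
        bad = 1
      }
    }
    END {
      if (m + 0 != n + 0) {
        printf "Row counts differ: %d vs %d\n", n, m
        bad = 1
      }
      exit bad + 0
    }
  ' "$fam" "$pheno"
}
```

For example, `check_row_order TWBGWAS.fam YECov && echo "row order OK"` would print a confirmation only when every row matches.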

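Step 1, the pruning step (autosomal SNPs are pruned with a window of 50 SNPs, a shift of 5 SNPs per step, and a variance inflation factor threshold of 2)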

plink --bfile TWBGWAS --chr 1-22 --indep 50 5 2 --noweb

plink --bfile TWBGWAS --extract plink.prune.in --make-bed --out prunedata --noweb

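Step 2, the screening step (each pruned SNP is tested for association with DIASTOLIC by linear regression, adjusting for the covariates)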

plink --bfile prunedata --no-pheno --linear --pheno YECov --pheno-name DIASTOLIC --covar YECov --covar-name SEX,AGE,BMI,PC1,PC2,PC3,PC4,PC5,PC6,PC7 --ci 0.95 --hide-covar --adjust --out DIAscreening --noweb    
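
Before moving on, it can be useful to see how many SNPs pass the screening threshold. The following is a minimal sketch, assuming the PLINK output has a header row containing columns named SNP and P and that missing p-values are written as NA. The function name screen_snps is hypothetical and is not part of the original pipeline.

```shell
# Hypothetical helper (not part of the original pipeline).
# Prints the IDs of SNPs whose P column is below a threshold (default 0.05).
# Assumes a header row with columns named "SNP" and "P"; rows with P == NA are skipped.
screen_snps() {
  assoc="$1"
  thr="${2:-0.05}"
  awk -v thr="$thr" '
    # Header row: locate the SNP and P columns by name.
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "SNP") s = i
        if ($i == "P")   p = i
      }
      next
    }
    # Data rows: keep SNPs with a non-missing p-value below the threshold.
    $p != "NA" && ($p + 0) < (thr + 0) { print $s }
  ' "$assoc"
}
```

For example, `screen_snps DIAscreening.assoc.linear | wc -l` would report the number of SNPs with P < 0.05.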

The following are commands implemented in R:

DIAscreening <- read.table("DIAscreening.assoc.linear", header=T)

write.table(DIAscreening$SNP[which(DIAscreening$P<0.05)], 'DIAscreening', row.names=FALSE, col.names=F, quote=FALSE, na='-9', append=F)

system("plink --bfile prunedata --extract DIAscreening --make-bed --out DIAscreening --noweb")

system("plink --bfile DIAscreening --recodeA --out DIAscreening --noweb")

SNP <- read.table("DIAscreening.raw", header=T)[,-c(1:6)]

YECov <- read.table("YECov", header=T, na.strings="-9")     


Covariate <- cbind(YECov$AGE, YECov$SEX, YECov$BMI, YECov$PC1, YECov$PC2, YECov$PC3, YECov$PC4, YECov$PC5, YECov$PC6, YECov$PC7) 

source("ADABFGEPoly.R")

ADABFGE(Y=YECov$DIASTOLIC, Copy=SNP, E=YECov$Smoking, Y.Type="C", E.Type="D", Cov=Covariate, Sig=0.05, FDR.level=0.20, Precision.P=1)





Thanks for your interest.

