**Adaptive
Combination of P-values (ADA) Algorithm
for Case-Control Sequence Data**

**To Pinpoint Rare Causal Variants from A Large Number
of Variants in A Gene**

##########################################################################################

# If you use this code to analyze data, please cite the following three papers:

# Lin W-Y (2016). Beyond rare-variant association testing: Pinpointing rare causal variants in case-control sequencing study.

# Lin W-Y, Lou XY, Gao G, Liu N (2014). Rare Variant Association Testing by Adaptive Combination of P-values. PLoS ONE, 9: e85728. [PMID: 24454922].

# Cheung YH, Wang G, Leal SM, Wang S (2012). A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders. Genetic epidemiology 2012, 36(7):675-685.

# Any question, please contact: Wan-Yu Lin, linwy@ntu.edu.tw, Institute of Epidemiology and Preventive Medicine, National Taiwan University

# Thank you.

# Acknowledgements: This program was developed based on Cheung et al.'s code. Thank them for sharing their R code.

##########################################################################################

The R code to implement the ADA prioritization method Example file: ADAfile

In R, the code to implement this function is:

source("ADAprioritized.r")

ADATest('ADAfile.txt', 0.05, 1000, FALSE, 'additive', TRUE) # wait for ~3 seconds

where 'ADAfile.txt' is the data file, in which the first column is the disease status (1: case; 0: control), and column 2 to the last column are the numbers of minor alleles.

The following input elements of this function are:

mafThr = 0.05, (Exclude SNPs with combined MAF > 0.05)

num_perm = 1000, (the number of permutations)

midp = FALSE, (P-values according to the Fisher's exact test)

mode = 'additive', (mode of inheritance = "additive")

twoSided = TRUE (two-sided test)

Output: The output contains three objects:

(1) pval: the P-value of the ADA test.

(2) optimal.t: the

(3) posit: the variants with per-site

If posit returns 32 and 46, it means that No. 32 and No. 46 are important variants. That is, the variants put in the 33rd and the 47th columns in ADAfile.txt are prioritized. Recall that the first column in ADAfile.txt is the disease status.

Using our test sample "ADAfile.txt", you will obtain the following results:

1. With "twoSided = TRUE", both deleterious variants and protective variants are prioritized.

2. The option is changed to "twoSided = FALSE", and so only deleterious variants are prioritized.

Thanks for your interest in our work.

Return to Wan-Yu Lin's homepage