UiT Tromsø, 21 June 2016

Contents

  • Quick review about hypothesis testing:
    • p values? power?
    • detour to the replicability crisis
  • Rationale behind simulations for hypothesis testing
    • when are simulations worth bothering with?
  • Permutations for estimating p values in case/control testing
    • a useful example
  • Simulations for estimating empirical power in case/control testing
    • current methods
    • weighted permutations
  • Application to multiple testing
    • controlling the FWER with permutations

What this presentation is not about

  • Simulation methods are widely used in statistics; here the focus is on simulations for computing p values and power
  • For instance, we won't talk about MCMC sampling methods
  • If you wonder how to sample from standard distributions in R, remember the syntax is r + the R name of the distribution
runif(1) #draws one value from the uniform distribution on [0,1] 
## [1] 0.05620097
rnorm(2,mean=0,sd=2) #two values, gaussian distribution N(0,2^2)
## [1]  0.8306065 -1.3574436
rt(4,df=3) #four values, Student's t distribution with 3 degrees of freedom 
## [1]  1.6952236  0.5058129  0.8800801 -0.9875652

Hypothesis testing: a quick review

The Neyman-Pearson approach

  • Aim: decide whether there is enough evidence in the data to reject the current hypothesis

  • Steps:

    1. Set up the null (H0) and alternative (H1) hypotheses
    2. Calculate a test statistic \(T_{obs}\) on the data
    3. Decide on the basis of \(T_{obs}\) whether to reject H0, in a way that allows one to control the false positive rate at a given threshold \(\alpha\):
      • Crucial ingredient: knowledge of the distribution of \(T\) under H0
      • \(\Rightarrow\) the last step requires
        • computing the probability \(p\) that \(T\) is at least as extreme as \(T_{obs}\) if H0 holds (the p value of the test)
        • rejecting H0 if \(p<\alpha\)

Example

Mock methylation M-values for 50 individuals with lung cancer and 50 individuals without

mydata=read.table('data/ex_methy.csv',sep=',',header=T); attach(mydata)
boxplot(Mvalue~Outcome)

Is the mean value in controls \(\mu_0\) the same as the mean value in cases \(\mu_1\)?

  1. Hypotheses:
    • H0: \(\mu_0-\mu_1=0\)
    • H1: \(\mu_0 - \mu_1> 0\) (right unilateral test)
  2. Statistic: \[ T=\frac{\bar{M}_0-\bar{M}_1}{\hat{\sigma}\sqrt{1/n_0+1/n_1}}, \mbox{ where } \hat{\sigma}=\sqrt{\frac{(n_0-1)\hat{\sigma}_0^2+(n_1-1)\hat{\sigma}_1^2}{n_0+n_1-2}} \] and \(\bar{M}_0,\bar{M}_1\) are the sample means in controls and cases. Under H0, \(T\) follows a Student's t-distribution with \(n_0+n_1-2\) degrees of freedom (assuming equal variance in the two groups)
  3. p value: \[ p=Pr(T>T_{obs}| \mbox{H0}) \]

s=sqrt((49*sd(Mvalue[Outcome==0])^2+49*sd(Mvalue[Outcome==1])^2)/98) #pooled standard deviation
Tobs=(mean(Mvalue[Outcome==0])-mean(Mvalue[Outcome==1]))/s/sqrt(1/50+1/50) #observed statistic
p=pt(q=Tobs,df=98,lower.tail=F)
unlist(list(Tobs=Tobs,p=p))
##       Tobs          p 
## 1.99514901 0.02440159

See the .Rmd file for this plot's code

Equivalently, using the R function for Student's test:

t.test(x=Mvalue[Outcome==0],y=Mvalue[Outcome==1],alternative='greater', var.equal=T)
## 
##  Two Sample t-test
## 
## data:  Mvalue[Outcome == 0] and Mvalue[Outcome == 1]
## t = 1.9951, df = 98, p-value = 0.0244
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.04644184        Inf
## sample estimates:
## mean of x mean of y 
##  1.792817  1.515892

Here \(p\approx 0.02\) \(\Rightarrow\) we can reject H0 at the significance level \(\alpha=0.05\) (but not at the 0.01 level…)

Decision rules

By rejecting H0 iff \(p<\alpha\) we control the type I error rate (ie the false positive rate) at the \(\alpha\) level:

Let \(T_\alpha\) be the \((1-\alpha)\)-quantile of \(T\) under H0 (the critical value) \(\Rightarrow p<\alpha\) iff \(T_{obs} > T_\alpha\) (case of right unilateral tests) \[ \begin{align} \Rightarrow\mbox{Type I error rate }&=Pr(H0 \mbox{ rejected } | H0 \mbox{ holds })=Pr(p < \alpha | H0) \\ &= Pr( T_{obs} > T_\alpha | H0) = \alpha \end{align} \]
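
A quick sanity check of this claim (a minimal sketch on toy normal data, not part of the original example): simulate many datasets under H0 and verify that the rejection rate of the right unilateral t-test is close to \(\alpha\).

set.seed(1) #for reproducibility
alpha=0.05
reject=replicate(2000, {
  x=rnorm(50); y=rnorm(50) #both groups drawn from the same distribution (H0)
  t.test(x,y,alternative='greater',var.equal=TRUE)$p.value<alpha
})
mean(reject) #should be close to alpha=0.05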

Misconceptions about p-values

  • The p value is not the probability that H0 holds given the data. For this, a prior probability of H0 is needed (see the small numerical sketch after this list): \[ Pr(H0|\mbox{data})=\frac{Pr(\mbox{data}|H0)\times Pr(H0)}{Pr(\mbox{data}|H0)\times Pr(H0)+Pr(\mbox{data}|H1)\times Pr(H1)} \]
  • An alternative to the frequentist p value is the Bayes factor \(BF=Pr(\mbox{data}|H1)/Pr(\mbox{data}|H0)\): \[ \frac{Pr(H1|\mbox{data})}{Pr(H0|\mbox{data})}=BF\times\frac{Pr(H1)}{Pr(H0)} \]
  • \(p<\alpha\) suggests there might be something in the data; it is not proof of a scientific finding. What matters is the effect size
  • use p values only to test well-defined hypotheses
  • do not forget confidence intervals
  • controversial finding: published studies with \(p<\alpha\) are often not replicable (wait a few slides…)
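
A small numerical sketch of the first two bullets (all numbers are hypothetical): with a prior \(Pr(H0)=0.9\) and likelihoods \(Pr(\mbox{data}|H0)=0.02\), \(Pr(\mbox{data}|H1)=0.3\):

prH0=0.9; prH1=0.1 #hypothetical prior probabilities
pdH0=0.02; pdH1=0.3 #hypothetical likelihoods of the data under H0 and H1
postH0=pdH0*prH0/(pdH0*prH0+pdH1*prH1) #Pr(H0|data) = 0.375
BF=pdH1/pdH0 #Bayes factor = 15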

References

  • Nuzzo, R. (2014). Nature, 506(7487), 150-152.
  • Senn, S. (2001). Journal of Epidemiology and Biostatistics, 6(2), 193-204.

Statistical Power

|             | H0 not rejected               | H0 rejected                                          |
|-------------|-------------------------------|------------------------------------------------------|
| H0 true     | \(1-\alpha\)                  | \(\alpha=\) type I error rate = significance level   |
| H0 not true | \(\beta=\) type II error rate | \(1-\beta=\) Power                                   |

\[ \begin{align} \mbox{Power } &= Pr(H0 \mbox{ rejected } | H1 \mbox{ holds}) \\ &= Pr(T > T_{\alpha}| H1 \mbox{ holds}) \end{align} \]

Hence:

  • Power is a function of \(\alpha\)
  • In order to compute the test power one must know the distribution of \(T\) under H1

Back to the example

What is the power of our t-test to reject H0 at the significance level \(\alpha=0.01\) if \(\mu_0-\mu_1=0.28\) (H1)?


Under H1, \(T\) follows a non-central t-distribution with non-centrality parameter \(\frac{\mu_0-\mu_1}{\sigma\sqrt{1/n_0+1/n_1}}\)

m0=mean(Mvalue[Outcome==0]); m1=mean(Mvalue[Outcome==1])
ncp=(m0-m1)/s/sqrt(1/50+1/50) #non-centrality parameter under H1
pw=pt(q=qt(p=1-0.01,df=98),df=98,ncp=ncp,lower.tail=FALSE)
unlist(list(Power=pw))
##    Power 
## 0.359906

Or equivalently, using the R built-in function:

power.t.test(n=50,delta=0.28,alternative='one.sided',type='two.sample',sig.level=0.01,sd=sd(Mvalue))
## 
##      Two-sample t test power calculation 
## 
##               n = 50
##           delta = 0.28
##              sd = 0.7043647
##       sig.level = 0.01
##           power = 0.3571275
##     alternative = one.sided
## 
## NOTE: n is number in *each* group

The replicability crisis

  • Positive findings are often not replicable in subsequent studies (in all fields)
  • This can happen when initial studies are underpowered. To see why, consider the false-positive report probability (FPRP):

\[ \begin{align} FPRP = Pr(H0 | p < \alpha) = & \frac{Pr(p<\alpha|H0)\times Pr(H0)}{Pr(p<\alpha|H0)\times Pr(H0)+Pr(p<\alpha|H1)\times Pr(H1)} \\ = & \frac{\alpha Pr(H0)}{\alpha Pr(H0) + (1-\beta) Pr(H1)} \end{align} \] \(\Rightarrow\) the FPRP increases as the power \(1-\beta\) decreases
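
A small numerical illustration of this formula (hypothetical numbers): take \(\alpha=0.05\) and a prior \(Pr(H1)=0.1\), and see how the FPRP grows as power drops.

alpha=0.05; prH1=0.1; prH0=1-prH1 #hypothetical prior: 1 tested hypothesis in 10 is true
power=c(0.8,0.5,0.2) #from well powered to underpowered
FPRP=alpha*prH0/(alpha*prH0+power*prH1)
round(FPRP,2) #0.36 0.47 0.69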



References

  • Wacholder,S. (2004). J. Natl Cancer Inst. 96, 434-442
  • Goodman, S. N. (1992). Stat. Med., 11(7), 875-879.

(Figure: probability that H0 is true if the test is positive.)

Testing with simulations: an overview

Simulations are useful when…

…the distribution of \(T\) under H0 is not known (its density and CDF cannot be derived analytically, and large sample approximations are not available or not applicable) but it is possible to sample from it
\(\Rightarrow\) we can calculate the empirical p value:

  1. simulate \(N\) realizations of \(T\) under H0
  2. estimate p with \[ \hat{p} = \frac{\#\{ \mbox{ H0 } T \mbox{ realizations } > T_{obs}\}}{N} \] (change \(>\) accordingly for bilateral and left unilateral tests…)

and when..

…the distribution of \(T\) under H1 is not known but it is possible to sample it
\(\Rightarrow\) we can calculate the empirical power:

  1. simulate \(N\) realizations of \(T\) under H1
  2. estimate power with \[ \hat{\mbox{power}} = \frac{\#\{ \mbox{ H1 } T \mbox{ realizations } > T_\alpha\}}{N} \] where \(T_\alpha\) is the \((1-\alpha)\)-quantile of \(T\) under H0. If the distribution of \(T\) under H0 is not known, one can sample \(T\) under H0 and take the empirical \((1-\alpha)\)-quantile.

Why it works

Law of large numbers

If \(Y_1,\ldots,Y_N\) are an i.i.d. sample from distribution \(G\) then \[ \frac{1}{N}\sum_{i=1}^NY_i \xrightarrow{P} E[Y] \]

In our case we have:

  • \(T_1\ldots,T_N\) are an i.i.d. sample from the distribution \(G\) of \(T\)

  • The indicator variables \(\mathbb{1}_{T_1>T_{obs}},\ldots,\mathbb{1}_{T_N>T_{obs}}\) are an i.i.d. sample from a Bernoulli distribution whose expected value is the p value

  • Hence: \[ \frac{1}{N}\sum_{i=1}^N \mathbb{1}_{T_i>T_{obs}} = \frac{\#\{ T \mbox{ realizations } > T_{obs}\}}{N} \xrightarrow{P} p \]

(Silly) application to our example: empirical p

Tsample0=rt(10000,df=98)
em_p=mean(Tsample0>Tobs)
unlist(list(p=p, empirical_p=em_p))
##           p empirical_p 
##  0.02440159  0.02390000

(Silly) application: empirical power

Tsample1=rt(10000,df=98,ncp=ncp)
Ta_em=quantile(Tsample0,probs=1-0.01)
em_pw=mean(Tsample1>Ta_em)
unlist(list(power=pw, empirical_power=em_pw))
##           power empirical_power 
##        0.359906        0.371300

Data replicates

Simulations are particularly useful for non-parametric tests (unlike the previous example).

In this case, drawing a sample of the statistic under a given hypothesis requires two steps (Monte Carlo simulations):

  • simulate \(N\) data replicates according to the hypothesis
  • for each replicate \(i\), calculate the statistic realization \(T_i\)

Then p value and power are estimated as above.
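
A minimal sketch of this two-step scheme on toy data (nothing here is from the slides; the statistic is the absolute difference in medians between two groups, and H0 is that both groups are \(N(0,1)\)):

set.seed(2)
stat_toy=function(x,y) abs(median(x)-median(y)) #a statistic with no standard null distribution
x_toy=rnorm(50,mean=0.3); y_toy=rnorm(50) #toy 'observed' data
Tobs_toy=stat_toy(x_toy,y_toy)
Tsample0_toy=replicate(5000, stat_toy(rnorm(50),rnorm(50))) #replicates under H0 and their statistics
mean(Tsample0_toy>Tobs_toy) #empirical p value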

Focus on permutation-like methods

In the following we focus on ways of generating data replicates for testing whether two groups (eg cases vs controls) are the same:

  1. Empirical p values through permutations (classic)
  2. Empirical power through
    • Monte Carlo simulations (classic)
    • Weighted permutations (more original)

Simulations for case/control testing: empirical p values

Empirical p values through permutations

  • Suppose we have observations \(x_1,\ldots,x_{n_0}\sim F_1\) for controls and observations \(y_1,\ldots,y_{n_1}\sim F_2\) for cases
  • We want to test \[ \mbox{H0: } F_1=F_2\; \mbox{ versus }\; \mbox{ H1: }F_1\neq F_2 \] (case-control association can be seen as a particular case)
  • Let \(T\) be our statistic, eg \(T=|\bar{x}-\bar{y}|\)

Key fact: H0 is equivalent to saying that the observations are independent of the case/control status
\(\Rightarrow\) we can sample the data distribution under H0 simply by shuffling the case/control status: each permutation is a data replicate

Example

Back to the Mvalue example with \(T=|\bar{M}_{controls}-\bar{M}_{cases}|\)

tobs=abs(mean(Mvalue[Outcome==0])-mean(Mvalue[Outcome==1])) #observed statistic
one.perm=function(x,y){
  perm=sample(x) #shuffle the case/control labels
  abs(mean(y[perm==0])-mean(y[perm==1])) #statistic on the permuted data
}
N=10000
t=replicate(N, one.perm(Outcome,Mvalue))
pp=mean(t>tobs)
unlist(list(p_permutation=pp))
## p_permutation 
##        0.0496

Remarks

  • Which \(N\)? We can compute a CI for the estimated p in the usual way:
c(pp-1.96*sqrt(pp*(1-pp)/N),pp+1.96*sqrt(pp*(1-pp)/N))
## [1] 0.04534450 0.05385550
  • Be careful that the smallest possible p is \(1/N\) \(\Rightarrow\) \(N\) must be very large in multiple testing
  • Useful strategy: start with relatively small \(N\) and increase to large numbers only if p is small and you need more precision
  • It is a good idea to store all data replicates in a matrix, one replicate per column, and then compute the statistic column-wise (as sketched after this list):
apply(replicates,2,function_for_stat)
  • Important for the following: permutations preserve the fixed number of cases and controls!
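
A minimal sketch of the storage idea above, reusing the Mvalue example (the object names are illustrative):

replicates=replicate(1000, sample(Outcome)) #one shuffled phenotype vector per column
stat_fun=function(perm) abs(mean(Mvalue[perm==0])-mean(Mvalue[perm==1]))
t_perm=apply(replicates,2,stat_fun) #statistic computed column by column
mean(t_perm>tobs) #empirical p value, as before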

A more interesting application: minimum p value

  • Suppose we want to test case/control association with a block of 10 SNPs.
  • We can start by doing a Fisher exact test for each individual SNP, resulting in a p value \(p_j\), and then take the minimum p value as our statistic: \[ T=\min_{j=1,\ldots,10} p_j \]
  • We need to know the distribution of this statistic if we want to calculate a global p value!
  • This is not a standard distribution, so we turn to permutations

Example

Mock genotypic data for 10 SNPs, 250 cases and 250 controls

We load the data

load('data/ex_snps.Rda')
head(cbind(pheno,geno))
##   pheno snp1 snp2 snp3 snp4 snp5 snp6 snp7 snp8 snp9 snp10
## 1     0    2    0    0    0    0    0    0    1    0     1
## 2     0    1    0    0    0    0    1    1    2    0     0
## 3     0    0    1    1    0    0    0    1    0    1     1
## 4     1    0    0    1    0    1    0    1    2    0     1
## 5     0    0    0    0    0    1    0    0    0    0     1
## 6     0    0    0    0    1    1    0    1    1    0     1

We calculate p values \(p_{obs,j}\) for individual SNPs and compute \(T_{obs}=\min_{j=1,\ldots,10} p_{obs,j}\)

pvalues_obs=rep(NA,10) #one p value per SNP
for(snp in 1:10) pvalues_obs[snp]=fisher.test(pheno,geno[,snp])$p

We calculate p values \(p^r_j\) for each permutation \(r\) and compute \(T^r=\min_{j=1,\ldots,10} p^r_j\)

T_obs=min(pvalues_obs)
replicates0=matrix(NA,nrow=500,ncol=1000) #stores one permutation per column for later use
pvalues0=matrix(NA,nrow=10,ncol=1000) #stores the SNP p values in columns, one per replicate
for(repl in 1:1000){ 
  perm=sample(pheno)
  replicates0[,repl]=perm
  for(snp in 1:10){
  pvalues0[snp,repl]=fisher.test(perm,geno[,snp])$p
  }
}
T_sample0=apply(pvalues0,2,min)  

At last we calculate the empirical p value for \(T\) as \[ \frac{\# \{T^r < T_{obs}\}}{\# \mbox{ replicates}} \]

mean(T_sample0<T_obs)
## [1] 0.008

Simulations for case/control testing: empirical power

Empirical power with a focus on large scale association studies

  • Apart from a few simple situations (eg t-test, \(\chi^2\)-test…), the distribution of the test statistic \(T\) under H1 is not known, neither exactly nor asymptotically
  • This is typical of multi-marker methods for GWAS and other large scale genetic studies
  • In these cases power is usually estimated by analysing many data replicates under H1
  • In association studies H1 is the assumption of a disease model ie a penetrance function
  • In symbols:
    • \(n_1=\) number of cases, \(n_0\) number of controls, \(n=n_0+n_1\)
    • \(X^j_i=\) genotype of individual \(i\) for marker \(j\)
    • \(Y_i=\) phenotype (case/control) of \(i\)
    • H1: penetrance function \(\pi_i=Pr(Y_i=1|X_i)\)

Which simulations under H1?

  • Important constraint: simulations should all preserve the fixed number of cases and controls

  • Naive protocol (a tiny numeric illustration follows this list):
    • draw individual phenotypes given their real genotypes by sampling \(Y_i|X_i\sim\mathcal{B}(\pi_i)\)
    • Problem: \(\pi_i\) is small \(\Rightarrow\) we end up with very few cases!
  • Two alternative approaches:

    1. Simulation of both genotypes and phenotypes for a very large population:
      • Simulate genotypes for \(m\) individuals, with \(m\gg n\)
      • For each individual of this large population draw \(Y_i|X_i\) as above
      • Hope that \(m\) is large enough to have enough cases
    2. Simulation of genotypes given fixed phenotypes: \(X_i|Y_i\) has a multinomial distribution with parameter \(\propto \pi_i\times Pr(X_i)\). Approach implemented in HAPGEN [Su, Z. (2011). Bioinformatics, 27(16), 2304-2305]
  • Common problem: a genotypic model is needed! Not a problem for independent markers but what about mimicking LD patterns?
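
A tiny numeric illustration of the problem with the naive protocol (a toy computation using the SNP dataset loaded earlier and the same additive disease model used in the waffect example below):

set.seed(3)
pi_naive=1/(1+exp(-log(0.1)-log(1.6)*geno[,8])) #penetrances under the disease model
y_naive=rbinom(500,size=1,prob=pi_naive) #draw each phenotype independently given the genotype
sum(y_naive) #far fewer cases than the 250 in the observed data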

Problems with first approach (i)

Slow, requires a lot of memory

Ref: Alarcon, F. and Perduca, V. Work in progress

Problems with first approach (ii)

Note: raising the disease prevalence to obtain more cases is not an option, as this will bias the power estimates

Ref: Perduca, V. (2012). Hum Hered, 73:95-104.

A possible solution: weighted permutations

  • waffect is a recently introduced algorithm that assigns a phenotype to each individual
    • conditionally on genotype
    • according to the fixed disease model
    • in a way that preserves the fixed number of cases
  • Advantages:
    • fast, only phenotypes are simulated
    • one can use original genotypic datasets
    • R package available
    • One simple function, whose only inputs are the probabilities \(\pi_i\)

Ref: Perduca, V. (2012). Hum Hered, 73:95-104.

How it works

  • waffect is based on a Belief Propagation algorithm for a simple HMM-like graphical model
  • the idea is to sample the phenotypes recursively. Let \(Z_i\) be the number of cases assigned among individuals \(1,\ldots,i\); the steps are
    • draw \(Y_1\) conditionally on the fixed total number of cases \(Z_n=n_1\)
    • draw \(Y_2\) conditionally on \(Z_1\) and \(Z_n=n_1\)
    • draw \(Y_3\) conditionally on \(Z_2\) and \(Z_n=n_1\)
    • etc
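
To see what these conditional draws look like, here is a brute-force check on a toy case (4 individuals, 2 cases, arbitrary \(\pi_i\)); waffect obtains the same conditional probabilities efficiently through belief propagation instead of enumeration:

pi_toy=c(0.1,0.3,0.2,0.4) #arbitrary penetrances for 4 individuals
configs=expand.grid(rep(list(0:1),4)) #all 2^4 possible phenotype vectors
w=apply(configs,1,function(y) prod(ifelse(y==1,pi_toy,1-pi_toy))) #unconditional probabilities
keep=rowSums(configs)==2 #keep only assignments with exactly n1=2 cases
sum(w[keep & configs[,1]==1])/sum(w[keep]) #Pr(Y_1=1 | Z_n=2), the law from which Y_1 is drawn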

Example

  • We calculate the empirical power of the test looking for a signal in the 10 SNPs by simulating phenotypes under H1 using waffect
  • For H1 we take an additive disease model with prevalence \(f_0=0.1\) and OR \(=1.6\) on SNP 8: \[ \pi_i= \frac{1}{1+\exp(-\log(f_0)-\log(OR)\times snp_8)} \]
  • Recall that
    • \(T=\min_{j=1,\ldots,10} p_j\), so here small values of \(T\) are the extreme ones
    • \(\hat{\mbox{power}} = \frac{\#\{ \mbox{ H1 } T \mbox{ realizations } < T_\alpha\}}{N}\), where \(T_\alpha\) is now the \(\alpha\)-quantile of \(T\) under H0.

We first define a disease model for H1 and calculate \(T_\alpha\) using the sample of \(T\) under H0 previously calculated

pi=1/(1+exp(-log(0.1)-log(1.6)*geno[,8])) #penetrances under the H1 disease model
threshold=quantile(T_sample0,probs=0.05) #T_alpha: alpha-quantile of T under H0 (small T is extreme)

We use waffect to simulate replicates with 250 cases and 250 controls according to \(\pi_i\)

require(waffect)
pvalues1=matrix(NA,nrow=10,ncol=1000) #stores the SNP p values in columns, one per replicate
for(repl in 1:1000){ 
  wperm=waffect(prob=pi,count=250,label=c(1,0))
  for(snp in 1:10){
    pvalues1[snp,repl]=fisher.test(wperm,geno[,snp])$p
  }
}
T_sample1=apply(pvalues1,2,min)  

We estimate the power

mean(T_sample1<threshold)
## [1] 0.585

Applications to the multiple testing problem

Family Wise Error Rate

  • Standard individual marker approaches for GWAS consist in testing for each SNP \[ \mbox{H0}_j: \mbox{ SNP}_j \mbox{ not associated vs H1}_j: \mbox{SNP}_j \mbox{ associated } \]
  • If there are \(p\) SNPs and each test is done at threshold \(\alpha\), we expect \(p\times \alpha\) false positives (type I errors) when no SNP is associated. Eg \(p=10^6\) and \(\alpha=0.05 \Rightarrow\) on average 50000 false positives!


  • The Family Wise Error Rate (FWER) is the probability of having at least one false positive when H0 is true for all \(p\) tests.

  • If all tests are independent (not realistic because of LD): \[ FWER = Pr(\exists ~1 \leq j \leq p \mbox{ s.t. } H0_{j} \mbox{ rejected } | \forall~ j\, H0_{j} \mbox{ holds}) = 1-(1-\alpha)^p \] (a quick numerical check follows this list)
  • If the tests are dependent: FWER \(<1-(1-\alpha)^p\)
  • We avoid type I error rate inflation by doing each test at a level \(\alpha\) such that the FWER is \(\leq \alpha'\) where \(\alpha'\) is a fixed threshold, for instance \(\alpha'=0.05\)
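
A quick check of the independent-tests formula with the numbers of the running example (10 SNPs each tested at \(\alpha=0.05\)):

alpha=0.05; nsnps=10
1-(1-alpha)^nsnps #probability of at least one false positive, about 0.40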

Bonferroni correction

If there are \(p\) SNPs, do each test at the threshold \[ \alpha=\frac{\alpha'}{p}, \] then FWER \(\leq \alpha'\).


Proof.

\[ \begin{align} \mbox{FWER} & = Pr\Big(\bigcup_{j=1}^p \{H0_j \mbox{ rejected}\} \,\Big|\, \mbox{all } H0_j \mbox{ hold}\Big)\\ & \leq \sum_{j=1}^p Pr(H0_j \mbox{ rejected }|H0_j \mbox{ holds}) \\ & = \sum_{j=1}^p \frac{\alpha'}{p} \\ & = \alpha'. \end{align} \]

The problem with Bonferroni correction

  • \(\frac{\alpha'}{p}\) is very small \(\Rightarrow\) power loss

  • The loss in power is even greater if the tests are dependent: if only \(q\) of the \(p\) tests are truly independent, ideally one would only need the larger threshold \[ \frac{\alpha'}{q} > \frac{\alpha'}{p} \]

Example: independent SNPs

  • The SNPs in the previous dataset are independent
  • To estimate the FWER we use the previously generated data replicates under H0
  • For each replicate, we count a false positive if at least one p value is \(<0.005\), ie if their min is \(<0.005\)
  • We check that if we test each SNP at the threshold \[ \alpha=\frac{\alpha'}{p}=\frac{0.05}{10}=0.005, \]
    the FWER estimate is close to \(\alpha'=0.05\)
mean(apply(pvalues0,2,min)<0.05/10)
## [1] 0.062

Example: correlated SNPs

Same analysis with 10 correlated SNPs:

load('data/ex_snps_d.Rda')
require(corrplot); corrplot(corr=cor(geno_d),method='square',type='upper',tl.pos='n')

pvalues0_d=matrix(NA,nrow=10,ncol=1000) #stores the SNP p values in columns, one per replicate
for(repl in 1:1000){ 
  perm=sample(pheno)
  for(SNP in 1:10){
    pvalues0_d[SNP,repl]=fisher.test(perm,geno_d[,SNP])$p
  }
}
T_sample0_d=apply(pvalues0_d,2,min) 

This time the FWER estimate is lower than \(\alpha'=0.05\):

mean(apply(pvalues0_d,2,min)<0.05/10)
## [1] 0.029

Several approaches to overcome the problem

  • Several methods have been proposed to choose \(q\):
    • \(q=\) number of LD blocks
    • \(q=\) an effective number of independent tests derived from the eigenvalues of the correlation matrix of the SNP allele counts
  • Several studies based on HapMap data suggested a threshold of \(5\times 10^{-8}\) for European studies

  • In the following we discuss an alternative based on permutations

 

Ref. Sham, P. C. (2014). Nat. Rev. Genet, 15(5), 335-346.

Permutation procedure for FWER control

  • Obtain the empirical distribution of \(T=\min_{j=1,\ldots,10} p_j\) through permutations
  • Take the \(\alpha'\)-quantile \(T_{\alpha'}\)
  • Test each SNP at the threshold \(T_{\alpha'}\)

Why it works

Proof. \[ \begin{align} FWER & = Pr_{H0}(\exists j \mbox{ s.t. } H0_{j} \mbox{ rejected }) \\ & = Pr_{H0}(\exists j \mbox{ s.t. } p_{obs,j}< T_{\alpha'}) \\ & = Pr_{H0}(\min_j p_{obs,j} < T_{\alpha'}) \\ & = Pr_{H0}(T<T_{\alpha'})=\alpha'. \end{align} \]



Equivalently, define for each SNP \(j\) the empirical adjusted p value \[ p^{adj}_j=\frac{r_j+1}{N+1} \] where \(r_j\) is the number of permutation values \(T^r\) smaller than \(p_{obs,j}\), and reject H0\(_j\) if \(p^{adj}_j<\alpha'\)
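
A sketch of these adjusted p values for the 10 correlated SNPs (the observed per-SNP p values are not computed elsewhere in these slides, so we compute them here; \(N=1000\) permutations as above):

pvalues_obs_d=rep(NA,10)
for(snp in 1:10) pvalues_obs_d[snp]=fisher.test(pheno,geno_d[,snp])$p #observed p values
p_adj=sapply(pvalues_obs_d,function(p) (sum(T_sample0_d<p)+1)/(1000+1)) #adjusted p values
which(p_adj<0.05) #SNPs for which H0_j is rejected while controlling the FWER at 0.05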

Example: back to the 10 correlated SNPs

We compute \(T_{\alpha'}\)

Tap=quantile(T_sample0_d,probs = 0.05)
unlist(list(threshold_perm=Tap))
## threshold_perm.5% 
##       0.008563128

The estimate of the FWER is \(\alpha'=0.05\)

mean(apply(pvalues0_d,2,min)<Tap)
## [1] 0.05