How to compute p-values for a bootstrap distribution

I was recently asked the following question:

I am using bootstrap simulations to compute critical values for a statistical test. Suppose I have a test statistic for which I want a p-value. How do I compute this?

The answer to this question doesn't require knowing anything about bootstrap methods. An equivalent formulation (for a one-sided p-value) is "How do I count the number of values in a vector that are greater than a given value?" For p-values, you assume that the vector contains random values sampled from the null distribution.

You can find a fully worked example of a bootstrap computation for a paired t test on pages 11–14 of my SAS Global Forum paper, "Rediscovering SAS/IML Software: Modern Data Analysis for the Practicing Statistician." The empirical p-value is computed on page 14.

Here's one way to think about this problem. Suppose a vector, s, contains random values from the null distribution. In a bootstrap situation, this means that s1, s2, ..., sN are the bootstrapped statistics, where si is the statistic computed on the ith bootstrap sample, and where each bootstrap sample is sampled from the null distribution (that is, according to the null hypothesis). Let s0 be the observed value of the test statistic. Then a one-sided (upper-tail) empirical p-value for s0 can be computed in either of two ways:

1. The simplest computation is to apply the definition of a p-value. To do this, count the number of values (statistics) that are greater than or equal to the observed value, and divide by the number of values. In code, pval = sum(s >= s0)/N;

2. The previous formula has a bias due to finite sampling. Some authors suggest the modification pval = (1+sum(s >= s0))/(N+1); For example, see Davison and Hinkley (1997), Bootstrap Methods and their Application, p. 141. The two formulas are essentially the same when the number of values, N, is large.
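For readers who want to try the two formulas outside of SAS/IML, here is a minimal sketch in plain Python. The function name empirical_pvalue, the adjusted flag, and the ten values in s are made up for illustration; in practice, s would hold the N bootstrapped statistics and s0 the observed statistic.

```python
def empirical_pvalue(s, s0, adjusted=False):
    """One-sided (upper-tail) empirical p-value of s0 relative to the values in s.

    s        : statistics assumed to be sampled from the null distribution
    s0       : observed value of the test statistic
    adjusted : if True, use (1 + count)/(N + 1) instead of count/N
    """
    n = len(s)
    # Count the values that are greater than or equal to the observed value
    count = sum(1 for x in s if x >= s0)
    if adjusted:
        return (1 + count) / (n + 1)
    return count / n


# Hypothetical values standing in for N = 10 bootstrapped statistics
s = [0.2, -1.1, 0.7, 1.9, -0.4, 2.3, 0.1, -0.8, 1.2, 0.5]
s0 = 1.0  # three of the values (1.9, 2.3, 1.2) are >= s0

p_simple = empirical_pvalue(s, s0)                    # 3/10
p_adjusted = empirical_pvalue(s, s0, adjusted=True)   # (1+3)/(10+1)
print(p_simple, p_adjusted)

# When s0 exceeds every value in s, the simple formula returns 0,
# whereas the adjusted formula returns 1/(N+1)
p_extreme_simple = empirical_pvalue(s, 5.0)
p_extreme_adjusted = empirical_pvalue(s, 5.0, adjusted=True)
```

One practical difference between the two versions: the adjusted formula can never return a p-value of exactly zero, because its smallest possible value is 1/(N+1).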

See also my article on computing empirical estimates from the data.

Incidentally, if you'd like to run the bootstrap computations yourself, you can download the airlines data that I used in my SAS Global Forum paper.
