What is a p-value?

Why is it so successful in science?

In some sense it offers a first line of defense against being fooled by randomness, separating signal from noise.

Definition

  • p-values tell you how surprising the data is, assuming there is no effect.
  • formal definition:
    A p-value is the probability of getting the observed or more extreme data, assuming the null hypothesis is true.

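In other words: assuming the null hypothesis (H0) is true, it is the probability of seeing the observed outcome or a more extreme one.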

Example

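Does talking on the phone while driving increase the risk of a car accident?
Design an experiment: one group of drivers talks on the phone while driving, the other group does not, and then compare the accident rates of the two groups.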

  • The difference is never exactly zero. A difference of, e.g., 0.11 can mean one of two things:
    1. It is probably just random noise.
    2. It is probably a real difference.

Null hypothesis

Assume the null hypothesis is true, i.e. the observed difference follows a normal distribution centered at 0.

  • Assuming the null hypothesis is true means most of the data will fall between the two critical values of that distribution.
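As a rough sketch of where those two critical values sit, the snippet below uses a standard normal distribution (centered at 0, standard deviation 1) and a 5% cutoff. Both the unit standard deviation and the 5% level are assumptions chosen for illustration; the notes above do not fix them.

```python
from statistics import NormalDist

# Assumed null distribution: normal, centered at 0, standard deviation 1.
null = NormalDist(mu=0.0, sigma=1.0)
alpha = 0.05  # assumed share of probability left in the two tails combined

lower = null.inv_cdf(alpha / 2)       # left critical value
upper = null.inv_cdf(1 - alpha / 2)   # right critical value
inside = null.cdf(upper) - null.cdf(lower)

print(f"critical values: {lower:.2f} and {upper:.2f}")
print(f"probability between them: {inside:.2f}")
```

Under these assumptions about 95% of the data falls between roughly -1.96 and 1.96, and anything outside counts as surprising.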

Important notes

  1. A p-value is the probability of the data, not the probability of a theory.
  2. You can’t get the probability the null hypothesis is true, given the data, from a p-value.
  3. A single p-value is not enough to declare a scientific discovery; only when we can repeatedly observe something can we consider it a reliable observation.

How to use p-values correctly?

  1. Use p-values as a rule to guide behavior in the long run.
  2. Do not say "because $p < x$, the theory is correct." Say instead "because $p < x$, the result is consistent with what we expected."
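The "long run" reading can be checked with a small simulation: run many experiments in which the null hypothesis really is true and count how often $p$ falls below the cutoff. This is a sketch under stated assumptions: the data are drawn from a normal distribution with known standard deviation 1, the test is a two-sided z-test, the cutoff is 0.05, and `two_sided_z_p_value` is a hypothetical helper.

```python
import random
from statistics import NormalDist, fmean

def two_sided_z_p_value(sample, sigma=1.0):
    """p-value of a z-test for 'mean = 0' with known standard deviation."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(42)
alpha = 0.05
n_experiments = 5000

false_alarms = 0
for _ in range(n_experiments):
    # The null hypothesis is true here: the real mean is exactly 0.
    sample = [rng.gauss(0.0, 1.0) for _ in range(20)]
    if two_sided_z_p_value(sample) < alpha:
        false_alarms += 1

rate = false_alarms / n_experiments
print(f"experiments with p < {alpha}: {rate:.3f}")
```

The rate should land close to the cutoff itself, which is the sense in which a p-value threshold controls how often we are fooled over many studies rather than in any single one.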

How to calculate a p-value?

Take a coin-toss experiment as an example of a complete hypothesis test.

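  • Hypothesis: the coin is fair.
  • Test: assume the hypothesis holds, toss the coin ten times, and check whether the result is consistent with the hypothesis.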

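Repeated coin tosses follow a binomial distribution, that is:

$$X \sim B(n, \mu)$$

where $X$ is the number of heads, $n$ is the number of tosses, and $\mu$ is the probability of the coin landing heads up.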
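To make the binomial distribution concrete, here is a minimal sketch of its probability mass function, $P(X = k) = \binom{n}{k}\mu^k(1-\mu)^{n-k}$, evaluated for ten tosses of a fair coin. The function name `binomial_pmf` is chosen for this illustration only.

```python
from math import comb

def binomial_pmf(k, n, mu):
    """Probability of exactly k heads in n tosses with heads probability mu."""
    return comb(n, k) * mu**k * (1 - mu) ** (n - k)

n, mu = 10, 0.5  # ten tosses of a fair coin
pmf = [binomial_pmf(k, n, mu) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
```

The probabilities peak at 5 heads and fall off symmetrically on both sides, which is the shape the hypothesis test relies on.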

Assuming the coin is fair ($\mu = 0.5$), the number of heads in 1000 tosses follows $B(1000, 0.5)$, a bell-shaped distribution centered at 500 heads.

If the 1000 tosses come up heads 530 times, the p-value is the total probability of 530 and all more extreme outcomes under that distribution.
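The tail probability for 530 heads can be computed exactly by counting outcomes. The notes do not say whether "more extreme" covers one tail or both, so this sketch returns both readings; `fair_coin_p_values` is a hypothetical helper and assumes a fair coin under the null hypothesis.

```python
from math import comb

def fair_coin_p_values(heads, n):
    """Exact (one-sided, two-sided) p-values for `heads` in n fair tosses."""
    total_outcomes = 2**n
    # One-sided: this many heads or more.
    upper_tail = sum(comb(n, k) for k in range(heads, n + 1))
    # Two-sided: at least this far from n/2 in either direction.
    deviation = abs(2 * heads - n)
    both_tails = sum(
        comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= deviation
    )
    return upper_tail / total_outcomes, both_tails / total_outcomes

one_sided, two_sided = fair_coin_p_values(530, 1000)
print(f"one-sided p-value: {one_sided:.4f}")
print(f"two-sided p-value: {two_sided:.4f}")
```

For 530 heads the one-sided value comes out at roughly 0.03 and the two-sided value at roughly twice that, so which tail definition is used matters for borderline results.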

What is the significance level $\alpha$?

When tossing a coin a thousand times, whether we call the coin unfair after 530 heads or only after 580 heads is a subjective standard.

By convention, once $p \le \alpha = 0.05$, we can consider the hypothesis incorrect.

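On the distribution, this corresponds to the tail region whose total probability is $\alpha$. When the observed result lands there, we can say the initial hypothesis is “significantly” wrong, that is, “the coin is unfair”.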
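The decision rule can be sketched as a comparison of the p-value against $\alpha$. The snippet below assumes a two-sided test and the conventional $\alpha = 0.05$; both are choices made for this illustration, and `two_sided_p_value` and `is_significant` are hypothetical helpers.

```python
from math import comb

def two_sided_p_value(heads, n):
    """Exact two-sided p-value for `heads` in n tosses of a fair coin."""
    deviation = abs(2 * heads - n)
    extreme = sum(
        comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= deviation
    )
    return extreme / 2**n

def is_significant(heads, n, alpha=0.05):
    """Reject 'the coin is fair' when the p-value is at most alpha."""
    return two_sided_p_value(heads, n) <= alpha

n = 1000
for heads in (530, 580):
    p = two_sided_p_value(heads, n)
    print(f"{heads} heads: p = {p:.2e}, significant = {is_significant(heads, n)}")

# Smallest head count above n/2 at which the fair-coin hypothesis is rejected.
threshold = next(h for h in range(n // 2, n + 1) if is_significant(h, n))
print(f"rejection starts at {threshold} heads")
```

Under these assumptions 580 heads is clearly significant while 530 heads is not, although the one-sided tail for 530 would fall below 0.05. Changing `alpha` or the tail definition moves the threshold, which is the subjectivity described above.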
