What is a p-value?

Why is it so successful in science?

In some sense it offers a first line of defense against being fooled by randomness, separating signal from noise.

p-values tell you how surprising the data is, assuming there is no effect.
Formal definition:

A p-value is the probability of getting the observed or more extreme data, assuming the null hypothesis is true.

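In other words: assuming the null hypothesis (H0) is correct, it is the probability of seeing the observed result or a more extreme one.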

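Does talking on the phone while driving increase the risk of a car accident? To find out, design an experiment: one group of drivers talks on the phone while driving, another group does not, and the accident rates of the two groups are then compared.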

The observed difference is almost never exactly zero. A difference of, e.g., 0.11 can mean one of two things:
1. It is just random noise.
2. It reflects a real difference.
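To get a feel for how large a difference random noise alone can produce, the experiment can be simulated under the assumption that phoning has no effect. The sketch below is only an illustration: the group size (100 drivers), the shared accident rate (0.2), and the number of simulations are hypothetical values that do not come from any real study, and the function names are made up for this example.

```python
import random


def simulate_null_differences(n_per_group, base_rate, n_sim, seed=0):
    """Simulate the difference in accident rates when phoning has NO effect."""
    rng = random.Random(seed)
    differences = []
    for _ in range(n_sim):
        # Under the null hypothesis both groups share the same accident probability.
        phone = sum(rng.random() < base_rate for _ in range(n_per_group))
        no_phone = sum(rng.random() < base_rate for _ in range(n_per_group))
        differences.append((phone - no_phone) / n_per_group)
    return differences


def simulated_p_value(differences, observed_diff):
    """Share of null simulations at least as extreme as the observed difference."""
    tolerance = 1e-12  # guards against floating-point rounding at the boundary
    extreme = sum(abs(d) >= observed_diff - tolerance for d in differences)
    return extreme / len(differences)


# Hypothetical numbers, chosen only for illustration.
null_diffs = simulate_null_differences(n_per_group=100, base_rate=0.2, n_sim=2000)
p_value = simulated_p_value(null_diffs, observed_diff=0.11)

non_zero_share = sum(d != 0 for d in null_diffs) / len(null_diffs)
print(f"share of simulated differences that are non-zero: {non_zero_share:.2f}")
print(f"simulated two-sided p-value for a difference of 0.11: {p_value:.3f}")
```

Most simulated differences are non-zero even though no true effect exists, which is why a non-zero difference on its own says little. The simulated p-value is the share of these noise-only differences that are at least as large as the one observed.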

Assume the null hypothesis is true, that is, the difference follows a normal distribution centered at 0.

Assuming the null hypothesis is true means that most of the data will fall between the two critical values.
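As a rough illustration of what "most of the data" means, the critical values can be computed for a normal distribution centered at 0. The sketch assumes a standardized difference (standard deviation 1) and a two-sided significance level of 0.05; both are assumptions made for this example.

```python
from statistics import NormalDist

alpha = 0.05  # assumed two-sided significance level
null = NormalDist(mu=0.0, sigma=1.0)  # assumed standardized null distribution

# Cut off alpha / 2 of the probability in each tail.
lower = null.inv_cdf(alpha / 2)
upper = null.inv_cdf(1 - alpha / 2)

# Probability of landing between the critical values when the null is true.
inside = null.cdf(upper) - null.cdf(lower)

print(f"critical values: {lower:.2f} and {upper:.2f}")
print(f"probability of falling between them under the null: {inside:.2f}")
```

Under these assumptions the critical values are about -1.96 and 1.96, and 95% of the differences expected under the null hypothesis fall between them.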

A p-value is the probability of the data, not the probability of a theory.
You can't get the probability that the null hypothesis is true, given the data, from a p-value: \[P(D \mid H) \neq P(H \mid D)\]
A single p-value is not enough to declare a scientific discovery; only when we can repeatedly observe something can we consider it a reliable observation.
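A small Bayes' rule calculation can show how far apart \(P(D \mid H)\) and \(P(H \mid D)\) can be. In this sketch, "data" stands for "a significant result"; the prior probability of the null hypothesis and the power of the test (0.8) are hypothetical inputs that a p-value alone does not provide.

```python
def probability_null_given_significant(prior_null, alpha, power):
    """Bayes' rule: P(H0 | significant result)."""
    significant_and_null = alpha * prior_null          # false positives
    significant_and_effect = power * (1 - prior_null)  # true positives
    return significant_and_null / (significant_and_null + significant_and_effect)


# Hypothetical priors: the same alpha gives very different posteriors.
for prior_null in (0.5, 0.9):
    posterior = probability_null_given_significant(prior_null, alpha=0.05, power=0.8)
    print(f"P(H0) = {prior_null:.1f} -> P(H0 | significant) = {posterior:.2f}")
```

With these made-up inputs, the probability of a significant result under the null is 0.05 in both cases, yet the probability that the null is true after a significant result is about 0.06 in one case and 0.36 in the other.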

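Take a coin-toss experiment as a worked example of hypothesis testing:
- Hypothesis: the coin is fair.
- Test: assume the hypothesis holds, toss the coin ten times, and check whether the result is consistent with the hypothesis.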

Repeated coin tosses follow a binomial distribution, that is: \[X\sim B(n,\mu)\]

Here, \(n\) is the number of tosses and \(\mu\) is the probability of heads.
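The binomial distribution can be written out directly. The sketch below lists the probability of every possible number of heads in ten tosses of a fair coin; the function name `binomial_pmf` is just a label chosen for this example.

```python
from math import comb


def binomial_pmf(k, n, mu):
    """P(X = k) for X ~ B(n, mu): k heads in n tosses."""
    return comb(n, k) * mu**k * (1 - mu) ** (n - k)


n, mu = 10, 0.5  # ten tosses of a fair coin
for k in range(n + 1):
    print(f"P(X = {k:2d}) = {binomial_pmf(k, n, mu):.4f}")
```

The probabilities peak at five heads and fall off symmetrically toward zero and ten heads, which is what makes outcomes far from the center "surprising" under the fair-coin hypothesis.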

If we assume the coin is fair, the number of heads in 1000 tosses should follow this distribution:

\[X\sim B(1000, 0.5)\]

If the 1000 tosses produce 530 heads, the p-value is the probability of the region made up of 530 and all more extreme outcomes.
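This tail probability can be computed exactly for \(B(1000, 0.5)\). The sketch below reports both a one-sided version (530 or more heads) and a two-sided version (also counting 470 or fewer heads); which of the two counts as "more extreme" is a choice the notes above do not fix, so both are shown.

```python
from math import comb


def upper_tail(k_observed, n):
    """Exact P(X >= k_observed) for X ~ B(n, 0.5), using integer arithmetic."""
    favourable = sum(comb(n, k) for k in range(k_observed, n + 1))
    return favourable / 2**n


n, heads = 1000, 530
one_sided = upper_tail(heads, n)
# For a fair coin the distribution is symmetric, so 470 or fewer heads is
# exactly as extreme as 530 or more.
two_sided = min(1.0, 2 * one_sided)

print(f"one-sided p-value: {one_sided:.4f}")
print(f"two-sided p-value: {two_sided:.4f}")
```

Working with exact integer counts avoids the rounding problems that come from multiplying 1000 very small probabilities together.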

When tossing a coin 1000 times, whether we call the coin unfair after seeing 530 heads or only after seeing 580 heads is a subjective standard.

By convention, \[\text{p-value}\leq 0.05\] is taken as sufficient grounds to conclude that the hypothesis is incorrect.

As the figure shows, we can then say that the initial hypothesis is "significantly" wrong, that is, "the coin is unfair".
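The decision rule can be sketched for the two outcomes discussed above, 530 and 580 heads in 1000 tosses. The example assumes a two-sided test and the conventional 0.05 threshold; the function name and the wording of the verdict are illustrative only.

```python
from math import comb

ALPHA = 0.05  # conventional threshold


def two_sided_p_value(heads, n):
    """Exact two-sided p-value for a fair coin, X ~ B(n, 0.5)."""
    distance = abs(heads - n / 2)
    # Count every outcome at least as far from n / 2 as the observed one.
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    return extreme / 2**n


for heads in (530, 580):
    p = two_sided_p_value(heads, 1000)
    verdict = "reject" if p <= ALPHA else "do not reject"
    print(f"{heads} heads: p = {p:.2g} -> {verdict} 'the coin is fair'")
```

Under these assumptions 580 heads leads to rejecting the fair-coin hypothesis while 530 heads does not. A one-sided test would halve both p-values and move 530 heads below the threshold, which shows how much the verdict depends on the chosen standard.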