# Think Bayesian & statistics review

## Main principles

- Use prior knowledge
- Choose the answer that best explains the observations
- Avoid extra assumptions

### Example

A man is running. Why?

- He is in a hurry
- He is doing sports (use principle 2 to exclude: he does not wear a sports suit, which contradicts the data)
- He always runs (use principle 3 to exclude)
- He saw a dragon (use principle 1 to exclude)

## Probability

When throwing a fair die, the probability of each side is 1/6.
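A minimal sketch to check the 1/6 figure by simulation. The function name `estimate_face_probability` and the fixed seed are illustrative assumptions, not part of the notes.

```python
import random
from fractions import Fraction

# Exact probability of any single face of a fair six-sided die.
p_face = Fraction(1, 6)

def estimate_face_probability(face, n_throws, seed=0):
    """Estimate P(die shows `face`) by simulating `n_throws` throws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws

estimate = estimate_face_probability(face=3, n_throws=100_000)
print(f"exact = {float(p_face):.4f}, simulated = {estimate:.4f}")
```

The simulated frequency should approach the exact value as the number of throws grows.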

## Random variable

### Discrete

Probability Mass Function (PMF)

### Continuous

Probability Density Function (PDF)
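A small sketch contrasting the two cases: a PMF assigns probabilities that sum to 1, while a PDF is a density that integrates to 1. The fair die, the standard normal, and the helper `integrate` are assumptions chosen for illustration.

```python
import math

# Discrete: the PMF of a fair die assigns a probability to each value.
pmf = {face: 1 / 6 for face in range(1, 7)}

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x (a density, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

pmf_total = sum(pmf.values())                  # PMF values sum to 1
pdf_total = integrate(normal_pdf, -8.0, 8.0)   # PDF integrates to (almost) 1
p_interval = integrate(normal_pdf, -1.0, 1.0)  # P(-1 <= X <= 1), about 0.68
```

For a continuous variable, only intervals carry probability. The density at a single point is not a probability.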

### Independence

X and Y are independent if:

$P(X, Y) = P(X) P(Y)$

- $P(X, Y)$: joint
- $P(X)$, $P(Y)$: marginals
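A sketch that tests independence by comparing the joint with the product of the marginals. The two distributions (two separate fair dice, and a pair where the second value copies the first) are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two fair dice thrown separately.
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

def marginal(joint, axis):
    """Sum the joint over the other variable to get a marginal."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], Fraction(0)) + p
    return out

def is_independent(joint):
    """True if joint(x, y) == marginal(x) * marginal(y) for every pair."""
    px, py = marginal(joint, 0), marginal(joint, 1)
    return all(
        joint.get((x, y), Fraction(0)) == px[x] * py[y] for x in px for y in py
    )

# A dependent pair: the second variable always copies the first.
copied = {(x, x): Fraction(1, 6) for x in range(1, 7)}

print(is_independent(joint), is_independent(copied))  # True False
```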

## Conditional probability

Probability of X given that Y happened: $P(X|Y) = \frac{P(X, Y)}{P(Y)}$

### Chain rule

$P(X, Y) = P(X|Y) P(Y)$

### Sum rule

$P(X) = \int P(X, Y) \, dY$ (a sum over the values of Y in the discrete case)
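Conditional probability, the chain rule, and the sum rule can be checked on a small discrete joint table. The weather and traffic numbers below are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(X, Y); X = weather, Y = traffic.
joint = {
    ("rain", "jam"): F(3, 10),
    ("rain", "clear"): F(1, 10),
    ("sun", "jam"): F(2, 10),
    ("sun", "clear"): F(4, 10),
}

def p_y(y):
    """Sum rule: P(Y=y) = sum over x of P(X=x, Y=y)."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def p_x_given_y(x, y):
    """Conditional probability: P(X=x | Y=y) = P(x, y) / P(y)."""
    return joint[(x, y)] / p_y(y)

# Chain rule: P(x, y) = P(x | y) * P(y) must hold for every cell.
chain_ok = all(p_x_given_y(x, y) * p_y(y) == p for (x, y), p in joint.items())

print(p_y("jam"), p_x_given_y("rain", "jam"), chain_ok)  # 1/2 3/5 True
```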

## Total probability

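- $B_1, B_2, \cdots$ are pairwise mutually exclusive, i.e. $B_i \cap B_j = \emptyset$ for $i \neq j$, $i, j = 1, 2, \ldots$, and $P(B_i) > 0$ for $i = 1, 2, \ldots$
- $B_1 \cup B_2 \cup \cdots = \Omega$; the collection of events $B_1, B_2, \cdots$ is then called a partition of the sample space $\Omega$

For any event $A$ and such a partition: $P(A) = \sum_i P(A|B_i) P(B_i)$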
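A sketch of the law of total probability, $P(A) = \sum_i P(A|B_i) P(B_i)$, over a two-event partition. The machines and defect rates are invented numbers.

```python
from fractions import Fraction as F

# Hypothetical partition: which machine produced an item.
prior = {"machine_a": F(6, 10), "machine_b": F(4, 10)}             # P(B_i)
p_defect_given = {"machine_a": F(1, 100), "machine_b": F(5, 100)}  # P(A | B_i)

def total_probability(prior, conditional):
    """P(A) = sum_i P(A | B_i) * P(B_i) over a partition {B_i}."""
    assert sum(prior.values()) == 1, "the B_i must cover the sample space"
    return sum(conditional[b] * prior[b] for b in prior)

p_defect = total_probability(prior, p_defect_given)
print(p_defect)  # 13/500
```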

## Bayes theorem

$P(\theta|X) = \frac{P(X|\theta) P(\theta)}{P(X)}$

- $\theta$: parameters
- $X$: observations
- $P(\theta|X)$: Posterior
- $P(X)$: Evidence
- $P(X|\theta)$: Likelihood
- $P(\theta)$: Prior
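A sketch of Bayes theorem over a discrete set of parameter values, with the evidence computed as a sum over them. The two coin hypotheses, the prior, and the observation (3 heads in a row) are assumptions for illustration.

```python
from fractions import Fraction as F

def posterior(prior, likelihood):
    """Bayes theorem: posterior = likelihood * prior / evidence."""
    evidence = sum(likelihood[t] * prior[t] for t in prior)  # P(X)
    return {t: likelihood[t] * prior[t] / evidence for t in prior}

# theta: which coin we hold; each coin has its own heads probability.
heads_prob = {"fair": F(1, 2), "biased": F(9, 10)}
prior = {"fair": F(9, 10), "biased": F(1, 10)}  # P(theta)

# X: three heads in a row, so P(X | theta) = heads_prob ** 3.
n_heads = 3
likelihood = {t: heads_prob[t] ** n_heads for t in heads_prob}

post = posterior(prior, likelihood)
print(post["biased"])  # 81/206, roughly 0.39
```

Even with a 10% prior, three heads already move the biased hypothesis to about 39%.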

## Bayesian approach to statistics

### Frequentist

- Objective
- $\theta$ is fixed, X is random
- Training: maximum likelihood

Maximum likelihood: frequentists try to find the parameters $\theta$ that maximize the likelihood, the probability of the data given the parameters: $\hat{\theta} = \arg\max_\theta P(X|\theta)$
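A sketch of maximum likelihood for a coin with unknown heads probability $\theta$. The data and the grid search are illustrative assumptions. For this model the maximizer has a closed form (the fraction of heads), which the grid search should reproduce.

```python
import math

def log_likelihood(theta, data):
    """log P(X | theta) for independent coin flips (1 = heads, 0 = tails)."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical observations: 7 heads of 10

# Grid search over theta in (0, 1); endpoints are skipped to avoid log(0).
grid = [i / 1000 for i in range(1, 1000)]
theta_mle = max(grid, key=lambda t: log_likelihood(t, data))

closed_form = sum(data) / len(data)  # fraction of heads
print(theta_mle, closed_form)
```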

### Bayesian

- Subjective
- $\theta$ is random, X is fixed
- Training (Bayes theorem)

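Bayesians instead try to compute the posterior, the probability of the parameters given the data.

- Classification
  - Training: compute the posterior over $\theta$ given the training data
  - Prediction: average the predictions over that posterior
- On-line learning (get posterior): the posterior after each observation becomes the prior for the next one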
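A sketch of on-line learning for a coin, assuming a Beta prior on the heads probability. This conjugate choice is not stated in the notes. It is used here because the posterior stays a Beta distribution, so each update only increments a counter. The data stream is hypothetical.

```python
def update(alpha, beta, x):
    """One on-line step: Beta(alpha, beta) prior plus one flip x gives the posterior."""
    return (alpha + 1, beta) if x == 1 else (alpha, beta + 1)

def posterior_mean(alpha, beta):
    """Mean of Beta(alpha, beta), a point estimate of the heads probability."""
    return alpha / (alpha + beta)

alpha, beta = 1, 1  # Beta(1, 1) is the uniform prior on [0, 1]
stream = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # 1 = heads, 0 = tails

for x in stream:
    # The posterior after this flip becomes the prior for the next flip.
    alpha, beta = update(alpha, beta, x)

print(alpha, beta, round(posterior_mean(alpha, beta), 3))  # 8 4 0.667
```

Processing the flips one at a time gives the same posterior as using all ten at once, which is what makes the on-line view convenient.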