Feasibility of Learning

Feasibility of Learning

我们希望找到无限逼近实际函数f的假设函数g

  • $g(hypothesis) \approx f(real function)$ inside $D$: sure!

  • $g(hypothesis) \approx f(real function)$ outside $D$: No! (what we want)

Hoeffding’s Inequality

  • valid for all N and $\epsilon$
  • does not depend on $\mu$, no need to know $\mu$
  • larger sample size N or looser gap $\epsilon$ => higher probability for $\nu \approx \mu$

Connection to Learning

  • if large N, can probably infer unknow $| h(x) \neq f(x) |$ by know $| h(x_{n} \neq y_n)|$
  • $E_{in}(h)$: 在已知的样本里,假设函数与实际函数不相等的概率。$E_{out}(h)$: 在所有样本里,上述二者不相等的概率。
  • $E_{in}(h)$ small is a good choice, but $E_{in}(h)$ is not always small.

multiple h

掷硬币,求出现反面的概率。

bad sample

  • 掷骰子150次,每次掷5下,有超过99%的概率会出现连续5次都是正面。
  • 这就是一个bad sample,因为其使得$E_{in}(h)$ far away $E_{out}(h)$
    • $E_{in}(h)$ = 0
    • $E_{out}(h)$ = $1/2$

对于M个假设函数,出现 bad sample 的概率:
feasibility03

  • 所有假设函数都是安全的
  • 最优P为lowest $E_{in}(h_{m})$
Donate article here