
# Background

1. So far, the most striking successes in deep learning have involved discriminative models.
2. Deep generative models have had less of an impact, due to the difficulty of approximating the many intractable probabilistic computations that arise in maximum likelihood estimation and related strategies, and due to the difficulty of leveraging the benefits of piecewise linear units in the generative context.

# Methods

$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log (1-D(G(\boldsymbol{z})))]$

To learn the generator’s distribution $$p_{g}$$ over data $$x$$, we define a prior on input noise variables $$p_z(z)$$, then represent a mapping to data space as $$G(z; \theta_g)$$, where $$G$$ is a differentiable function represented by a multilayer perceptron with parameters $$\theta_g$$. We also define a second multilayer perceptron $$D(x; \theta_d)$$ that outputs a single scalar. $$D(x)$$ represents the probability that $$x$$ came from the data rather than from $$p_{g}$$. We train $$D$$ to maximize the probability of assigning the correct label to both training examples and samples from $$G$$. We simultaneously train $$G$$ to minimize $$\log(1-D(G(\boldsymbol{z})))$$.

In practice, equation 1 may not provide sufficient gradient for $$G$$ to learn well. Early in learning, when $$G$$ is poor, $$D$$ can reject samples with high confidence, so $$\log(1-D(G(\boldsymbol{z})))$$ saturates. Rather than training $$G$$ to minimize $$\log(1-D(G(\boldsymbol{z})))$$, we can train $$G$$ to maximize $$\log D(G(\boldsymbol{z}))$$.
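The saturation argument can be checked directly by comparing the derivatives of the two objectives with respect to $d = D(G(z))$: the original objective yields $-1/(1-d)$, while the heuristic (minimizing $-\log d$) yields $-1/d$. A small numeric sketch, with the variable names chosen here for illustration:

```python
def saturating_grad(d):
    # d/dd of log(1 - d): the gradient signal G receives under equation 1
    return -1.0 / (1.0 - d)

def nonsaturating_grad(d):
    # d/dd of -log(d): the gradient under the maximize-log-D(G(z)) heuristic
    return -1.0 / d

# Early in training, D rejects generated samples confidently, so d is tiny.
d = 1e-3
print(abs(saturating_grad(d)))     # close to 1: weak learning signal
print(abs(nonsaturating_grad(d)))  # close to 1000: strong learning signal
```

Both objectives have the same fixed point in $d$, but the heuristic gives much stronger gradients exactly where the generator needs them most.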

# Conclusion

This paper has demonstrated the viability of the adversarial modeling framework, suggesting that these research directions could prove useful.