GANs have a number of common failure modes. All of these common problems are
areas of active research. While none of these problems have been completely
solved, we'll mention some things that people have tried.
Vanishing Gradients
Research has suggested that if your
discriminator is too good, then generator training can fail due to vanishing
gradients. In effect,
an optimal discriminator doesn't provide enough information for the generator to
make progress.
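You can see the effect numerically by comparing the gradient of the original minimax generator loss, log(1 − D(G(z))), with the non-saturating variant when the discriminator confidently rejects a fake. This is an illustrative sketch, not code from the course:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# t is the discriminator's logit for a generated sample. When the
# discriminator confidently rejects the fake, t is very negative.
t = -10.0

# The gradient of the minimax generator loss log(1 - sigmoid(t)) with
# respect to t simplifies to -sigmoid(t): it vanishes as the
# discriminator gets better at rejecting fakes.
minimax_grad = -sigmoid(t)

# The gradient of the non-saturating loss -log(sigmoid(t)) is
# -(1 - sigmoid(t)): it stays near -1 in the same regime.
modified_grad = -(1.0 - sigmoid(t))

print(minimax_grad)   # roughly -4.5e-05: almost no learning signal
print(modified_grad)  # roughly -1.0: still a useful learning signal
```

A near-zero gradient means the generator's weights barely move, no matter how many training steps you run.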
Attempts to Remedy
- **Wasserstein loss**: The Wasserstein loss is designed to prevent
  vanishing gradients even when you train the discriminator to optimality.
- **Modified minimax loss**: The original GAN paper proposed a modification
  to minimax loss to deal with vanishing gradients.
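As a minimal sketch of the Wasserstein formulation (illustrative function names, not the course's reference implementation): the critic no longer outputs a probability, and both losses are plain averages of its scores.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    # Wasserstein critic ("discriminator") loss: push real scores up and
    # fake scores down, i.e. minimize E[D(fake)] - E[D(real)].
    return float(np.mean(fake_scores) - np.mean(real_scores))

def generator_loss(fake_scores):
    # The generator tries to raise the critic's score on its samples.
    return float(-np.mean(fake_scores))

real = np.array([0.9, 1.1, 1.0])  # critic scores on real samples
fake = np.array([0.2, 0.1, 0.0])  # critic scores on generated samples
print(critic_loss(real, fake))    # about -0.9
print(generator_loss(fake))       # about -0.1
```

Because the losses are linear in the critic's scores, their gradients don't saturate the way a sigmoid-based loss does, which is why training the critic to optimality is safe.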
Mode Collapse
Usually you want your GAN to produce a wide variety of outputs. You want, for
example, a different face for every random input to your face generator.
However, if a generator produces an especially plausible output, the generator
may learn to produce only that output. In fact, the generator is always trying
to find the one output that seems most plausible to the discriminator.
If the generator starts producing the same output (or a small
set of outputs) over and over again, the discriminator's best
strategy is to learn to always reject that output. But if the next
generation of discriminator gets stuck in a local minimum and doesn't find the
best strategy, then it's too easy for the next generator iteration to find the
most plausible output for the current discriminator.
Each iteration of generator over-optimizes for a particular discriminator, and
the discriminator never manages to learn its way out of the trap. As a result,
the generators rotate through a small set of output types. This form of GAN
failure is called mode collapse.
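Mode collapse often shows up as a batch of generated samples that are nearly identical. One rough diagnostic, not from the original text and with an assumed metric, is the mean pairwise distance within a batch:

```python
import numpy as np

def batch_diversity(samples):
    # Mean pairwise Euclidean distance within a batch of generated
    # samples. Values near zero suggest the generator is emitting
    # (near-)identical outputs -- a symptom of mode collapse.
    n = len(samples)
    dists = [np.linalg.norm(samples[i] - samples[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

collapsed = np.ones((8, 4))  # every "sample" identical
varied = np.random.default_rng(0).normal(size=(8, 4))
print(batch_diversity(collapsed))  # 0.0
print(batch_diversity(varied))     # clearly positive
```

Tracking a statistic like this over training makes the "rotating through a small set of outputs" behavior visible long before you inspect samples by eye.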
Attempts to Remedy
The following approaches try to force the generator to broaden its scope by
preventing it from optimizing for a single fixed discriminator:
- **Wasserstein loss**: The Wasserstein loss alleviates mode collapse by
  letting you train the discriminator to optimality without worrying
  about vanishing gradients. If the discriminator doesn't get stuck in local
  minima, it learns to reject the outputs that the generator stabilizes on. So
  the generator has to try something new.
- **Unrolled GANs**: Unrolled GANs use a generator loss function that
  incorporates not only the current discriminator's classifications, but also
  the outputs of future discriminator versions. So the generator can't
  over-optimize for a single discriminator.
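The unrolling idea can be sketched as a look-ahead loop. This shows only the control flow, with illustrative names throughout; the actual method also backpropagates through the k discriminator updates:

```python
import copy

def unrolled_generator_loss(generator, discriminator, data,
                            k, discriminator_step, generator_loss):
    # Train a *copy* of the discriminator for k look-ahead steps, then
    # score the generator against that future discriminator. The real
    # discriminator is left untouched.
    future = copy.deepcopy(discriminator)
    for _ in range(k):
        discriminator_step(future, generator, data)
    return generator_loss(generator, future)
```

Because the generator is graded against where the discriminator is heading rather than where it is now, producing one output that only fools the current discriminator stops paying off.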
Failure to Converge
GANs frequently fail to converge, as discussed in the module on
training.
Attempts to Remedy
Researchers have tried to use various forms of regularization to improve GAN
convergence, including:

- **Adding noise to discriminator inputs**: See, for example, Toward Principled
  Methods for Training Generative Adversarial Networks.
- **Penalizing discriminator weights**: See, for example, Stabilizing Training
  of Generative Adversarial Networks through Regularization.
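Both ideas are simple to express in code. In this sketch the noise scale `sigma` and penalty weight `lam` are illustrative choices, not values from the papers:

```python
import numpy as np

def noisy_discriminator_input(x, sigma=0.1, rng=None):
    # Instance noise: perturb every sample (real or generated) before it
    # reaches the discriminator, smoothing the two distributions it has
    # to separate.
    rng = rng or np.random.default_rng(0)
    return x + rng.normal(scale=sigma, size=x.shape)

def weight_penalty(weights, lam=1e-4):
    # L2 penalty on the discriminator's weights, added to its loss to
    # keep it from becoming too sharp too quickly.
    return lam * sum(float(np.sum(w ** 2)) for w in weights)

x = np.zeros((4, 3))
print(noisy_discriminator_input(x).shape)          # (4, 3)
print(weight_penalty([np.ones((2, 2))], lam=0.5))  # 2.0
```

Both regularizers work by weakening the discriminator slightly, trading a little discriminator accuracy for more stable gradients to the generator.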
Last updated 2025-02-26 UTC.