Understanding deep neural networks through over-parameterization
In the infinite-width limit, the learning dynamics of neural networks trained by gradient descent are captured by the neural tangent kernel (NTK). The NTK has become a powerful tool for understanding the optimization and generalization of over-parameterized networks. This talk will introduce the foundations of the NTK and its applications to orthogonally initialized networks and ultra-wide graph networks.
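
As a concrete illustration of the object the talk is built around, below is a minimal sketch (not from the talk) of the empirical NTK at finite width: the kernel entry for a pair of inputs is the inner product of the parameter gradients of the network output at those inputs. The toy two-layer network, its width, the initialization scaling, and all names here are illustrative assumptions.

import jax
import jax.numpy as jnp

def init_params(key, d_in=3, width=512):
    # Standard 1/sqrt(fan-in) scaling for a toy two-layer network (assumed setup).
    k1, k2 = jax.random.split(key)
    W1 = jax.random.normal(k1, (width, d_in)) / jnp.sqrt(d_in)
    w2 = jax.random.normal(k2, (width,)) / jnp.sqrt(width)
    return (W1, w2)

def f(params, x):
    # Scalar-output network: x -> w2 . tanh(W1 x).
    W1, w2 = params
    return w2 @ jnp.tanh(W1 @ x)

def empirical_ntk(params, x1, x2):
    # Theta(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>,
    # summed over every parameter array in the model.
    g1 = jax.grad(f)(params, x1)
    g2 = jax.grad(f)(params, x2)
    leaves1 = jax.tree_util.tree_leaves(g1)
    leaves2 = jax.tree_util.tree_leaves(g2)
    return sum(jnp.vdot(a, b) for a, b in zip(leaves1, leaves2))

key = jax.random.PRNGKey(0)
params = init_params(key)
x1, x2 = jnp.ones(3), jnp.arange(3.0)
print(empirical_ntk(params, x1, x2))

At finite width this kernel is random and changes during training; the infinite-width result referenced in the abstract says it concentrates around a deterministic kernel that stays essentially fixed throughout gradient descent, which is what makes the linearized analysis of optimization and generalization possible.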