Overview
Papers grouped by category
1. Training
1.1 SGD
The Heavy-Tail Phenomenon in SGD
Tags: #theoretical_understanding
2. Models
2.1 Probabilistic Models
Tags: #model_improvement
3. Techniques
3.1 Knowledge Distillation
A Statistical Perspective on Distillation
Tags: #theoretical_understanding