Overview
Papers grouped by category
1. Training
1.1 SGD

The Heavy-Tail Phenomenon in SGD
Tags: #theoretical_understanding
2. Models
2.1 Probabilistic Models

Tags: #model_improvement
3. Techniques
3.1 Knowledge Distillation

A statistical perspective on distillation
Tags: #theoretical_understanding