Overview

Papers grouped by category

1. Training

1.1 SGD

The Heavy-Tail Phenomenon in SGD

Tags: #theoretical_understanding

2. Models

2.1 Probabilistic Models

Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization

Tags: #model_improvement

3. Techniques

3.1 Knowledge Distillation

A Statistical Perspective on Distillation

Tags: #theoretical_understanding