Scaling Laws

1. Neural Scaling Laws

1.1. Kaplan et al. Scaling Law

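Kaplan et al. (2020) conclude that, for a fixed compute budget, it is best to train a very large model and stop well before convergence. The rationale is that larger models are more sample-efficient: they reach a given loss after seeing fewer training tokens.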
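
One way to make the sample-efficiency claim concrete is the joint fit L(N, D) = [(N_c/N)^(alpha_N/alpha_D) + D_c/D]^alpha_D from Kaplan et al., where N is the number of non-embedding parameters and D the number of training tokens. The sketch below is illustrative only: the constants are the approximate fitted values as I recall them from that paper and should be treated as assumptions, and the function names `kaplan_loss` and `tokens_to_reach` are hypothetical helpers introduced here, not part of any library.

```python
import math

# Approximate fitted constants from Kaplan et al. (2020); treat as assumptions.
ALPHA_N = 0.076   # exponent for model size
ALPHA_D = 0.095   # exponent for dataset size
N_C = 8.8e13      # critical scale, non-embedding parameters
D_C = 5.4e13      # critical scale, training tokens


def kaplan_loss(n_params: float, n_tokens: float) -> float:
    """Predicted loss for a model with n_params trained on n_tokens."""
    model_term = (N_C / n_params) ** (ALPHA_N / ALPHA_D)
    data_term = D_C / n_tokens
    return (model_term + data_term) ** ALPHA_D


def tokens_to_reach(target_loss: float, n_params: float) -> float:
    """Tokens needed to hit target_loss, obtained by inverting kaplan_loss.

    Returns math.inf when the model is too small to reach the target
    even with unlimited data.
    """
    gap = target_loss ** (1.0 / ALPHA_D) - (N_C / n_params) ** (ALPHA_N / ALPHA_D)
    if gap <= 0:
        return math.inf
    return D_C / gap


if __name__ == "__main__":
    target = 3.0
    for n in (1e8, 1e9, 1e10):
        d = tokens_to_reach(target, n)
        print(f"N = {n:.0e}: about {d:.2e} tokens to reach loss {target}")
```

Under this fit, the number of tokens needed to reach a fixed target loss shrinks as N grows, which is the sense in which larger models are more sample-efficient.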

1.2. Chinchilla Scaling Law

References

  1. Kaifeng Lyu's PPT