Soren Mindermann*, Muhammed Razzak*, Winnie Xu*, Andreas Kirsch, Mrinank Sharma, Adrien Morisot, Aidan N. Gomez, Sebastian Farquhar, Jan Brauner, Yarin Gal
We introduce Goldilocks Selection, a technique for faster model training that selects a sequence of training points that are "just right". We propose an information-theoretic acquisition function, the reducible held-out loss, and compute it with a small proxy model, GoldiProx, to efficiently choose training points that maximize information about a validation set. We show that the selected sequence not only prioritizes learnable, information-rich data relevant to the evaluation task, but also transfers effectively across architectures and vision tasks.
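As described in the abstract, each candidate point is scored by its reducible held-out loss: the loss under the current model minus the loss under a small proxy model trained on held-out data, so that points which the current model gets wrong but which are nonetheless learnable rank highest. A minimal sketch of that selection step, assuming per-point losses are already available (the function and argument names here are illustrative, not from the paper):

```python
def select_batch(train_losses, proxy_holdout_losses, batch_size):
    """Rank points by reducible held-out loss and keep the top batch.

    train_losses:         per-point loss under the current model.
    proxy_holdout_losses: per-point loss under a small proxy model
                          trained on held-out data (approximates the
                          irreducible part of the loss).
    """
    reducible = [t - h for t, h in zip(train_losses, proxy_holdout_losses)]
    # Highest reducible loss first: high current loss, low proxy loss.
    ranked = sorted(range(len(reducible)), key=lambda i: reducible[i],
                    reverse=True)
    return ranked[:batch_size]

# Toy example: point 2 has high current loss but low proxy loss
# (learnable and not yet learned), so it is selected first.
train = [2.0, 0.1, 3.0, 1.5]
proxy = [1.9, 0.05, 0.2, 1.0]
print(select_batch(train, proxy, 2))
```

Points with high loss under both models (noisy or unlearnable) and points with low loss under both (already learned) score near zero and are skipped, which is the "just right" behaviour the abstract refers to.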
International Conference on Machine Learning (ICML), 2022 [Spotlight].