Class-wise Generalization Error: an Information-Theoretic Analysis
Laakom, Firas; Gabbouj, Moncef; Schmidhuber, Jürgen; Bu, Yuheng (2025)
Transactions on Machine Learning Research
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202508128200
Description
Peer reviewed
Abstract
Existing generalization theories for supervised learning typically take a holistic approach and provide bounds on the expected generalization error over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, generalization performance varies significantly among classes, and existing generalization bounds cannot capture this variation. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of the model for each individual class. We derive a novel information-theoretic bound for the class-generalization error using the KL divergence, and we further obtain several tighter bounds using recent advances in conditional mutual information bounds, which enable practical evaluation. We empirically validate the proposed bounds on various neural networks and show that they accurately capture the complex class-generalization behavior. Moreover, we demonstrate that the theoretical tools developed in this work can be applied in several other applications.
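As an illustration of the quantity the abstract describes, the sketch below computes an empirical per-class gap between held-out loss and training loss. It is a minimal sketch under stated assumptions, not the paper's definition or code: the function name `class_generalization_gaps`, the use of precomputed per-sample losses, and the toy numbers are all hypothetical and chosen only to show how an average gap can hide differences between classes.

```python
from collections import defaultdict


def class_generalization_gaps(train_losses, train_labels, test_losses, test_labels):
    """Empirical per-class gap: mean held-out loss minus mean training loss.

    Hypothetical helper for illustration; the paper studies the expected
    (population) counterpart of this quantity and bounds it.
    """

    def per_class_mean(losses, labels):
        sums, counts = defaultdict(float), defaultdict(int)
        for loss, label in zip(losses, labels):
            sums[label] += loss
            counts[label] += 1
        return {label: sums[label] / counts[label] for label in sums}

    train_mean = per_class_mean(train_losses, train_labels)
    test_mean = per_class_mean(test_losses, test_labels)
    # A gap is only defined for classes that appear in both splits.
    return {c: test_mean[c] - train_mean[c] for c in train_mean if c in test_mean}


# Toy per-sample 0-1 losses (made-up numbers, not results from the paper).
train_losses = [0.0, 0.0, 1.0, 0.0]
train_labels = [0, 0, 1, 1]
test_losses = [0.0, 0.0, 1.0, 1.0]
test_labels = [0, 0, 1, 1]

gaps = class_generalization_gaps(train_losses, train_labels, test_losses, test_labels)
overall_gap = sum(test_losses) / len(test_losses) - sum(train_losses) / len(train_losses)

print(gaps)         # per-class gaps
print(overall_gap)  # single averaged gap over all samples
```

In this toy setup the averaged gap is a single number, while the per-class view shows that one class accounts for all of it, which is the kind of variation the abstract says holistic bounds cannot capture.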
