Feature Diversity in Neural Networks: Theory and Algorithms
Laakom, Firas (2024)
Tampere University
2024
Doctoral Programme in Computing and Electrical Engineering
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of defence
2024-03-18
The permanent address of the publication is
https://urn.fi/URN:ISBN:978-952-03-3355-3
Abstract
The main strength of neural networks lies in their ability to generalize to unseen data. Why and when do they generalize well? These questions are central to a full understanding of this phenomenon and to developing better and more robust models. Several studies have explored them from different perspectives and proposed multiple measures and bounds that correlate well with generalization. This dissertation proposes a new perspective that focuses on the ‘feature diversity’ within the hidden layers. From this standpoint, a neural network is viewed as a two-stage process: feature (representation) learning through the intermediate layers, followed by the final prediction layer. Empirically, learning a rich and diverse set of features has been observed to be critical for achieving top performance, yet this observation has lacked theoretical justification. In this dissertation, we address the gap by theoretically analyzing the effect of feature diversity on generalization performance. Specifically, we derive several rigorous Rademacher-complexity-based bounds for neural networks in different contexts and demonstrate that having more diverse features indeed correlates with better generalization performance. Moreover, inspired by these theoretical findings, we propose a new set of data-dependent diversity-inducing regularizers and present an extensive empirical study confirming that the proposed regularizers enhance the performance of several state-of-the-art neural network models across multiple tasks. Beyond standard neural networks, we also explore diversity-promoting strategies in other contexts, e.g., Energy-Based Models, autoencoders, and bag-of-features pooling layers, and we show that learning diverse features helps consistently.
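The regularizers proposed in the dissertation are data-dependent and derived from its generalization bounds; as a loose illustration of the general idea only, the sketch below penalizes correlation between hidden-unit activations over a batch. The function name `diversity_penalty`, the PyTorch framing, and the choice of a squared off-diagonal correlation penalty are illustrative assumptions, not the dissertation's exact formulation.

```python
import torch

def diversity_penalty(features: torch.Tensor) -> torch.Tensor:
    """One plausible diversity-inducing regularizer (illustrative only):
    penalize pairwise correlation between hidden units.

    features: (batch_size, num_units) activations of one hidden layer.
    Returns the mean squared off-diagonal correlation; minimizing it
    pushes the units toward more diverse (decorrelated) features.
    """
    # Standardize each unit's activations over the batch.
    z = (features - features.mean(dim=0)) / (features.std(dim=0) + 1e-8)
    # Empirical correlation matrix between hidden units.
    corr = (z.t() @ z) / features.shape[0]
    # Remove the diagonal: self-correlations are uninformative here.
    off_diag = corr - torch.diag(torch.diag(corr))
    return (off_diag ** 2).mean()

# Typical usage (the weight `lam` is a hypothetical hyperparameter):
# loss = task_loss + lam * diversity_penalty(hidden_activations)
```

In practice such a penalty is simply added to the task loss, so the network is trained to fit the data while keeping its hidden representations diverse.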
Collections
- Doctoral dissertations [4943]