Understanding the learning process in deep neural networks with Information Bottleneck

Liu, Huixia (2024)

 
Open file
LiuHuixia.pdf (3.243 MB)

Tietojenkäsittelyopin maisteriohjelma - Master's Programme in Computer Science
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2024-05-08
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202405035253
Abstract
Deep Neural Networks (DNNs) have become a driving force in artificial intelligence, achieving remarkable performance in areas like computer vision and natural language processing. However, understanding how DNNs learn remains a challenge.

The information bottleneck (IB) principle offers a compelling framework for understanding how classification DNNs learn efficient representations of their input data. The IB principle posits that a good representation of the input should retain the information relevant to the task while discarding irrelevant information. The information plane, in turn, visualizes the flow of relevant information through a DNN as it learns; here, relevant information is quantified by mutual information (MI). In the information plane, each point represents a pair of MI values corresponding to an intermediate layer of the network at a specific training epoch, so MI estimation is an essential prerequisite step in our analyses. In this thesis, we explore the learning process within DNNs during training under the IB principle.
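
As a rough illustration of where these information-plane coordinates come from, the sketch below estimates MI by discretizing a layer's activations into bins, in the spirit of the histogram (binning) estimators often used in information-plane studies. The function names, the bin count, and the variables T, sample_ids, and class_labels are illustrative assumptions, not the estimators or code used in the thesis.

    import numpy as np
    from collections import Counter

    def discretize(activations, n_bins=30):
        """Map continuous activations (n_samples, n_units) to per-unit bin indices."""
        edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
        return np.digitize(activations, edges[1:-1])

    def plug_in_mi(binned_activations, labels):
        """Plug-in (histogram) estimate of mutual information, in bits."""
        symbols = [tuple(row) for row in binned_activations]  # one discrete symbol per sample
        n = len(symbols)
        p_t, p_y = Counter(symbols), Counter(labels)
        p_ty = Counter(zip(symbols, labels))
        return sum(
            (c / n) * np.log2((c / n) / ((p_t[t] / n) * (p_y[y] / n)))
            for (t, y), c in p_ty.items()
        )

    # One information-plane point for a single layer at a single epoch
    # (T: hidden activations, X: input sample identities, Y: class labels):
    # i_xt = plug_in_mi(discretize(T), sample_ids)    # I(X; T)
    # i_ty = plug_in_mi(discretize(T), class_labels)  # I(T; Y)

Repeating this for every layer and every epoch yields one trajectory per layer across training, which is what the information plane displays.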

We employ multiple MI estimators to enhance the reliability of our findings and mitigate potential estimator biases, and we use information plane plots to analyze the learning process of diverse DNN models. Our research reveals a two-phase learning process in the output layer, characterized by fitting and compression stages. The dynamics in the hidden layers, however, are more complex and demand further exploration. In addition, our findings indicate a potential link between the compression phase and higher prediction accuracy, although further investigation is needed to establish a causal relationship.
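
As a hedged sketch of how such information plane plots might be drawn, the snippet below plots one trajectory per layer and colours points by epoch, which is one way to make the fitting stage (growing I(T;Y)) and the compression stage (shrinking I(X;T)) visible; points_by_layer and its layout are assumptions for illustration, not the thesis code.

    import matplotlib.pyplot as plt

    def plot_information_plane(points_by_layer, n_epochs):
        """points_by_layer: {layer_name: [(I_XT, I_TY) for each epoch]} (assumed layout)."""
        fig, ax = plt.subplots()
        for layer, points in points_by_layer.items():
            xs, ys = zip(*points)
            # Colour each trajectory by epoch so early (fitting) and
            # late (compression) movements can be told apart.
            ax.scatter(xs, ys, c=range(n_epochs), cmap="viridis", s=15, label=layer)
            ax.plot(xs, ys, alpha=0.3)
        ax.set_xlabel("I(X; T) [bits]")
        ax.set_ylabel("I(T; Y) [bits]")
        ax.legend()
        fig.colorbar(ax.collections[0], label="epoch")
        plt.show()
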
Collections
  • Opinnäytteet - ylempi korkeakoulututkinto [40068]