Visualization of Semantic Segmentation Networks
Kankainen, Ossi (2020)
Degree Programme in Engineering Sciences, BSc (Tech)
Faculty of Engineering and Natural Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of acceptance
2020-05-19
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202004294363
Abstract
The development of visualization methods for deep convolutional neural networks supports their design and facilitates their adoption in critical applications. Semantic segmentation has many heavily regulated application areas, such as medical imaging and autonomous vehicles. Thus, there is a clear need for visualization methods that can be applied to the neural network models used in semantic segmentation.
This thesis addresses this need by studying methods that have been used with generative models whose network structure is similar to that of semantic segmentation models. Two structures, autoencoders and adversarial networks, are commonly used in semantic segmentation models. Both utilize the concept of a latent space, a compact representation of the data. Due to its compactness, the latent space is also useful for visualizing models. Five different latent space visualization methods for generative models can be found in the literature. In the experiments of this work, these methods are applied to two different semantic segmentation models to see how well they carry over to them.
The results show how latent space projections obtained with different dimensionality reduction techniques can be used to illustrate which features a semantic segmentation model uses when it forms clusters of data. In addition, the capability of the model to generalize to new data can be assessed based on the compactness of the projections. Examining the predicted output masks of the training samples is a good way to get an initial view of model performance. New samples can also be interpolated from the latent space. By observing feature changes in the outputs that the model produces for them, one can obtain a more accurate view of how features change between different areas of the latent space. A problem, however, is that semantic segmentation models do not force latent variables to be meaningful for data generation as generative networks do. For this reason, the latent space is typically sparser, which showed in the experiments as the nearest neighbour being the same for many interpolation points. Thus, examining nearest neighbours turned out not to be a useful visualization method for semantic segmentation models. Attribute vector arithmetic also cannot be applied directly to semantic segmentation networks, since the definition of an attribute vector is not straightforward for them.
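The interpolation and nearest-neighbour observations can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `interpolate` and `nearest_neighbour`, the two-dimensional latent codes, and their values are all assumptions made for illustration, and no encoder or decoder is involved. The sketch only shows how points on a straight line between two latent codes map to their nearest training sample when the latent space is sparsely populated.

```python
import math


def interpolate(z_start, z_end, steps):
    """Return `steps` evenly spaced latent vectors from z_start to z_end (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    points = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight in [0, 1]
        points.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return points


def nearest_neighbour(z, train_latents):
    """Return the index of the training latent vector closest to z (Euclidean distance)."""
    return min(range(len(train_latents)), key=lambda i: math.dist(z, train_latents[i]))


# Hypothetical 2-D latent codes of three training samples (toy values, not thesis data).
train_latents = [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0]]

# Walk from the first training sample to the second in six steps.
path = interpolate(train_latents[0], train_latents[1], steps=6)
neighbours = [nearest_neighbour(z, train_latents) for z in path]
print(neighbours)  # prints [0, 0, 0, 1, 1, 1]
```

In this toy setting only the two endpoint samples ever appear as neighbours, so the neighbour sequence says little about what happens between them. In a real model one would instead pass each interpolated point through the decoder and inspect the predicted masks.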
Collections
- Bachelor's theses [8997]