Pre-training in convolutional neural networks
Liu, Hao (2016)
Master's Degree Programme in Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2016-12-07
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201611244781
Abstract
The objective of this thesis is to study unsupervised pre-training in convolutional neural networks (CNNs) with a special kind of artificial neural network (ANN) called an autoencoder. Unsupervised pre-training was proposed to make it possible to train a deep neural network with more than one hidden layer by pre-training it in a greedy layer-wise manner. Although it began to lose popularity after advances in optimization and initialization methods, it is still worthwhile to study its effect on modern neural networks such as CNNs.
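To make the greedy layer-wise procedure concrete, the following is a minimal sketch of pre-training a stack of fully connected layers with autoencoders, written in PyTorch. The layer widths, optimizer, learning rate and epoch count are illustrative assumptions, not the configuration used in the thesis.

import torch
import torch.nn as nn

def pretrain_layer(encoder_layer, data, epochs=5, lr=1e-3):
    # Train one layer as an autoencoder: encode the input, decode it back,
    # and minimize the reconstruction error. The decoder is discarded.
    decoder_layer = nn.Linear(encoder_layer.out_features, encoder_layer.in_features)
    params = list(encoder_layer.parameters()) + list(decoder_layer.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        code = torch.sigmoid(encoder_layer(data))
        loss = nn.functional.mse_loss(decoder_layer(code), data)
        loss.backward()
        opt.step()
    # The trained layer's output becomes the input of the next layer.
    return torch.sigmoid(encoder_layer(data)).detach()

x = torch.rand(256, 784)                             # stand-in for flattened MNIST images
layers = [nn.Linear(784, 256), nn.Linear(256, 64)]   # assumed layer widths

h = x
for layer in layers:
    h = pretrain_layer(layer, h)

# The pre-trained layers now initialize a deep network; a classification
# head is added on top and the whole network is fine-tuned with labels.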
Two new methods of applying autoencoders, or their variants, for pre-training a CNN are proposed in this work. The first embeds the classifier as the encoder part of an autoencoder to reduce network complexity. In the conventional pre-training method based on autoencoders, the output layer of the classifier is usually randomly initialized, whereas the proposed method initializes all the layers by unsupervised pre-training. The second method applies labeled data to build a supervised variant of the autoencoder.
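As a rough illustration of the first proposed idea, the sketch below reuses a small CNN classifier as the encoder of a convolutional autoencoder, so that every layer, including the output layer, receives an unsupervised initialization. The architecture, the MNIST-like input size (1x28x28) and the hyperparameters are assumptions made for the example, not the exact models evaluated in the thesis.

import torch
import torch.nn as nn

class ClassifierAsEncoder(nn.Module):
    # The full classifier (convolutional layers plus output layer) acts as
    # the encoder, so no layer is left randomly initialized.
    def __init__(self, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
            nn.Flatten(),
        )
        self.out = nn.Linear(16 * 7 * 7, n_classes)    # classifier output layer

    def forward(self, x):
        return self.out(self.body(x))

class Decoder(nn.Module):
    # Reconstructs the image from the classifier's output activations;
    # used only during pre-training and discarded before fine-tuning.
    def __init__(self, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_classes, 16 * 7 * 7),
            nn.ReLU(),
            nn.Unflatten(1, (16, 7, 7)),
            nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

# Unsupervised pre-training step: reconstruct unlabeled images through the
# classifier-encoder and the decoder.
enc, dec = ClassifierAsEncoder(), Decoder()
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
x = torch.rand(32, 1, 28, 28)                  # stand-in for unlabeled images
loss = nn.functional.mse_loss(dec(enc(x)), x)
loss.backward()
opt.step()

# Fine-tuning then trains `enc` directly as the classifier using labeled data.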
Experiments conducted on the MNIST and CIFAR datasets show that unsupervised pre-training still helps to improve the performance of CNNs. The embedded classifier gives results comparable to the conventional pre-training method, while the supervised variant performs better in most cases.