Deep Learning Classification Methods for Complex Disorders
Smolander, Johannes (2016)
Degree Programme in Biotechnology
Faculty of Natural Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2016-04-06
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201603233752
Abstract
In this MSc thesis we studied how deep learning methods can be applied to class prediction tasks for complex disorders using gene expression data. Microarrays and sequencing can generate representations of the expression of different biomolecules in samples. With appropriate machine learning methods and data, we can build classifiers that handle various classification tasks with many practical applications.
Deep belief networks are our principal deep learning models. We carried out tests to see how they perform alone and in combination with support vector machines. We compared two optimization algorithms used at the fine-tuning stage of learning: backpropagation and resilient backpropagation. The three example data sets consist of lung cancer, breast cancer and inflammatory bowel disease samples. To assess performance we used leave-one-out cross-validation with accuracy, sensitivity and specificity as performance metrics, and we computed the standard error of the mean for each metric. To put the results in context, we compared them with similar previous studies.
Our cross-validation results and the comparison with previous studies show that we achieved good or excellent performance on most of the tasks. Notably, we generally omitted the prior feature selection and dimensionality reduction steps that previous studies have used almost invariably. The resilient backpropagation algorithm worked with whole microarray data sets, whereas the basic backpropagation algorithm worked well with whole RNA-Seq data sets and feature-selected data.
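The evaluation scheme described above (leave-one-out cross-validation reporting accuracy, sensitivity and specificity, each with a standard error of the mean) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the nearest-centroid classifier is only a stand-in for the deep belief networks, the toy data are invented, and computing the standard error from per-sample 0/1 outcomes is one plausible reading, since the abstract does not state how the thesis computed it.

```python
import math

def nearest_centroid_predict(train_X, train_y, x):
    """Hypothetical stand-in classifier (NOT a deep belief network)."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def mean_and_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    n = len(values)
    if n == 0:
        return float("nan"), float("nan")
    mean = sum(values) / n
    if n < 2:
        return mean, float("nan")
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)

def loocv_metrics(X, y, predict=nearest_centroid_predict):
    """Leave-one-out CV: hold out each sample once, train on the rest."""
    hits_all, hits_pos, hits_neg = [], [], []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        correct = 1.0 if predict(train_X, train_y, X[i]) == y[i] else 0.0
        hits_all.append(correct)
        # Label 1 is treated as the positive (disease) class: an assumption.
        (hits_pos if y[i] == 1 else hits_neg).append(correct)
    return {
        "accuracy": mean_and_sem(hits_all),
        "sensitivity": mean_and_sem(hits_pos),  # correct among positives
        "specificity": mean_and_sem(hits_neg),  # correct among negatives
    }

# Invented toy "expression" data: two features, two well-separated classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = [0, 0, 0, 1, 1, 1]
results = loocv_metrics(X, y)
for name, (mean, sem) in results.items():
    print(f"{name}: mean={mean:.3f}, SEM={sem:.3f}")
```

Any classifier can be plugged in by passing a different `predict` function with the same three arguments, which keeps the cross-validation loop independent of the model being evaluated.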