An Evaluation of One Class Classifier on Gene Expression Data
Xu, Haifeng (2019)
Degree Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2019-08-22
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-201908122866
Abstract
Medical data often has imbalanced classes. This imbalance complicates the diagnosis of rare diseases or cancer subtypes with machine learning tools, since traditional binary or multi-class classifiers often struggle to classify imbalanced data. One-class classifiers (OCC), machine learning methods that use data from only one class, are therefore one possible option. Our study evaluates ν-SVM, one of the most commonly used one-class methods, on four microarray datasets of breast cancer and diffuse large B-cell lymphoma (DLBCL). Each cancer is divided into subtypes. We compared OCC with binary SVM and studied how the imbalance between the classes affects the results. The results show that, on these datasets, ν-SVM performs better than binary SVM when the classes are extremely imbalanced.
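The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis's pipeline: it assumes scikit-learn is available, uses its `OneClassSVM` (a ν-parameterised one-class SVM) and `SVC`, and replaces the microarray datasets with synthetic Gaussian data. The helper names `make_imbalanced` and `compare`, the parameter values (`nu=0.1`, the class shift, the feature count) and the choice of balanced accuracy as the metric are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score
from sklearn.svm import SVC, OneClassSVM


def make_imbalanced(n_major, n_minor, n_features=50, shift=1.5, seed=0):
    """Synthetic stand-in for expression data (hypothetical, not the thesis data).

    The majority class is labelled +1 and the minority class -1, matching the
    +1 (inlier) / -1 (outlier) convention of OneClassSVM.predict.
    """
    rng = np.random.default_rng(seed)
    X_major = rng.normal(0.0, 1.0, size=(n_major, n_features))
    X_minor = rng.normal(shift, 1.0, size=(n_minor, n_features))
    X = np.vstack([X_major, X_minor])
    y = np.array([1] * n_major + [-1] * n_minor)
    return X, y


def compare(n_major=200, n_minor=5, seed=0):
    """Train both models on imbalanced data and score them on a balanced test set."""
    X_train, y_train = make_imbalanced(n_major, n_minor, seed=seed)
    X_test, y_test = make_imbalanced(100, 100, seed=seed + 1)

    # One-class model: fitted on majority-class samples only.
    occ = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale")
    occ.fit(X_train[y_train == 1])

    # Binary model: fitted on both classes, including the few minority samples.
    svc = SVC(kernel="rbf", gamma="scale")
    svc.fit(X_train, y_train)

    return {
        "one_class": balanced_accuracy_score(y_test, occ.predict(X_test)),
        "binary": balanced_accuracy_score(y_test, svc.predict(X_test)),
    }


if __name__ == "__main__":
    # Vary the minority-class size to see how the imbalance affects each model.
    for n_minor in (100, 20, 5):
        print(n_minor, compare(n_major=200, n_minor=n_minor))
```

Which model scores higher on this synthetic data depends on the chosen parameters, so the output should not be read as a reproduction of the thesis results.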