Classification Of Lymph Node Metastases In Breast Cancer With Features From Tissue Images Using Machine Learning Techniques
Bartaula, Jyoti Prasad (2017)
Master's Degree Programme in Bioinformatics
Faculty of Medicine and Life Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Acceptance date
2017-06-14
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:uta-201707072185
Abstract
Determining the metastatic involvement of lymph nodes is crucial in designing treatment plans for breast cancer. The traditional way of detecting lymph node metastases involves manual histopathological examination of specimens, which is a subjective and tiresome process. In this thesis, an automated system to classify lymph node metastases in breast cancer with features from digitized tissue images is proposed. The proposed system applies different machine learning algorithms for classification together with various feature selection techniques.
Minimum Redundancy Maximum Relevance (mRMR), wrapper methods, area under the ROC curve of random forest (AUCRF), and least absolute shrinkage and selection operator (LASSO) were implemented to select the most relevant features among the 214 original features. Various classification models were trained on the selected features to distinguish between metastatic and non-metastatic samples. Among the models trained, the random forest model performed better than the others.
The results of this thesis are encouraging for the automated classification of lymph node metastases in breast cancer using features from digitized tissue images and machine learning techniques. The results also show that feature selection helps remove irrelevant and redundant features, which not only decreases the computational time of classification algorithms but can also enhance classification performance.
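To illustrate how a filter such as mRMR can discard redundant features, the following is a minimal sketch, not the implementation used in the thesis. It assumes a simplified correlation-based variant: relevance is the absolute Pearson correlation between a feature and the class label, redundancy is the mean absolute correlation with the features already selected, and the score is their difference. The function names, the feature names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mrmr_select(columns, labels, k):
    """Greedily pick k feature names with high relevance and low redundancy.

    columns: dict mapping feature name -> list of values (one per sample)
    labels:  list of class labels (0 = non-metastatic, 1 = metastatic)
    """
    relevance = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    selected = []
    remaining = list(columns)

    def score(name):
        # Before anything is selected, only relevance matters.
        if not selected:
            return relevance[name]
        redundancy = mean(abs(pearson(columns[name], columns[s])) for s in selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: eight samples, three features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "nuclei_area": [0, 0, 0, 0, 1, 1, 1, 0],
    "nuclei_area_scaled": [0, 0, 0, 0, 2, 2, 2, 0],  # exact rescaling of nuclei_area
    "texture_contrast": [1, 0, 0, 0, 0, 1, 1, 1],
}

print(mrmr_select(features, labels, 2))
```

In this toy example the rescaled copy of the first feature is as relevant as the original but fully redundant with it, so the second pick is the less relevant but complementary feature. The published mRMR method scores features with mutual information rather than correlation.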