Robust decision trees under adversarial attacks
Mansikkamäki, Onni (2022)
Tieto- ja sähkötekniikan kandidaattiohjelma - Bachelor's Programme in Computing and Electrical Engineering
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of acceptance
2022-05-27
Permanent address of the publication:
https://urn.fi/URN:NBN:fi:tuni-202205265272
Abstract
Ordinary decision trees are effective but simple machine learning models that are vulnerable to adversarial attacks. Nevertheless, the behaviour of decision trees under adversarial attacks has received relatively little research attention, and robust decision tree algorithms that can withstand these attacks have only been developed in recent years. The purpose of this work is to determine how accurately, robustly, and time-efficiently different robust decision tree models perform under attack compared to each other, and how accurately they perform under attack compared to non-robust decision tree models.
Adversarial attacks create adversarial examples, which allow an attacker to degrade a decision tree's ability to perform accurately in a given classification task. Adversarial examples are data samples designed to appear normal to an ordinary decision tree, but whose numerical features have been modified to trick the tree into making an incorrect classification decision. Robust decision trees are decision tree models designed to withstand such attacks.
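The idea can be illustrated with a minimal, hypothetical sketch (not taken from the thesis's experiments): a decision tree splits on individual features at fixed thresholds, so an attacker who can nudge a feature value slightly across a split threshold can flip the prediction.

```python
# Hypothetical illustration of an adversarial example against a
# decision stump (a one-split decision tree). The feature index and
# threshold are invented for this sketch.

def stump_predict(x, feature=0, threshold=0.5):
    """Classify as class 1 if the chosen feature exceeds the threshold."""
    return 1 if x[feature] > threshold else 0

# A legitimate sample that the stump classifies as class 1.
x = [0.52, 1.3]
print(stump_predict(x))        # class 1

# Adversarial example: a tiny perturbation (0.03) pushes the feature
# just below the split threshold and flips the classification.
x_adv = [x[0] - 0.03, x[1]]
print(stump_predict(x_adv))    # class 0
```

A robust tree learner counters this by accounting, during training, for the worst-case movement of samples within a perturbation budget around each candidate split.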
The work first introduces the essential background of the topic: machine learning, supervised learning, decision trees and tree ensembles, and the learning procedure of a robust decision tree model. Once the necessary background is covered, the methods and topics related to the research phase are reviewed, which allows the performance of both ordinary and robust decision trees to be examined under adversarial attacks. The research part of the thesis then presents the experiments and, finally, the results produced.
In the research part of the thesis, source code retrieved from GitHub is executed. The source code used 16 different datasets, each with two result classes; the datasets were retrieved from OpenML. Of these, 14 were structured datasets and the remaining 2 were image datasets. Only binary classification tasks were assigned to the studied trees and tree ensembles. The source code performed the attacks, built the trees and tree ensembles, and produced result figures on the accuracy, robustness, and time efficiency of the models.
The results produced by the research part of the thesis show that all robust decision tree models used in the study classify much more accurately under attack than the non-robust decision tree models used in the study. Among the single trees, the GROOT tree, which came within two percentage points of the most accurate decision tree, was the most time-efficient robust decision tree on almost all datasets. Among the robust decision tree ensembles, the GROOT forest, which performed best in prediction accuracy, was also the most time-efficient on almost all datasets. The results also show that, for the image datasets, the robust decision tree models included in the robustness part of the study were almost equally robust across all images in both image datasets.
Collections
- Bachelor's theses (Kandidaatintutkielmat) [8253]