Comparison of machine learning methods in the early identification of vasculitides, myositides and glomerulonephritides
Ryyppö, Rasmus; Häyrynen, Sergei; Joutsijoki, Henry; Juhola, Martti; Seppänen, Mikko R. J. (2023-01-08)
Computer Methods and Programs in Biomedicine
107917
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202311109561
Description
Peer reviewed
Abstract
Background: Rare disease diagnoses are often delayed by years, involving multiple doctor visits and potentially imprecise or incorrect diagnoses before the correct one is received. Machine learning could solve this problem by flagging potential patients that doctors should examine more closely.
Methods: To make the prediction setting as close as possible to the real situation, we tested different masking sizes. In the masking phase, data were removed: masking was applied to all data points following the first rare disease diagnosis, including the day the diagnosis was received, and additionally to a selected number of days before the initial diagnosis. The performance of the machine learning models was compared using positive predictive value (PPV), negative predictive value (NPV), prevalence PPV (pPPV), prevalence NPV (pNPV), accuracy (ACC) and area under the receiver operating characteristic curve (AUC).
Results: XGBoost had PPVs over 90 % in all masking settings, and InceptionVasGloMyotides had most of its PPVs over 90 %, but not as consistently. When the prevalence of the diseases was considered, XGBoost achieved its highest value of 8.8 % in binary classification with 30 days of masking, and InceptionVasGloMyotides achieved its best value of 6 %, also in binary classification, but with 2160 and 4320 days of masking. ACC varied between 89 % and 98 % with XGBoost and between 79 % and 94 % with InceptionVasGloMyotides. AUC varied between 72.6 % and 94.5 % with InceptionVasGloMyotides and between 69.9 % and 96.4 % with XGBoost.
Conclusions: XGBoost and InceptionVasGloMyotides could successfully predict rare diseases for patients at least 30 days prior to the initial rare disease diagnosis. In addition, we managed to build a well-performing custom deep learning model.
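The masking step and the prevalence-adjusted PPV mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the record layout (date and event pairs) and the example values are hypothetical, and the pPPV formula shown is the standard Bayes' theorem form, which is assumed here because the abstract does not give a definition.

```python
from datetime import date, timedelta


def mask_records(records, first_diagnosis, masking_days):
    """Drop every record on or after (first_diagnosis - masking_days).

    records: list of (date, event) tuples (hypothetical layout).
    With masking_days=0 this removes the diagnosis day and everything
    after it; larger values also remove that many days before it.
    """
    cutoff = first_diagnosis - timedelta(days=masking_days)
    return [(d, e) for d, e in records if d < cutoff]


def prevalence_adjusted_ppv(sensitivity, specificity, prevalence):
    """PPV at a given population prevalence via Bayes' theorem.

    Assumption: this standard formula stands in for the pPPV named in
    the abstract, whose exact definition is not stated there.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    total = true_pos + false_pos
    return true_pos / total if total > 0 else float("nan")


# Made-up patient history for illustration only.
history = [
    (date(2020, 1, 1), "visit"),
    (date(2020, 5, 15), "lab"),
    (date(2020, 6, 1), "rare disease diagnosis"),
    (date(2020, 7, 1), "follow-up"),
]
diagnosis_day = date(2020, 6, 1)

kept_0 = mask_records(history, diagnosis_day, masking_days=0)
kept_30 = mask_records(history, diagnosis_day, masking_days=30)
```

The sketch also suggests why a PPV measured on a balanced test set can be far higher than a prevalence-adjusted one: at low prevalence the false positives from the much larger healthy population dominate the denominator.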
Collections
- TUNICRIS publications [23753]