Identifying risk factors of long sickness absences: a registry-based study using explainable AI methods
Anttila, Anniina; Nuutinen, Mikko; Leskelä, Riikka Leena; Van Gils, Mark; Sauni, Riitta (2025-11-04)
BMJ Open
e101921
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-2025112610909
Description
Peer reviewed
Abstract
Objective To identify and explore variable groups and individual predictors of long sickness absences beyond well-known predictors such as service use and previous sickness absence, using machine learning, explainable artificial intelligence methods and a submodel approach.
Design Retrospective study of prospectively collected registry data on sickness absences and a questionnaire used in health examinations.
Setting Electronic medical record data of one large occupational health service provider in Finland.
Participants 11 533 employees of various occupations who, between 2011 and 2019, had at least once completed a health questionnaire that could be linked to service usage data and who had not had their initial health check within 1 year before or 3 months after completing the questionnaire.
Primary outcome measures To identify predictors of at least one long sickness absence period (≥30 days) during a 2-year follow-up.
Results The highest area under the receiver operating characteristic curve (AUROC) values among the submodel groups were for the sickness absence and service use submodels (0.68–0.74). The AUROC values for the submodels in the sociodemographic factors, health habits or diseases data category ranged from 0.55 to 0.67, and those for the submodels of questionnaire data likewise ranged from 0.55 to 0.67. The AUROC value of the ensemble model that combined all submodels was 0.79 (95% CI 0.788 to 0.794). The most important factors predicting long sickness absences based on the submodels were reported pain, number of symptoms and diseases, body mass index and short sleep duration. Additionally, several work-related and mental health-related variables increased the risk of long sickness absence.
Conclusions Variables other than service use and sickness absence increase the accuracy of predicting long sickness absence and provide information for planning interventions that could have a beneficial impact on work disability risk.
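As a reading aid for the AUROC figures above, the following is a minimal sketch of how a per-submodel AUROC and the AUROC of a combined score can be computed. It is not the authors' pipeline: this record does not state how the submodels were trained or combined, so the unweighted averaging in `average_ensemble`, the submodel names and all toy values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ensemble(submodel_scores: Dict[str, List[float]]) -> List[float]:
    """Combine submodels by the unweighted mean of their risk scores.
    Assumption for illustration only; the source does not specify
    the combination rule used in the study."""
    columns = list(submodel_scores.values())
    return [sum(vals) / len(vals) for vals in zip(*columns)]


# Hypothetical toy data: 1 = at least one long sickness absence in follow-up.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
submodels = {
    "sickness_absence": [0.9, 0.4, 0.7, 0.5, 0.2, 0.1, 0.3, 0.2],
    "questionnaire": [0.3, 0.8, 0.6, 0.2, 0.7, 0.3, 0.1, 0.4],
}

for name, scores in submodels.items():
    print(name, round(auroc(labels, scores), 3))
print("ensemble", round(auroc(labels, average_ensemble(submodels)), 3))
```

In this toy example the combined score ranks cases better than either submodel alone, which mirrors the qualitative pattern reported in the Results; the numbers themselves carry no meaning beyond the illustration.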
Collections
- TUNICRIS publications [23424]
