Cluster Analysis Reveals Subgroups with Different Risk Profiles and Sickness Absence Patterns in an Occupational Health Cohort
Anttila, Anniina; Nuutinen, Mikko; Leskelä, Riikka-Leena; van Gils, Mark; Pekki, Anu; Sauni, Riitta (2025-07-29)
29.07.2025
JOURNAL OF OCCUPATIONAL REHABILITATION
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202509018625
Description
Peer reviewed
Abstract
Purpose: Using unsupervised and supervised machine learning methods, we aimed to identify clinically relevant groups of employees with similar characteristics and to analyze the association of long and short sickness absence periods with these groups.
Methods: The participants were 12,099 employees of various occupations in Finnish companies. The data comprised 104 variables from medical records including data on sickness absences and a questionnaire used between 2011 and 2019 in health examinations. Principal component analysis was used to reduce the number of variables and to define latent dimensions for the employees. Clusters were calculated with the K-means algorithm from data points expressed by the resulting five principal components. Logistic regression analyses were used to assess the associations of the clusters with long (> 30 days) and repetitive short (1–10 days) sickness absence (SA) episodes.
Results: Employees in cluster one indicated positive managerial performance and workplace atmosphere and had the lowest occurrence of both short and long SA. Cluster two indicated deficiencies related to managerial performance and workplace atmosphere. Cluster three had deficiencies mainly related to mood and depression, and cluster four had cardiovascular diseases. Employees in cluster five reported many symptoms, especially dizziness and sensory symptoms, and had the highest occurrence of repetitive short SA. Cluster six indicated deficiencies related to work ability and had the highest occurrence of a long SA episode during follow-up.
Conclusion: Unsupervised and supervised machine learning methods identified six clinically coherent employee clusters, providing information on typical combinations of characteristics and risk profiles of sickness absence.
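The three analysis steps named in the abstract (principal component analysis, K-means clustering on the component scores, and logistic regression on cluster membership) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the software, preprocessing, or model settings used. The data here are synthetic, the function names (`pca_scores`, `kmeans`, `fit_logistic`, `run_sketch`) are hypothetical, and z-scoring the variables, the unadjusted model, and the choice of cluster 0 as the reference category are all assumptions made for illustration only.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project standardized data onto its leading principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # z-score each variable (assumption)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T


def kmeans(Z, k, rng, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per row of Z."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton's method; returns [intercept, coefs...]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        w = p * (1.0 - p)
        # Tiny ridge term keeps the Hessian invertible if a cluster is empty
        hessian = Xb.T @ (Xb * w[:, None]) + 1e-8 * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hessian, Xb.T @ (y - p))
    return beta


def run_sketch(seed=0, n=600, n_vars=20, n_components=5, k=6):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, n_vars))            # synthetic stand-in, not study data
    Z = pca_scores(X, n_components)             # five component scores per person
    labels = kmeans(Z, k, rng)                  # six clusters, as in the abstract
    # Synthetic binary outcome (e.g. "had a long SA episode"), tied to component 1
    p_true = 1.0 / (1.0 + np.exp(-(0.8 * Z[:, 0] - 1.0)))
    y = (rng.random(n) < p_true).astype(float)
    dummies = np.eye(k)[labels][:, 1:]          # cluster 0 is the reference (assumption)
    beta = fit_logistic(dummies, y)
    odds_ratios = np.exp(beta[1:])              # odds of the outcome vs. cluster 0
    return Z, labels, odds_ratios


if __name__ == "__main__":
    Z, labels, odds_ratios = run_sketch()
    print("cluster sizes:", np.bincount(labels, minlength=6))
    print("odds ratios vs. cluster 0:", np.round(odds_ratios, 2))
```

Each odds ratio compares the odds of the outcome in one cluster with the odds in the reference cluster. A real analysis would typically add covariates such as age and sex and report confidence intervals, neither of which this sketch attempts.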
Collections
- TUNICRIS publications [22195]
