Assessing clusters of comorbidities in rheumatoid arthritis: a machine learning approach

Daniel H. Solomon; Hongshu Guan; Fredrik Johansson; Leah Santacroce; Wendi Malley; Lin Guo; Heather J. Litman

doi:10.1186/s13075-023-03191-8

Journal article, 2023

Background: Comorbid conditions are very common in rheumatoid arthritis (RA), and several prior studies have clustered them using machine learning (ML). We applied various ML algorithms to compare the clusters of comorbidities derived and to assess the value of the clusters for predicting future clinical outcomes.
Methods: A large US-based RA registry, CorEvitas, was used to identify patients for the analysis. We assessed the presence of 24 comorbidities, and ML was used to derive clusters of patients with given comorbidities. K-modes, K-means, regression-based, and hierarchical clustering were used. To assess the value of these clusters, we compared clusters across different ML algorithms in clinical outcome models predicting the Clinical Disease Activity Index (CDAI) and the Health Assessment Questionnaire Disability Index (HAQ-DI). We used data from the first 3 years of the 6-year study period to derive clusters and assessed time-averaged values of CDAI and HAQ-DI during the latter 3 years. Model fit was assessed via adjusted R² and root mean square error for a series of models that included either the clusters from ML clustering or each of the 24 comorbidities separately.
Results: A total of 11,883 patients with RA who had longitudinal data over 6 years were included. At baseline, patients were on average 59 (SD 12) years of age, 77% were women, CDAI was 11.3 (SD 11.9, moderate disease activity), HAQ-DI was 0.32 (SD 0.42), and disease duration was 10.8 (SD 9.9) years. During the 6 years of follow-up, the percentage of patients with various comorbidities increased. Using five clusters produced by each of the ML algorithms, multivariable regression models with time-averaged CDAI as the outcome showed that the ML-derived comorbidity clusters produced models that fit about as well as models with each of the 24 comorbidities entered individually. The same patterns were observed for HAQ-DI.
Conclusions: Clustering comorbidities using ML algorithms is not computationally complex but often results in clusters that are difficult to interpret from a clinical standpoint. While ML clustering is useful for modeling multi-omics, using clusters to predict clinical outcomes produces models with a fit similar to that of models with individual comorbidities.
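
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, only K-means (one of the four clustering methods named) is shown, and the function names kmeans and fit_metrics, the 15% prevalence, and the simulated outcome are all assumptions made for illustration. The sketch clusters patients on a binary matrix of 24 comorbidities, then fits two linear models for a CDAI-like outcome, one with cluster membership and one with the 24 comorbidities entered individually, and reports adjusted R² and root mean square error for each.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every patient to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def fit_metrics(design, y):
    """Ordinary least squares with an intercept; returns (adjusted R^2, RMSE)."""
    n = len(y)
    A = np.column_stack([np.ones(n), design])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    p = np.linalg.matrix_rank(A) - 1  # effective number of predictors
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    rmse = float(np.sqrt(ss_res / n))
    return adj_r2, rmse


# Synthetic stand-in for the registry data (assumption, not real patients).
rng = np.random.default_rng(42)
n_patients, n_comorb, k = 2000, 24, 5
X = (rng.random((n_patients, n_comorb)) < 0.15).astype(float)
# Hypothetical outcome: six comorbidities raise a CDAI-like score, plus noise.
y = 10 + 2.0 * X[:, :6].sum(axis=1) + rng.normal(0, 5, n_patients)

labels = kmeans(X, k)
# Indicator columns for cluster membership, with cluster 0 as the reference.
cluster_dummies = np.eye(k)[labels][:, 1:]

adj_r2_clusters, rmse_clusters = fit_metrics(cluster_dummies, y)
adj_r2_individual, rmse_individual = fit_metrics(X, y)

print(f"clusters:   adj R2 = {adj_r2_clusters:.3f}, RMSE = {rmse_clusters:.2f}")
print(f"individual: adj R2 = {adj_r2_individual:.3f}, RMSE = {rmse_individual:.2f}")
```

On synthetic data the two fits need not resemble the study's findings; the sketch only shows how the two kinds of model can be put on the same adjusted R² and RMSE footing.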

Authors

Daniel H. Solomon

Brigham and Women's Hospital

Harvard Medical School

Hongshu Guan

Brigham and Women's Hospital

Fredrik Johansson

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Leah Santacroce

Brigham and Women's Hospital

Wendi Malley

CorEvitas LLC

Lin Guo

CorEvitas LLC

Heather J. Litman

CorEvitas LLC

Arthritis Research and Therapy

1478-6354 (ISSN) 1478-6362 (eISSN)

Vol. 25, Issue 1, Article 224

Subject Categories (SSIF 2011)

Rheumatology and Autoimmunity

DOI

10.1186/s13075-023-03191-8

PubMed

37993918

More information

Latest update

12/1/2023
