To use a deep learning (DL) model trained via federated learning (FL), a method of collaborative training that does not require sharing patient data, to delineate institutional differences in clinician diagnostic paradigms and disease epidemiology in retinopathy of prematurity (ROP).
Evaluation of a diagnostic test or technology.
SUBJECTS, PARTICIPANTS, AND/OR CONTROLS: 5,245 patients with wide-angle retinal imaging from the neonatal intensive care units of 7 institutions as part of the Imaging and Informatics in ROP (i-ROP) study. Images were labeled with the clinical diagnosis of plus disease (plus, pre-plus, no plus) that was documented in the chart, and a reference standard diagnosis (RSD) determined by three image-based ROP graders and the clinical diagnosis.
Demographics (birthweight [BW], gestational age [GA]) and clinical diagnoses for all eye exams were recorded from each institution. Using an FL approach, a DL model for plus disease classification was trained using only the clinical labels. The three class probabilities were then converted into a vascular severity score (VSS) for each eye exam, as well as an “institutional VSS,” calculated for each institution as the average of the VSS values assigned to patients’ higher severity (“worse”) eyes at each exam.
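The two scoring steps described above can be illustrated with a minimal sketch. The abstract does not state how the three class probabilities are combined, so the 1/5/9 class weights below are an assumption chosen only to produce a score on a 1 to 9 scale; the function names and the exam tuple layout are likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean

# ASSUMPTION: class weights on a 1-9 scale. The abstract does not give
# the actual conversion from class probabilities to VSS.
WEIGHTS = {"no_plus": 1.0, "pre_plus": 5.0, "plus": 9.0}


def vascular_severity_score(probs):
    """Probability-weighted severity for one eye exam.

    probs maps each class name in WEIGHTS to its predicted probability.
    """
    if abs(sum(probs.values()) - 1.0) > 1e-6:
        raise ValueError("class probabilities must sum to 1")
    return sum(WEIGHTS[cls] * p for cls, p in probs.items())


def institutional_vss(exams):
    """Mean worse-eye VSS per institution.

    exams is an iterable of (institution, right_eye_vss, left_eye_vss),
    one tuple per exam (hypothetical layout). For each exam the higher of
    the two eye scores is kept, then averaged within each institution.
    """
    worse_eye = defaultdict(list)
    for institution, right_vss, left_vss in exams:
        worse_eye[institution].append(max(right_vss, left_vss))
    return {inst: mean(scores) for inst, scores in worse_eye.items()}
```

Under these assumed weights, an eye predicted as half no plus and half pre-plus scores 3.0, and each institution's value is simply the mean of the per-exam maxima.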
We compared demographics, clinical diagnosis of plus disease, and institutional VSS between institutions using the McNemar-Bowker test, two-proportion Z test, and one-way ANOVA with post hoc analysis by the Tukey-Kramer test. Single regression analysis was performed to explore the relationship between demographics and VSS.
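As an illustration of the regression step, the following sketch fits a one-predictor least-squares line and reports the adjusted R². It assumes one data point per institution (for example, mean GA against institutional VSS); the function name is hypothetical, and the p-value is omitted because it requires a t-distribution that the Python standard library does not provide.

```python
from statistics import mean


def simple_regression(x, y):
    """Ordinary least squares with a single predictor.

    Returns (slope, intercept, r_squared, adjusted_r_squared).
    Needs at least 3 points so the adjusted R^2 denominator (n - 2)
    is positive.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot
    # Adjusted R^2 with one predictor: 1 - (1 - R^2) * (n - 1) / (n - 2)
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - 2)
    return slope, intercept, r_squared, adjusted
```

A negative slope from such a fit would correspond to the inverse relationship between mean GA and institutional VSS that the study tests for.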
We found that the proportion of patients diagnosed with pre-plus disease varied significantly between institutions (p<0.001). Using the DL-derived VSS trained on the data from all institutions using FL, we observed differences in the institutional VSS, as well as in the level of vascular severity diagnosed as no plus (p<0.001), across institutions. A significant inverse relationship between the institutional VSS and the mean GA was found (p=0.049, adjusted R²=0.49).
A DL-derived ROP VSS developed using FL, without sharing data between institutions, identified differences between institutions in the clinical diagnosis of plus disease and in overall levels of ROP severity. FL may represent a method to standardize clinical diagnosis and provide objective measurement of disease for image-based diseases.

Copyright © 2022. Published by Elsevier Inc.
