Multi-center validation of an artificial intelligence system for detection of COVID-19 on chest radiographs in symptomatic patients

Kuo, Michael D; Chiu, Keith W H; Wang, David S; Larici, Anna Rita; Poplavskiy, Dmytro; Valentini, Adele; Napoli, Alessandro; Borghesi, Andrea; Ligabue, Guido; Fang, Xin Hao B; Wong, Hing Ki C; Zhang, Sailong; Hunter, John R; Mousa, Abeer; Infante, Amato; Elia, Lorenzo; Golemi, Salvatore; Yu, Leung Ho P; Hui, Christopher K M; Erickson, Bradley J

doi:10.1007/s00330-022-08969-z

Objectives: While chest radiograph (CXR) is the first-line imaging investigation in patients with respiratory symptoms, differentiating COVID-19 from other respiratory infections on CXR remains challenging. We developed and validated an AI system for COVID-19 detection on presenting CXR.

Methods: A deep learning model (RadGenX), trained on 168,850 CXRs, was validated on a large international test set of presenting CXRs of symptomatic patients from 9 study sites (US, Italy, and Hong Kong SAR) and 2 public datasets from the US and Europe. Performance was measured by area under the receiver operating characteristic curve (AUC). Bootstrapped simulations were performed to assess performance across a range of potential COVID-19 disease prevalence values (3.33 to 33.3%). Comparison against international radiologists was performed on an independent test set of 852 cases.

Results: RadGenX achieved an AUC of 0.89 on 4-fold cross-validation and an AUC of 0.79 (95% CI 0.78-0.80) on an independent test cohort of 5,894 patients. DeLong's test showed statistically significant differences in model performance across patients from different regions (p < 0.01), disease severity (p < 0.001), gender (p < 0.001), and age (p = 0.03). Prevalence simulations showed that the negative predictive value increases from 86.1% at 33.3% prevalence to greater than 98.5% at any prevalence below 4.5%. Compared with radiologists, McNemar's test showed that the model has higher sensitivity (p < 0.001) but lower specificity (p < 0.001).

Conclusion: An AI model that predicts COVID-19 infection on CXR in symptomatic patients was validated on a large international cohort, providing valuable context on testing and performance expectations for AI systems that perform COVID-19 prediction on CXR.
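The dependence of negative predictive value (NPV) on prevalence reported in the abstract can be illustrated with Bayes' rule. The sketch below is only an illustration: it uses the closed-form NPV formula rather than the bootstrapped simulations described in the abstract, and the sensitivity and specificity values are hypothetical placeholders, not figures from the paper.

```python
def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' rule.

    NPV = P(no disease | negative test)
        = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    """
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical operating point, NOT taken from the paper.
SENS, SPEC = 0.80, 0.60

# Prevalence values spanning the range mentioned in the abstract (3.33% to 33.3%).
for prev in (0.0333, 0.045, 0.10, 0.20, 0.333):
    print(f"prevalence={prev:.3f}  NPV={npv(SENS, SPEC, prev):.3f}")
```

For a fixed operating point, NPV rises as prevalence falls, which matches the direction of the trend reported in the abstract. The exact values depend on the model's actual sensitivity and specificity, which the abstract does not state.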

Multi-center validation of an artificial intelligence system for detection of COVID-19 on chest radiographs in symptomatic patients / Kuo, Michael D; Chiu, Keith W H; Wang, David S; Larici, Anna Rita; Poplavskiy, Dmytro; Valentini, Adele; Napoli, Alessandro; Borghesi, Andrea; Ligabue, Guido; Fang, Xin Hao B; Wong, Hing Ki C; Zhang, Sailong; Hunter, John R; Mousa, Abeer; Infante, Amato; Elia, Lorenzo; Golemi, Salvatore; Yu, Leung Ho P; Hui, Christopher K M; Erickson, Bradley J. - In: EUROPEAN RADIOLOGY. - ISSN 0938-7994. - 33:1(2023), pp. 23-33. [10.1007/s00330-022-08969-z]

Ligabue, Guido: member of the Collaboration Group




Publication year	2023
Journal	EUROPEAN RADIOLOGY
Volume	33
Issue	1
First page	23
Last page	33
DOI	https://dx.doi.org/10.1007/s00330-022-08969-z
WoS ID	WOS:000819892400001
Scopus ID	2-s2.0-85133274530
PubMed ID	35779089
Type	Journal article

Files in this record:

File	Size	Format
2023 eur rad s00330-022-08969-z.pdf (Open access; Type: VOR, version published by the publisher)	2.56 MB	Adobe PDF


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.