Semi-supervised learning with natural language processing for right ventricle classification in echocardiography - a scalable approach
Journal article, 2022

We created a deep learning model, trained on text classified by natural language processing (NLP), to assess right ventricular (RV) size and function from echocardiographic images. We included 12,684 examinations with corresponding written reports for text classification. After manual annotation of 1489 reports, we trained an NLP model to classify the remaining 10,651 reports. A view classifier was developed to select the 4-chamber or RV-focused view from an echocardiographic examination (n = 539). The final models were two image classification models trained on the predicted labels from the combined manual annotation and NLP models and the corresponding echocardiographic view to assess RV function (training set n = 11,008) and size (training set n = 9951). The text classifier identified impaired RV function with 99% sensitivity and 98% specificity and RV enlargement with 98% sensitivity and 98% specificity. The view classification model identified the 4-chamber view with 92% accuracy and the RV-focused view with 73% accuracy. The image classification models identified impaired RV function with 93% sensitivity and 72% specificity and an enlarged RV with 80% sensitivity and 85% specificity; agreement with the written reports was substantial (both κ = 0.65). Our findings show that models for automatic image assessment can be trained to classify RV size and function by using model-annotated data from written echocardiography reports. This pipeline for auto-annotation of the echocardiographic images, using an NLP model with medical reports as input, can be used to train an image-assessment model without manual annotation of images and enables fast and inexpensive expansion of the training dataset when needed.
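The abstract reports sensitivity, specificity, and Cohen's κ for binary classifiers. As a minimal sketch of how these metrics are conventionally computed from paired labels, the following Python snippet uses the standard textbook definitions. The function names and the toy label lists are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = finding present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives that the classifier detects."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives that the classifier correctly rejects."""
    return tn / (tn + fp)


def cohen_kappa(tp: int, fp: int, tn: int, fn: int) -> float:
    """Chance-corrected agreement between two binary raters."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Expected agreement if both raters labelled independently
    # with their own marginal positive and negative rates.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels (NOT study data): report label vs. model prediction.
report_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(report_labels, model_labels)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
kappa = cohen_kappa(tp, fp, tn, fn)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

Because κ corrects for agreement expected by chance, it can be noticeably lower than raw accuracy when one class dominates the data.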



Right ventricle



Machine learning

Natural language processing



Eva Hagberg

University of Gothenburg

David Hagerman Olzon

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering, Signal Processing

Richard Johansson

Chalmers, Computer Science and Engineering, Data Science

University of Gothenburg

Nasser Hosseini

Sahlgrenska University Hospital

Jan Liu

Student at Chalmers

Elin Björnsson

Student at Chalmers

Jennifer Alvén

Digital Image Systems and Image Analysis

University of Gothenburg

Ola Hjelmgren

Sahlgrenska University Hospital

University of Gothenburg

Computers in Biology and Medicine

0010-4825 (ISSN)

Vol. 143 105282


Information and Communication Technology

Health Engineering

Life Science Engineering (2010-2018)


Language Technology (Computational Linguistics)


Radiology and Image Processing

Computer Vision and Robotics (Autonomous Systems)

Medical Image Processing


