Influence of Loss Function on Left Ventricular Volume and Ejection Fraction Estimation in Deep Neural Networks


Project Introduction


Quantification of the left ventricle shape is crucial in evaluating cardiac function from 2D echocardiographic images. This study investigates the applicability of established loss functions when optimising the U-Net model for 2D echocardiographic left ventricular segmentation.
Our results indicate that the choice of loss function significantly affects left ventricular volume measurements, even when the differences in established segmentation metrics are imperceptible.





Fig. 1. An example from the dataset



Dataset


A total of 1224 videos of the apical four-chamber echocardiographic view, acquired between 2015 and 2016, were extracted from Imperial College Healthcare NHS Trust’s echocardiogram database. The images were acquired by experienced echocardiographers according to standard protocols, using ultrasound equipment from GE and Philips.
Ethical approval was obtained from the Health Regulatory Agency (Integrated Research Application System identifier 243023). From these videos, a total of 2600 images were sampled from different points in the cardiac cycle. Each image was labelled by one individual from a pool of experts using our in-house online labelling platform (https://unityimaging.net). This dataset was used for model development (i.e., training and validation).
The testing dataset comprised 100 videos from consecutive studies conducted over 3 working days in 2019, at least 3 years after the collection of the model development dataset. Our previous deep neural network (Lane et al., 2021) was used to identify the end-diastolic and end-systolic frames of the 100 videos. These selected frames were used for the human expert annotations. Each of the 200 resulting images was then labelled by 11 experts using the same platform. The consensus of the experts was used as the ground truth in the testing dataset.



Method


U-Net was implemented in TensorFlow and was trained on an Nvidia RTX3090 GPU. The loss functions chosen for experimentation are common for image segmentation tasks and were selected from three categories:

• Distribution-based loss: Binary cross entropy (BCE) loss
• Region-based loss: Dice loss, Tversky loss, Focal Tversky loss
• Compound loss: BCE+Dice
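
The five losses listed above can be sketched in NumPy as follows. This is a minimal illustration and not the TensorFlow training code used in this study. The function names, the smoothing constant EPS and the default values of alpha, beta and gamma are assumptions, since the text does not state them. The Focal Tversky exponent convention (raising the Tversky loss to the power gamma) is also an assumption, as conventions differ between implementations.

```python
import numpy as np

EPS = 1e-7  # assumed smoothing constant (avoids log(0) and division by zero)

def bce_loss(y_true, y_pred):
    """Distribution-based: binary cross entropy averaged over all pixels."""
    p = np.clip(y_pred, EPS, 1.0 - EPS)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def tversky_index(y_true, y_pred, alpha, beta):
    """Tversky index: alpha weights false positives, beta weights false negatives."""
    tp = np.sum(y_true * y_pred)
    fp = np.sum((1.0 - y_true) * y_pred)
    fn = np.sum(y_true * (1.0 - y_pred))
    return float((tp + EPS) / (tp + alpha * fp + beta * fn + EPS))

def dice_loss(y_true, y_pred):
    """Region-based: 1 - Dice, i.e. the Tversky loss with alpha = beta = 0.5."""
    return 1.0 - tversky_index(y_true, y_pred, 0.5, 0.5)

def tversky_loss(y_true, y_pred, alpha=0.3, beta=0.7):
    """Region-based: 1 - Tversky index (alpha and beta values are assumed)."""
    return 1.0 - tversky_index(y_true, y_pred, alpha, beta)

def focal_tversky_loss(y_true, y_pred, alpha=0.3, beta=0.7, gamma=0.75):
    """Region-based: Tversky loss raised to an assumed focusing exponent gamma."""
    return tversky_loss(y_true, y_pred, alpha, beta) ** gamma

def bce_dice_loss(y_true, y_pred):
    """Compound: unweighted sum of BCE and Dice loss (equal weighting is assumed)."""
    return bce_loss(y_true, y_pred) + dice_loss(y_true, y_pred)
```

Writing Dice as a special case of the Tversky index keeps the region-based losses consistent with one another, so that only the false-positive and false-negative weights differ between them.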

After training a model with each loss function, each model was evaluated using established evaluation metrics for segmentation tasks (i.e., Dice similarity coefficient (DSC) and Hausdorff distance (HD)) and domain-specific metrics (i.e., volume and ejection fraction (EF) measurements), with the error averaged across all images in the testing dataset. The volume was computed using Simpson’s method.
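
The two established segmentation metrics can be sketched as below. This is an illustrative NumPy version under stated assumptions, not the evaluation code used in this study: the function names are hypothetical, the Dice coefficient is computed on binary masks, and the Hausdorff distance is computed by brute force on two arrays of border points given as (row, column) coordinates.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice coefficient between two binary masks (1.0 means identical)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return float(2.0 * np.logical_and(a, b).sum() / total)

def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two (N, 2) arrays of border points."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Largest nearest-neighbour distance in each direction, then the maximum.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

The Dice coefficient measures area overlap, while the Hausdorff distance reports the single worst border mismatch. Neither is expressed in terms of volume.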

The average volume error is calculated from the Cartesian pixel coordinates of the endocardial border, as the difference between the volume enclosed by the ground-truth border and the volume enclosed by the predicted border.
EF was estimated by dividing the stroke volume (i.e., the difference between end-diastolic and end-systolic volumes) by the end-diastolic volume.
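
The volume and EF computations can be sketched as follows. This is a simplified illustration under several assumptions and not the implementation used in this study: it works on a binary mask rather than border coordinates, assumes the LV long axis is aligned with the image rows so that each row forms one disk, takes a hypothetical pixel_spacing_cm parameter (1 cm³ = 1 ml), and reports the volume error as an absolute difference.

```python
import numpy as np

def simpson_volume_ml(mask, pixel_spacing_cm):
    """Single-plane method of disks: each mask row is treated as one circular disk."""
    mask = np.asarray(mask, dtype=bool)
    diameters_cm = mask.sum(axis=1) * pixel_spacing_cm  # LV width in each row
    disk_height_cm = pixel_spacing_cm                   # one row per disk
    return float(np.sum(np.pi / 4.0 * diameters_cm ** 2 * disk_height_cm))

def ejection_fraction(edv_ml, esv_ml):
    """EF in percent: stroke volume divided by end-diastolic volume."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

def volume_error_ml(true_volume_ml, predicted_volume_ml):
    """Absolute difference between ground-truth and predicted volume (assumed)."""
    return abs(predicted_volume_ml - true_volume_ml)
```

Because each disk contributes its diameter squared, a small systematic shift in the predicted border changes the volume more than it changes the overlap area.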

Architecture

Fig. 2. U-Net Model


Experiment Results and Discussion

Table 1. DSC, HD and Average Volume Error for each loss function

Figure 1: End-diastolic (left) and end-systolic (right) echo images with ground-truth (red) and predicted (green) endocardial borders, showing a volume error of 17.60 ml with an EF error of 2.38%.




Table 1 highlights the minuscule variation in DSC and HD across all loss functions despite vast differences in Average Volume Error. Hence, excellent scores for DSC and HD do not imply optimal volume computation for EF. For instance, with the compound BCE+Dice loss, a large volume error is observed, while DSC, being insensitive to the shape of the predicted LV border, does not reflect it.

Interestingly, some large errors in the volume measurements may cancel out in the EF measurements. One example of such a scenario is shown in Figure 1, where a volume error of 17.60 ml is accompanied by an EF error of only 2.38%.
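
The cancellation effect follows from EF being a ratio of volumes. The sketch below uses hypothetical volumes, not values from this study, and assumes the model overestimates both the end-diastolic and end-systolic volumes by the same factor of 1.15. The mean volume error is then about 12.75 ml, yet the EF error is essentially zero.

```python
def ef_percent(edv_ml, esv_ml):
    """EF in percent: stroke volume divided by end-diastolic volume."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Hypothetical ground-truth volumes in ml (illustrative only).
true_edv, true_esv = 120.0, 50.0

# Assumed systematic overestimation of both volumes by the same factor.
scale = 1.15
pred_edv, pred_esv = scale * true_edv, scale * true_esv

# Mean absolute volume error over the two frames.
mean_volume_error = (abs(pred_edv - true_edv) + abs(pred_esv - true_esv)) / 2.0

# Absolute EF error in percentage points: the common factor cancels in the ratio.
ef_error = abs(ef_percent(pred_edv, pred_esv) - ef_percent(true_edv, true_esv))
```

When the relative errors at end-diastole and end-systole differ, the cancellation is only partial, which is consistent with a small but non-zero EF error alongside a large volume error.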

Our results demonstrate that evaluating model performance using established segmentation metrics alone is insufficient when estimating LV volume.

Our future research will investigate how well a loss function generalises across multiple echocardiographic image datasets when training a segmentation model, with the objective of improving the estimation of EF in clinical practice.