
University of Amsterdam

Master Thesis

(Semi-)automatic fetus segmentation and visualisation in 3D ultrasound: a promising perspective

Author:

Romy Meester

Supervisor and examiner: Dr. R.G. (Rob) Belleman

Assessor: Dr. J.A. (Jaap) Kaandorp

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computational Science

in the Visualisation Lab, Informatics Institute


Declaration of Authorship

I, Romy Meester, declare that this thesis, entitled ‘(Semi-)automatic fetus segmentation and visualisation in 3D ultrasound: a promising perspective’ and the work presented in it are my own. I confirm that:

 This work was done wholly or mainly while in candidature for a research degree at the University of Amsterdam.

 Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.

 Where I have consulted the published work of others, this is always clearly attributed.

 Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

 I have acknowledged all main sources of help.

 Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Date: 25 November 2020


“Always be curious about the next step on your journey.”

Romy Meester


UNIVERSITY OF AMSTERDAM

Abstract

Faculty of Science, Informatics Institute

Master of Science in Computational Science

(Semi-)automatic fetus segmentation and visualisation in 3D ultrasound: a promising perspective

by Romy Meester

Ultrasound imaging is a widely used medical imaging technique to examine the fetus during pregnancy. Sonographers use the clinical images to determine the well-being of the fetus. However, interpreting these images depends on skill and experience, which leads to subjective diagnoses. This research investigates the development of an educational resource to train sonographers to recognise and detect anomalies in fetal ultrasonography. In particular, the tool should visualise the fetus such that the sonographers' view is not obscured by surrounding tissues they are not interested in. Since manually segmenting the fetus is time consuming, this research examines denoising filters, namely the Gaussian filter, median filter, curvature flow filter, and anisotropic diffusion filter, to enhance the ultrasound images for the analysed (semi-)automatic segmentation models that isolate the fetus. These models include the heuristic semi-automatic and fully automatic watershed segmentation models, and the deep learning U-net. Three visualisation applications, namely volume rendering in VTK, Volume Viewer, and Exposure Render, are examined. Results on first-trimester fetal ultrasound datasets indicate that the anisotropic diffusion filter efficiently reduces speckle noise and potential artifacts. Of the heuristic models examined, the semi-automatic watershed segmentation model using original images as input showed the best results. Of all the models examined, however, the U-net with ReLU activation functions and smoothed input images showed the most promising results in segmenting the fetus from the ultrasound images. With the implemented segmentation masks, the visualisations of Volume Viewer provide a promising perspective.


Acknowledgements

Working on this thesis would have been an even more challenging task without the support, guidance, and useful insight of many people whom I would like to thank for helping to make this thesis possible.

A special thank-you goes to my supervisor and examiner, Dr. Rob Belleman, for offering me the opportunity to join the UvA Visualisation Lab. Fascinated by his passion in his lectures for the course ‘Scientific visualisation and virtual reality’, I was convinced I wanted to contribute to research in the field of image processing and/or visualisation techniques. His enthusiasm was reflected in the way he always supported me, especially in moments of doubt, gave insightful comments on my progress, and motivated me throughout the entire process of making this thesis.

I had the honour of cooperating with Jaco Hagoort and Marieke Buijtendijk, who are both employed as medical imaging specialists at the Amsterdam UMC, location AMC. In collaboration with Harsha Shah, of the Obstetrics and Gynaecology department of Imperial College London, they are part of the 3D ultrasound atlas research team. I want to thank them for their time and for sharing their expertise during the multiple (online) meetings, and especially for raising interesting thoughts for improving this research. Besides that, I am very grateful to Harsha for securing approval for the ultrasound datasets and providing these. Moreover, special thanks go to Marieke, Jaco and his student for converting and even manually segmenting the fetus in the datasets. Without them I would not have been able to evaluate my methods.

Many warm thanks go to my wonderful friends and family for their loving support, inspiring words, and amusing (online game) activities. Especially, I want to thank Steven and Natasja for their comfort and encouragement during relaxing walks. I would also like to express my sincere thanks to Linda, Emma and Jasper for their company, running sessions, and for attending the online meetings we organised to encourage ourselves through difficult times. Most of all, a special word of thanks goes to my parents, brother and sister for their unconditional love and encouraging words, knowing I would succeed. Even though you were on the other side of the world, Amber, you still managed to provide me with great inspiration and motivation. Thanks for lending me your computer screen, Floris. I really appreciated the unconditional patience, love, and cards with warm words and Pickwick slogans sent by my parents, which helped me overcome the challenges and shortcomings.


Contents

Declaration of Authorship
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations

1 Introduction
1.1 Research scope, requirements and challenges
1.2 Research questions
1.3 Outline

2 Theoretical background
2.1 Physics of ultrasound
2.2 Data acquisition and visualisation

3 Related work
3.1 Image segmentation
3.2 Deep learning
3.3 Applications

4 Methodology
4.1 Dataset specifications
4.1.1 Data acquisition and preparation
4.1.2 The other datasets
4.2 Preprocessing: noise reduction filters
4.2.1 Gaussian filter
4.2.2 Median filter
4.2.3 Anisotropic diffusion filter
4.2.4 Curvature flow filter
4.3 Heuristic segmentation models
4.3.1 Semi-automatic watershed segmentation
4.3.2 Fully automatic watershed segmentation
4.4 Deep learning segmentation model: U-net
4.5 Volume visualisation
4.5.1 Volume rendering in VTK
4.5.2 Volume Viewer
4.5.3 Exposure Render
4.6 Experiments and analysis
4.6.1 Noise reduction filters
4.6.2 Segmentation models
4.6.2.1 Experiment heuristic models
4.6.2.2 Experiment deep learning model
4.6.2.3 Evaluation of the models
4.6.3 Volume visualisation
4.7 Software

5 Results
5.1 Filter analysis
5.2 Heuristic segmentation models
5.3 Deep learning segmentation
5.4 Volume visualisation
5.4.1 Volume rendering in VTK
5.4.2 Volume Viewer
5.4.3 Exposure Render
5.4.4 Comparison

6 Discussion
6.1 Key findings
6.1.1 Noise reduction filters
6.1.2 Heuristic segmentation models
6.1.3 Deep learning segmentation: U-net
6.1.4 Volume rendering
6.2 Scope and limitations
6.3 Future work

7 Conclusion

References

Appendices
A Methodology
B Results


List of Figures

1 Where my story begins.
1.1 Non-isolated region images of the fetus.
1.2 3D Ultrasound Atlas of Fetal Development.
2.1 Schematic representation of the mirror artifact.
2.2 Transducer technologies used for 3D US acquisition.
3.1 U-net architecture.
3.2 3D Realistic Vue rendering mode image.
4.1 Volume visualisation in Paraview of the image conversion.
4.2 Image slice 300/600 in coronal view of dataset1 in conical shape.
4.3 Image slice 76/151 in coronal view of dataset1 in rectangular shape.
4.4 Real, original datasets in conical shape.
4.5 Flowchart of the watershed segmentation models.
4.6 Intermediate results of the semi-automatic watershed segmentation.
4.7 Intermediate results of the fully automatic watershed segmentation.
4.8 Activation functions.
4.9 Flowchart of the VTK pipeline.
4.10 Mock-up of the developed VTK tool.
5.1 Experimental results of denoising filters.
5.2 Filtering results of dataset5.
5.3 3D plot of surface potential.
5.4 Boxplots of the metric results of the heuristic segmentation models.
5.5 Heuristic segmentation model results of dataset1.
5.6 Progress results of the training and validation set of the U-net with ReLU activation functions and the smoothed images as an input.
5.7 Qualitative satisfying segmentation results of dataset6 in the test set of the U-net with ReLU activation functions and smoothed input images.
5.8 Qualitative error-prone segmentation results of dataset6 in the test set of the U-net with ReLU activation functions and smoothed input images.
5.9 VTK tool in a developing stage.
5.10 Ultrasound volumes rendered in VTK.
5.11 Ultrasound volumes rendered in Volume Viewer.
5.12 Ultrasound volumes rendered in Exposure Render.

1 Voxel intensity distributions.
2 Structure of the data.
3 Bar plot of the metric results of the filters.
4 Filtering results of dataset1.
5 Bar plot of the metric results of the heuristic models.
6 Progress results of the training and validation set of the U-net with ReLU activation functions and the original images as an input.
7 Progress results of the training and validation set of the U-net with ELU activation functions and the original images as an input.
8 Progress results of the training and validation set of the U-net with ELU activation functions and the smoothed images as an input.


List of Tables

2.1 Summary of ultrasound artifacts.
4.1 Parameters of the U-net.
4.2 Confusion matrix.
5.1 Quantitative assessment of the experimental filter analysis.
5.2 Quantitative assessment of the filter analysis.
5.3 Quantitative assessment of the heuristic segmentation models.
5.4 Quantitative assessment of the training set of the U-net.
5.5 Quantitative assessment of the test set of the U-net.

1 Metadata of the datasets.


Abbreviations

2D Two Dimensional
3D Three Dimensional
AMC Academic Medical Center
CHD Congenital Heart Disease
CNN Convolutional Neural Network
CT Computed Tomography
CTF Color Transfer Function
DICOM Digital Imaging and Communications in Medicine
DVR Direct Volume Rendering
DL Deep Learning
DPCS DICOM Patient Coordinate System
DSC Dice Similarity Coefficient
ELU Exponential Linear Unit
FCN Fully Convolutional Network
FN False Negative
FP False Positive
GUI Graphical User Interface
HD Hausdorff Distance
IoU Intersection over Union
LSI Linear Shift-Invariant
MSD Maximum Symmetric Contour Distance
MCRT Monte Carlo Ray Tracing
ML Machine Learning
MPR MultiPlanar Reconstruction
MRI Magnetic Resonance Imaging
MSE Mean Squared Error
OTF Opacity Transfer Function
PDE Partial Differential Equation
PSNR Peak Signal-to-Noise Ratio
ReLU Rectified Linear Unit
RNN Recurrent Neural Network
ROI Region Of Interest
RQ Research Question
RW Render Window
SITK Simple Insight Tool Kit
SSIM Structural Similarity Index Measure
STIC Spatio-Temporal Image Correlation
TGC Time Gain Compensation
TN True Negative
TP True Positive
UMC University Medical Center
US UltraSound
VTK Visualisation Tool Kit


Chapter 1

Introduction

The development of medical image analysis techniques is ongoing because of the rapid technological advances in medical imaging and the emergence of new clinical applications [1, 2]. Technological progress has resulted in higher image resolution, improved image quality, and more complex imaging techniques. Consequently, image processing analysis has to become more powerful in order to deal with larger datasets, a higher information density of images, and more complex data. Specifically in ultrasound (US), additional imaging modalities have been established, such as the ability to examine anatomy in three-dimensional (3D) imaging, which is the main modality upon which this thesis focuses. Because of these developments, it is now possible to monitor fetal growth and assess fetal development during pregnancy [3, 4]. For instance, de Bakker et al. established a 3D digital atlas and a database of early human development to quantify fetal growth and clarify current ambiguities [5, 6]. Further investigation by the same research group, located at the Amsterdam University Medical Centers (UMC) (formerly Academic Medical Center (AMC)) and Imperial College London, aims at developing a similar database with ultrasound images in order to assist in clinical practice. This ongoing project is known as the ‘3D Ultrasound Atlas of Fetal Development’ [7]. Elaborating on this, medical researcher and PhD student Buijtendijk [8] continues the research of Shah et al. [9] and focuses on presenting fetal echocardiography as an educational resource. The tool should enable sonographers to train and gain experience in ultrasound screening, especially in recognising and detecting anomalies. Therefore, tissues in the ultrasound images need to be attenuated in order to increase the visibility of structures of interest during training. To contribute to the research of Shah and Buijtendijk, this thesis examines to what extent specific tissues, in particular the fetus, can be automatically isolated and visualised for educational and scientific purposes, so that the sonographers’ view is not obscured by surrounding tissues they are not interested in. The availability of an automated computational method would provide an objective and consistent way to isolate the fetus and thus provide an unencumbered view with which sonographers can train to recognise and detect potential anomalies in fetal ultrasonography.

In this research, the use of 3D ultrasound imaging for visualising the fetus is preferred over other tomographic modalities, such as computed tomography (CT) or magnetic resonance imaging (MRI). Fetal MRI is increasingly performed, but often only after an abnormality is suspected on ultrasound [10]. Not only does ultrasound have the advantage of allowing real-time acquisition, the technique is also relatively inexpensive [1, 3, 11, 12]. Although no ionising radiation exposure is associated with ultrasound imaging, unlike X-ray-based imaging such as CT, ultrasound is considered safe but still entails risks. For instance, the ultrasound energy has the potential to produce biological effects in the body, i.e. heating the tissues slightly, or producing small pockets of gas in body fluids or tissues (known as cavitation) [11]. Those risks may increase with unnecessarily prolonged exposure to ultrasound energy, or when untrained users operate the device. The latter produced ‘keepsake’ fetal videos, which subsequently led to the ‘Tom Cruise bill’. This legislation prohibits the sale of diagnostic ultrasound to anyone but appropriately licensed clinicians [13, 14]. The imaging technique is therefore generally considered safe when used prudently by appropriately trained health care providers [11]. Consequently, ultrasound is the preferred imaging technique as long as sonographers are educated properly.

However, the interpretation of ultrasound images is very complicated and usually requires an expert throughout the examination [1]. In order to check the well-being of the fetus, measurements of the morphology and fetal biometry can be made [3, 4, 15, 16]. Although ultrasound imaging is well suited to analysing fetal growth, the medical images often present noise and artifacts such as shadows, patient movements, and signal dropout, resulting in low image quality and resolution [1, 3, 12, 17, 18]. The distinctive artifacts are further explained in Section 2.1. When visualising the volume of a 3D ultrasound image, noise and other limitations of the imaging modality make this a challenging task compared to rendering CT and MRI data [19]. Rendering is the process of generating an image from the data with the desired mixture of contrast, light, and transparency [20]. Illustrations of the aforementioned imaging modalities are shown in Figure 1.1. In particular, CT and MRI use rendering algorithms that rely on an initial classification of the data into different tissue categories [19]. For instance, regions with similar data values, known as Hounsfield units, can be extracted in CT [21]. However, tissue classification methods are lacking in ultrasound imaging, which makes it hard to isolate certain tissues. As a result, volume rendering does not provide the desired effect, and automatically identifying and isolating the fetus in ultrasound images becomes a challenging task.


(A) US. (B) CT. (C) MRI.

Figure 1.1: Non-isolated region images of the fetus obtained with US [21], CT [22], and MRI [23].

Therefore, image segmentation is necessary to obtain the fetal structures in ultrasound. Segmentation is an image processing technique with the aim of delineating a region of interest (ROI) in the image [1, 3]. According to research imaging technician Hagoort [8], in the current ultrasound segmentation procedure the expert either manually segments the fetus, or applies an editor tool and thereafter corrects the segmentation manually before acquiring the fetal structures. Consequently, recent studies show that the quality and the reproducibility of the biometry measurements are user-dependent [4]. Hence, this research investigates (semi-)automatic segmentation approaches, where the distinction is that semi-automatic segmentation requires little user intervention, whereas fully automatic segmentation needs no user intervention at all [24]. The existing body of research on a variety of segmentation methods recognises the importance of analysing and improving ultrasound image segmentation techniques. In Section 3.1, different heuristic (semi-)automatic image segmentation methods are discussed, such as level set or region-based segmentation. Recently, in order to develop automatic segmentation methods, machine learning has been introduced in medical image analysis. Specifically, deep learning methods such as convolutional neural network (CNN) architectures have been used for medical image quality classification [17, 25–27]. Besides that, fully convolutional networks (FCN) have been used [28], as well as deep neural networks such as DeepVNet [29]. In Section 3.2, the deep learning approaches are described in detail. This research investigates the U-net, which uses a FCN [30].

For visualising the resulting volumes of the ultrasound images, several applications such as Crystal Vue, HDlive and Realistic Vue already exist [9, 31]. Although these software packages provide a novel way of visualising fetal anatomy, the use of these rendering techniques in clinical practice is limited due to the lack of an appropriate reference standard for the interpretation of the images obtained using these applications [7]. The 3D ultrasound atlas of fetal development analyses the fetal images with Crystal Vue and Realistic Vue, of which an example is shown on the left of Figure 1.2.


Figure 1.2: 3D Ultrasound Atlas of Fetal Development. On the left, a fetus visualisation in Crystal Vue is shown. On the right, a 3D ultrasound atlas of the fetus is illustrated. Derived from [7].

This research examines (semi-)automatic segmentation methods, particularly heuristic methods and the U-net, to produce visualisations such as the one on the left of Figure 1.2. In the next section, the research scope, requirements and challenges are discussed in further detail, followed by the research questions and the outline of this thesis, which results in a promising perspective for analysing 3D ultrasound images.

1.1 Research scope, requirements and challenges

This research focuses on clinical obstetrical sonography images obtained with a Samsung Medison 3D converter (WS80A Elite system [32]). For this study, the scanner acquires 3D ultrasound images in order to analyse the morphology and anatomy of the fetus, and not blood and myocardial velocities [2]. The ultrasound images are assumed to contain a single fetus, not a twin or multiple fetuses.

After conducting an interview at the Amsterdam UMC (location AMC [33]) with Buijtendijk and Hagoort [8], respectively a medical researcher and PhD student, and a research technician and image processing research analyst, a number of requirements came to light. They want to implement an online educational resource that presents fetal echocardiography in a more approachable and intuitive way. Because manual fetal heart segmentations are hard to obtain, this study examines the whole fetus in obstetric ultrasound. The main reason to include an online educational tool is to improve education and training and thereby enhance the performance of population-based screening. Since there is room for improvement in educational medical training, this research focuses on developing a model that contributes to meaningful visualisations of obstetric ultrasound. The possibility of an application on a laptop is analysed, for which presumably a large amount of storage is needed because of the large datasets. In addition, the developed code is required to be open-source.


The first challenge addressed is to segment the fetus from an ultrasound scan. Although much research has already been done on automatically delineating the region of interest (ROI), results are still imprecise [1]. The second challenge is to obtain visualisations that are similar to those produced by Realistic Vue, HDlive or Crystal Vue. These applications produce favourable images of fetal surfaces, but the software sees limited use in clinical practice due to the lack of an appropriate reference standard [7]. With the help of the segmented data, the aim is to reproduce visualisations similar to those of applications such as Realistic Vue, HDlive or Crystal Vue.

1.2 Research questions

Based on the previously mentioned requirements and challenges, the aim is to design and develop algorithms to objectively segment the fetus from the obtained 3D ultrasound images, and to apply rendering techniques to visualise obstetric ultrasonography for a scientific purpose as well as an educational resource. The algorithmic framework should be fully automatic and the methodology and code should be open-source.

To be able to objectively segment the fetus, the purpose of this research is to develop an algorithm that substitutes manual segmentation while delivering at least the same quality. Throughout this thesis, segmenting the fetus in an objective manner means using algorithms instead of manual segmentation procedures, which ensures consistency for scientific visualisation purposes. The developed image segmentation algorithms will be used in the visualisations of the educational resource in order to improve education and training and thereby enhance the performance of population-based screening. Consequently, to address the gap in understanding segmentation algorithms for 3D ultrasound imaging, this study pursues the following research question:

Research question (RQ1)

To what extent are the segmentation models capable of accurately and automatically segmenting the fetus in a 3D ultrasound image?

In addition, specific sub-questions are formulated to systematically find an answer to this research question. The preprocessing methods used in this research are image filtering techniques, with the aim of supporting the models in isolating and contouring the structure of the fetus as accurately as possible (RQ1.1). Furthermore, the ultrasound images contain several tissue structures which include not only the fetus itself (i.e. the ROI), but for example also the uterus and amniotic sac. Therefore, it is necessary to accurately segment the fetus from the ultrasound images (RQ1.2). All in all, this leads to the sub-questions stated below.


RQ1.1 Which preprocessing method prior to (semi-)automated analysis can best enhance the ultrasound data?

RQ1.2 Which (semi-)automatic segmentation model is capable of accurately segmenting the fetus?

In this research, the term ‘best’ is defined as receiving the most satisfying metric values during evaluation. The same applies to the term ‘accurate’: several evaluation metrics will be used to analyse the outcome. After segmenting the fetus, rendering techniques are investigated in order to visualise the resulting 3D ultrasound image. Therefore, this study pursues another research question:

Research question (RQ2)

To what extent can the fetus segmentation be used as an input data source for volume rendering techniques for visualisation in an educational context?

The aim of the segmentation methods developed by this research is to meet or even surpass the manual segmentations received from the Amsterdam UMC. Besides that, the educational resource needs to be suitable for trainee sonographers, meaning that the tool must produce accurate and unbiased visualisations.

1.3 Outline

This study aims to develop a visualisation technique, for an educational resource as well as a scientific purpose, that presents obstetric ultrasonography with the help of (semi-)automatic fetus segmentation models. This thesis is therefore organised in the following way. In the next chapter (see Chapter 2), the functioning of ultrasound and data acquisition are introduced. In Chapter 3, a more detailed description of image segmentation techniques is given; several applications such as Crystal Vue, Realistic Vue, and HDlive are also discussed. In Chapter 4, the dataset specifications are explained. Moreover, noise reduction filters are introduced to enhance the ultrasound images. Besides that, the proposed heuristic models, namely a semi-automatic watershed segmentation model and a fully automatic segmentation model, are considered, and the U-net and the applied volume visualisation tools are analysed. Additionally, the experiments, evaluation metrics, and the software used are presented. Chapter 5 contains the results of the algorithms, models, and applications with the corresponding experiments and analysis. The key findings, scope and limitations, and future work are discussed in Chapter 6. In Chapter 7, a conclusion is given, in which it is emphasised how this research contributes to improving the current understanding of 3D ultrasound visualisation of the fetus during pregnancy, providing a promising perspective.


Chapter 2

Theoretical background

In order to understand obstetric ultrasonography, the functioning of ultrasound is discussed in Section 2.1. In particular, the physics of the acoustic waves and commonly encountered artifacts are explained. In Section 2.2, the data acquisition for a volume is described. With regard to the generated data, possibilities are presented on how the volume can be visualised.

2.1 Physics of ultrasound

The human ear can hear sound waves from around 20 Hz to about 20,000 Hz. A frequency above this sonic range is called ultrasonic, or ultrasound [2]. To form an ultrasound image, a propagating wave partially reflects at the interface between different tissues [2]. If these reflections are measured as a function of time, information is obtained on the position of the tissue, provided the velocity of the wave in the medium is known. However, besides reflection, other effects can occur due to tissue interaction, movements of the mother and/or fetus, or deficiencies of the application itself, for example signal dropout [18]. When ultrasound propagates through matter, diffraction, refraction, attenuation, dispersion, or scattering of the wave may occur. Due to the effects of such tissue interaction, artifacts arise in the image, such as shadows, mirroring of structures, or reverberation. These ultrasound artifacts are summarised in Table 2.1.
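As a concrete illustration of this pulse-echo principle (a standard relation from ultrasound physics, added here for clarity rather than taken from the text above): the depth d of a reflecting interface follows from the round-trip time t of the echo and the assumed speed of sound v in soft tissue,

d = \frac{v \cdot t}{2}, \qquad v \approx 1540 \ \mathrm{m/s},

so an echo arriving, say, 65 µs after transmission corresponds to a depth of roughly 5 cm.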

Acoustic enhancement is a characteristic artifact of fluid-filled structures such as the gestational sac [34]. The fluid attenuates the sound less than the surrounding tissue, resulting in brighter structures in the deeper areas. Additionally, a shadow in the image occurs most frequently with solid sound-absorbing structures, since sound conducts most rapidly in areas where molecules are closely packed, such as in bone, or in reflective structures, such as in air.


Table 2.1: Summary of ultrasound artifacts.

Acoustic enhancement
Description: The time gain compensation (TGC) overcompensates through the fluid-filled structure, causing deeper tissues to be brighter.
Effect on image: An increased echogenicity (whiteness) posterior to the fluid-filled area.

Acoustic shadowing
Description: A signal void behind structures that strongly reflect or absorb ultrasonic waves.
Effect on image: A darker region that resembles a shadow behind the structure.

Mirror image
Description: The US beam repeatedly bounces back and forth between (highly reflective) tissue boundaries before returning to the receiver.
Effect on image: A reflection of entire organs, i.e. a mirrored copy appearing on the other side of the reflective surface.

Reverberation
Description: The US beam bounces back and forth between two (highly reflective) tissue boundaries before returning to the receiver.
Effect on image: A reflection of tissue interfaces, i.e. additional ‘reverberated’ images in a deeper tissue layer.

Refraction
Description: The difference in propagation speeds between two tissues can change the direction of the wave.
Effect on image: An echo displayed at an incorrect location.

A duplication of the uterus or the gestational sac, e.g. a ghost twin, arises from a mirror image artifact [35]. Reverberation happens in a similar way, except that the artifact results in multiple equally spaced lines along a ray line. Moreover, a refraction artifact can occur when a transmitted ultrasound pulse strikes an interface at a non-perpendicular angle. Since the receiver assumes all echoes have traveled along a direct path, the echo is displayed at an incorrect location in the image. Routine obstetric US screening ensures that specialists are trained to identify and recognise artifacts in the image, and to interpret images correctly when diagnosing fetal anomalies. For instance, ultrasound signals that are sent and received by a probe or transducer can cause a mirroring effect of structures, as illustrated in Figure 2.1. An ultrasound probe or transducer measures the wave frequency as a function of time. When some ultrasonic waves bounce back and forth, arriving later than the original signals, another structure appears in the image. A more detailed description of the functioning of a transducer is given in the following section.


Figure 2.1: Schematic representation of the mirror artifact. Ultrasound signals are normally reflected by the structure (fetal head) and by the reflective surface (posterior uterine wall and bowel wall), arriving on time at the transducer. Some ultrasound signals bounce back and forth between the head and the reflective surface before finally arriving at the transducer. As they arrive later than the original signals, they are represented as another structure behind the reflective surface. Derived from [35].

2.2 Data acquisition and visualisation

Ultrasound data originates from ultrasonic waves which are generated and detected by a piezoelectric crystal [2]. The crystal deforms under the influence of an electric field and, vice versa, causes an electric field over the crystal upon deformation. As a consequence, when an alternating voltage is applied over the crystal, a compression wave with the same frequency is generated. A device converting one form of energy into another, in this case electric to mechanical energy, is called a transducer.

The transducer serves as both a transmitter and detector, and can only generate and receive a limited band of frequencies. This is called the bandwidth of the transducer. In order to acquire the ultrasound volumes of the fetus, this research focuses on transducers for 3D or ‘volume’ imaging. In Figure 2.2, different transducer technologies used for 3D ultrasound acquisition are illustrated.

Volume acquisition: Volume acquisition can be conducted with a mechanical 3D probe, a 2D matrix-array transducer, or by freehand scanning [36, 37]. In this research, a mechanical 3D probe was used to construct the 3D images. These probes and 2D matrix-array transducers play a major role in enabling easier and faster acquisition of volume data [36].


Figure 2.2: Transducer technologies used for 3D US acquisition: (a) mechanical 3D probes, (b) 2D matrix-array transducers, and (c) freehand 3D acquisition using a conventional 1D array with position sensor. Derived from [36].

A lower-cost alternative to these 3D probes is freehand scanning. Furthermore, the resulting US volumes consist of a large number of 2D frames (or slices), and the quality of the acquired volumes depends on the limitations inherent to the physics of ultrasound [20], as discussed in the previous section.

Volume visualisation: Volume visualisation can be realised once an ultrasound volume is acquired. The visualisation procedure can be categorised into either planar views or volume views [36]. Planar views are similar to conventional 2D ultrasound views, in which 2D cross-sections are displayed to the user. Generally, two types of planar views exist: orthogonal planar reconstruction and parallel planar reconstruction. With the former, three orthogonal planes (x, y, and z) are reconstructed and displayed to the user at the same time. This is usually denoted as multiplanar reconstruction (MPR), or multiplanar mode [20, 36]. In the case of parallel planar reconstruction, multiple sequential cross-sections that are parallel to each other are reconstructed and presented to the user simultaneously [36].

On the other hand, volume views integrate information from an entire volume into a single image [36]. Volume views are generated by casting a ray through the volume, which consists of voxels (i.e. volumetric pixels), for each pixel on the screen. The ray accumulates colour and transparency until either the colour is completely opaque, or the ray exits at the other side of the volume. The result is placed in the corresponding pixel on the screen.
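The ray-casting procedure described above can be set up in VTK, the toolkit used later in this thesis, in a few steps. The following is a minimal sketch, not the thesis' actual tool; the file name and the transfer-function breakpoints are illustrative assumptions.

    import vtk

    # Load a 3D volume (MetaImage chosen for illustration; path is a placeholder).
    reader = vtk.vtkMetaImageReader()
    reader.SetFileName("fetus_volume.mhd")

    # Transfer functions map voxel intensities (0-255) to opacity and colour,
    # which each cast ray accumulates as described above.
    otf = vtk.vtkPiecewiseFunction()
    otf.AddPoint(0, 0.0)      # background: fully transparent
    otf.AddPoint(255, 0.8)    # bright tissue: mostly opaque
    ctf = vtk.vtkColorTransferFunction()
    ctf.AddRGBPoint(0, 0.0, 0.0, 0.0)
    ctf.AddRGBPoint(255, 1.0, 0.9, 0.8)

    volume_property = vtk.vtkVolumeProperty()
    volume_property.SetScalarOpacity(otf)
    volume_property.SetColor(ctf)
    volume_property.ShadeOn()   # simple shading for a more plastic appearance

    mapper = vtk.vtkSmartVolumeMapper()   # performs the actual ray casting
    mapper.SetInputConnection(reader.GetOutputPort())

    volume = vtk.vtkVolume()
    volume.SetMapper(mapper)
    volume.SetProperty(volume_property)

    renderer = vtk.vtkRenderer()
    renderer.AddVolume(volume)
    window = vtk.vtkRenderWindow()
    window.AddRenderer(renderer)
    interactor = vtk.vtkRenderWindowInteractor()
    interactor.SetRenderWindow(window)
    window.Render()
    interactor.Start()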


Chapter 3

Related work

The technology of ultrasound imaging has been used in clinical practice for more than half a century [2]. At present, it is common practice to examine the fetus during pregnancy with this medical imaging modality. To facilitate fetal assessment, researchers have attempted to optimise this procedure by developing techniques that automatically take measurements and detect anomalies [38]. Therefore, image segmentation methods are discussed in Section 3.1. Machine learning methods, and specifically deep learning methods, are considered in Section 3.2. In Section 3.3, progressive applications which visualise the fetus are discussed.

3.1 Image segmentation

A large and growing body of literature has investigated the efficacy of image segmentation techniques in obstetric ultrasound for analysing the fetus [3]. To automatically measure the morphology and fetal biometry, models have been developed to analyse characteristics such as the amniotic fluid volume, placental position, cerebellum volume, crown-rump length, biparietal diameter, and the circumference of the head or abdomen [3, 4, 15, 39]. Additionally, an estimate of the gestational age can be made by measuring the femur bone length [16]. A study by Ciurte et al. described a semi-automatic segmentation framework applicable to different ultrasound segmentation problems, namely the liver, eye, prostate, and fetal head [40]. Although Ciurte argues that a semi-automatic segmentation method alleviates the limitation of fully automatic methods, in that it is applicable to any kind of target and imaging settings, others (see [3, 39]) have highlighted the relevance of fully automatic segmentation methods, which would be more effective for consistent and objective analysis, independent of the user.


Considering fully automatic segmentation methods, an examination of morphological operators by Thomas et al. showed that it is possible to process fetal femur images based on shape characteristics [16]. Other heuristic segmentation studies on ultrasound images analysed level set and region-based segmentation models [3, 18], feature-based segmentations [24, 41], a texture-based approach [1], or implemented convex hull searching [42]. Segmentation has also been done based on statistical models, such as local image statistics [18], the maximum likelihood [43], gray level distributions of the structures [44], the local entropy [45], or selecting the threshold by minimising the cross-entropy between the original image and the segmented image [46].

Although a wide variety of heuristic image segmentation techniques exists, the models are often intended for segmentation of specific fetal anatomical structures such as the femur [16] or head [40, 47], instead of delineating the whole fetal body. This research focuses on the latter, with the aim of facilitating fetal volume visualisation. In a survey by Noble and Boukerroui, several interesting studies implemented watershed segmentation on smoothed or prefiltered images [39], either by interactively selecting the ROI, or by automatically detecting seed points. However, these and many other algorithms were applied to 2D ultrasound images, whereas this research focuses on 3D US images. Besides heuristic segmentation models, machine learning techniques and especially deep learning methods can be applied to segment the fetus. This is explained in further detail in the following section.

3.2 Deep learning

Machine learning (ML) is the study of computer algorithms that learn from and make predictions on data. By making use of training data (i.e. the ground truth), machine learning has the capability to improve its own performance through experience. In deep learning (DL), which is a branch of machine learning, multiple hierarchical layers of a neural network are applied [48]. Deep learning methods such as the convolutional neural network (CNN) or fully convolutional network (FCN) have already been widely used in (medical) image analysis and computer vision [48, 49]. In the CNN model, each input image passes through a series of convolutional, pooling, and fully connected layers, in order to eventually classify the image into certain categories (in this research, fetus or background). A convolutional layer slides a convolutional filter over the input, merges the input values and the filter values into a feature map, and passes the outcome to the next layer. The FCN is ‘fully convolutional’, which means that the network does not contain any ‘dense’ layers (as in CNNs); instead it contains 1x1 convolutions that perform the task of fully connected (dense) layers.
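The difference between the two network heads can be made concrete with a short sketch (a minimal illustration in Keras, not taken from this thesis; all layer sizes are arbitrary assumptions):

    from tensorflow.keras import layers, models

    # Shared convolutional feature extractor (all sizes are arbitrary).
    inputs = layers.Input(shape=(128, 128, 1))
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)

    # CNN-style head: flatten + dense layer -> one label for the whole image.
    cnn_head = layers.Dense(2, activation="softmax")(layers.Flatten()(x))

    # FCN-style head: a 1x1 convolution acts as a dense layer applied at every
    # spatial position of the feature map -> a label per position.
    fcn_head = layers.Conv2D(2, 1, activation="softmax")(x)

    classifier = models.Model(inputs, cnn_head)   # image-level prediction
    segmenter = models.Model(inputs, fcn_head)    # position-wise prediction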


For automatic image segmentation in medical image analysis, a large number of studies has confirmed the effectiveness of using deep learning techniques with different deep network structures. Focusing on obstetric ultrasound image segmentation, several studies have examined the use of a CNN to automatically segment the fetal abdominal circumference from 2D US images [25, 50]. Besides that, Dormer et al. provide a CNN to detect structural malformations of the heart, known as congenital heart disease (CHD) [51]. However, this study was conducted on rat hearts, and the researchers argue that more data is necessary to train the network [51]. Elaborating on this, Sundaresan et al. introduce a FCN to segment the fetal heart views from the US frames, allowing the detection of the heart [52]. In another study, Yang et al. implemented a recurrent neural network (RNN) into a customised 3D FCN which simultaneously segmented multiple structures in the US volume, namely the fetus, gestational sac, and placenta [53]. Although this research perspective corresponds to the aim of image segmentation in this thesis, Yang et al. [53] acquired more than 100 volumes which were annotated by 10 experienced radiologists, while this thesis ‘only’ obtained seven ultrasound volumes with annotations. Therefore, training a 3D FCN would be limited with the acquired datasets.

Building upon a fully convolutional network, a U-net can be derived, which is illustrated in Figure 3.1. A U-net extends the FCN by adding skip connections to merge feature maps from different semantic levels [53]. Skip connections are critical for the network to recognise possible boundary details in ultrasound images.

Figure 3.1: U-net architecture (example for 32x32 pixels in the lowest resolution). Each blue box corresponds to a multi-channel feature map. The number of channels is denoted on top of the box. The x-y-size is provided at the lower left edge of the box. White boxes represent copied feature maps. The arrows denote the different operations.


According to Ronneberger et al. (the developers of the U-net for biomedical image segmentation), the architecture of a U-net has the advantage that it works with very few training images and yields more precise segmentations [30]. Various studies have assessed the efficacy of the U-net, either by using phantoms [54], or by using medical images [25, 55]. Furthermore, many other neural networks exist for semantic pixel-wise segmentation, such as DeepLab [56], DenseVNet (based on DeepVNet) [29], or LinkNet [57, 58]. Overall, these studies indicate that deep neural networks can be efficient for segmenting ultrasound images. The research in this thesis investigates a U-net in order to segment the fetus, since the network is claimed to perform well on a limited dataset.
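To make the encoder-decoder structure with skip connections of Figure 3.1 concrete, the following is a minimal, down-scaled U-net sketch in Keras. It is not the network trained in this thesis; the depth, filter counts and input shape are illustrative assumptions.

    from tensorflow.keras import layers, models

    def conv_block(x, filters):
        # Two successive 3x3 convolutions, as in the original U-net.
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    inputs = layers.Input(shape=(128, 128, 1))

    # Contracting path (encoder).
    c1 = conv_block(inputs, 16)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck.
    b = conv_block(p2, 64)

    # Expanding path (decoder): upsample, then concatenate the skip connection
    # so fine boundary detail from the encoder is reused by the decoder.
    u2 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.concatenate([u2, c2]), 32)
    u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.concatenate([u1, c1]), 16)

    # Sigmoid output: a per-pixel probability of belonging to the fetus.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    unet = models.Model(inputs, outputs)
    unet.compile(optimizer="adam", loss="binary_crossentropy")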

3.3 Applications

Once an ultrasound image has been acquired, the data needs to be processed to visualise the volume in a meaningful way. Various applications exist to display and analyse the anatomical image as a plane or volume. For instance, Shah et al. obtained diagnostic 3D images of the first-trimester fetal ventricular system using Crystal Vue and Realistic Vue software [9]. Crystal Vue is a context-preserving postprocessing technique based on image-contrast enhancement for 3D fetal volumes [20, 59]. Realistic Vue is a software application that shows fetal anatomy in high-resolution 3D images [31]. A visualisation of a fetus developed in Realistic Vue is illustrated in Figure 3.2. As noted by Santana and Araújo Júnior, the 3D surface rendering approach made it possible to obtain a clear view of the hands, ears, and position of the feet [31].

Figure 3.2: 3D Realistic Vue rendering mode image of a fetus at 12 weeks of gestation, showing the hands, feet, and ears. Virtual light source position, 10 o’clock. Rotation,


Furthermore, in the corner of the image the position and rotation of the virtual light source is shown. This shading technique is a common approach to improve the rendering in order to create more realistic visualisations, as for example in Crystal Vue [20, 21]. Additionally, HDlive also manages to demonstrate embryonic development [31].

However, these software tools are difficult to learn and time consuming to use in the absence of specific training [20]. Besides that, as far as is known in this research, the precise operation of these visualisation methods is not described in the literature, probably to keep out the competition. Hence, this research develops an alternative approach based on a method of its own design. By delineating the boundaries between soft tissue and anatomical structures, as is done for example in Crystal Vue with image-contrast enhancement [31], the visualisation of the ROI becomes clearer. Therefore, in this research, automatic segmentation techniques support the use of volume rendering techniques for unbiased visualisations of the fetus for an educational resource.


Chapter 4

Methodology

The aim of this study is two-fold: (1) to develop an objective way to segment the fetus from the images, and (2) to analyse the effectiveness of using the resulting ultrasound images to produce an unobstructed visualisation of the fetus. The characteristics of the ultrasound datasets that are used in this study are described in Section 4.1. Several preprocessing algorithms, namely filters for reducing noise in images, are discussed in Section 4.2. In Section 4.3, two (semi-)automatic segmentation models are proposed. Besides analysing these heuristic models, a U-net is investigated in Section 4.4. After segmenting the fetus from the image, three volume visualisation techniques are examined in Section 4.5. In Section 4.6, the experiments and evaluation metrics are discussed in order to analyse the performance of the models. The software tools are described in Section 4.7. The code of this research is publicly available on GitHub at https://github.com/RomyM08/Thesis_fetus. The workflow consists of the following phases (a minimal code sketch follows the list):

1. (Optional) preprocessing with denoising filters
2. Segmentation; one of the following:
   a. Heuristic segmentation models, or
   b. Deep learning segmentation approach: U-net
3. Volume visualisation
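To make the workflow concrete, the sketch below strings the three phases together for the heuristic route (2a) using SimpleITK, the library used throughout this chapter. It is a minimal illustration, not the actual pipeline of this thesis; the file names and all parameter values are placeholder assumptions.

    import SimpleITK as sitk

    # Phase 1 (optional): denoising; the anisotropic diffusion filter is shown.
    image = sitk.ReadImage("dataset1_cropped.dcm")
    smooth = sitk.GradientAnisotropicDiffusion(
        sitk.Cast(image, sitk.sitkFloat32),
        timeStep=0.06, conductanceParameter=2.0, numberOfIterations=5)

    # Phase 2a: heuristic segmentation; the watershed runs on the gradient
    # magnitude image (see Section 4.3).
    labels = sitk.MorphologicalWatershed(
        sitk.GradientMagnitude(smooth), level=4.0, markWatershedLine=False)

    # Phase 3: store the label map as input for the volume visualisation tools.
    sitk.WriteImage(labels, "dataset1_labels.mha")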

The dataset specifications are discussed in the following section, after which the phases are explained in sequence.



4.1 Dataset specifications

The ultrasound datasets that are used in this study originate from Imperial College London [60], and are prepared at the Amsterdam UMC (location AMC [33]) to make them applicable to this research. This process is explained with visualisations of a dataset in Section 4.1.1. The other datasets are discussed in Section 4.1.2. A description of the metadata and the structure of the data is given in Appendix A.

4.1.1 Data acquisition and preparation

The data acquisition was realised with a Samsung Medison 3D converter (WS80A Elite system [32]). This scanner operated with specific settings, including a curved linear ultrasound transducer geometry, mechanical beam steering, and an external transducer. The transducer measured the tissue intensities of first-trimester fetuses. As a result, a 3D image of the fetus was obtained and stored in a proprietary MVL file format. The files were converted into the DICOM format by image processing research analyst Hagoort [8] from the Amsterdam UMC. DICOM stands for ‘Digital Imaging and Communications in Medicine’ and is an international standard to store medical imaging information [61]. As illustrated in Figure 4.1, the specialist also manually cropped the conical shaped image to a rectangular shaped image around the region of interest (ROI) in Amira 2019.2 [62]. In this way, the fetus can be analysed in a smaller region, which is computationally more efficient than processing the original image. The illustration was made in ParaView, an open-source visualisation application.

Figure 4.1: Volume visualisation in Paraview of the conversion of the conical shaped 3D image (see Fig. 4.2) to a rectangular shaped 3D image (see Fig. 4.3) which is manually cropped around the fetus (i.e. the ROI). The original ultrasound data was


Figure 4.2: Image slice 300/600 in coronal view of dataset1 in conical shape. In the middle, the fetus is shown, which is the region of interest (ROI). The dark region around the ROI is the uterus, which is filled with fluid. The 3D image has a size of 600 × 600 × 600, unsigned 8-bit integers.

In Figure 4.2, the middle slice of the original volume in conical shape is shown. The same coronal view in a rectangular shaped volume, with the fetus as the region of interest (ROI), is illustrated in Figure 4.3. Specifically, the cropped, original image is displayed in Figure 4.3A. From this volume, a student manually annotated the voxels belonging to the fetus under expert supervision. The annotation was generated in Amira 2019.2 [62] using the ‘segmentation editor’, in which the student manually segmented each slice. The annotation was refined and checked by the specialists Hagoort and Buijtendijk [8] from the Amsterdam UMC. The annotated images serve as the ground truth, in which the voxels in the image masks are either white, representing the fetus (labelled as True or 1), or black (labelled as False or 0). An example of a generated mask is illustrated in Figure 4.3B. Ground truth data is used to evaluate the performance of the filters, segmentation models, and visualisations. The result of applying the binary mask over the original image is shown in Figure 4.3C.

(A) Original image. (B) Annotated mask of fetus. (C) Result inverted mask.

Figure 4.3: Image slice 76/151 of size 141 × 115 in coronal view of dataset1 in rectangular shape. The ROI shows (A) the original image, (B) the annotated mask of the fetus as the ground truth, and (C) the resulting image when applying the fetus mask over the original image.


Another preparation step was considered besides cropping the real, original conically shaped datasets into rectangular shapes around the fetus. It was examined whether the cropped datasets should be normalised in order to make sure each 3D image covers the same voxel intensity range. As shown in Appendix Figure 1, the voxel intensities in all images already lie between 0 and 255, so no image normalisation of the scalar range is needed.
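These preparation checks are straightforward to reproduce in SimpleITK. The sketch below (file names are illustrative placeholders) loads a cropped volume with its ground-truth mask, inspects the scalar range, and applies the mask as in Figure 4.3C.

    import SimpleITK as sitk

    # File names are illustrative placeholders.
    image = sitk.ReadImage("dataset1_cropped.dcm")   # cropped ROI volume
    mask = sitk.ReadImage("dataset1_mask.dcm")       # binary ground-truth annotation

    # Check the scalar range to confirm that no normalisation is needed.
    voxels = sitk.GetArrayFromImage(image)
    print(voxels.min(), voxels.max())                # expected to lie within [0, 255]

    # Apply the binary mask: voxels where the mask is 0 are set to 0,
    # reproducing the masked view of Figure 4.3C.
    fetus_only = sitk.Mask(image, mask)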

4.1.2 The other datasets

Besides the real, original dataset1 shown in Figure 4.2, six other datasets were provided, which are illustrated in Figure 4.4. In comparison with dataset1, three other datasets, namely dataset2, dataset3, and dataset4, contain different fetuses of the same gestational age (i.e. first-trimester), but with a comparable position and magnification. According to Buijtendijk [8], the wider the conical shape, the more the size of the pixels differs. Moreover, dataset5, dataset6, and dataset7 include the same fetus as dataset1, but in a different position. Shah [8] argues that the magnifications are the same, but the ROI box may be smaller or bigger when the volume is acquired. Hence, the volume may appear to be more ‘zoomed in’, but in fact is not.

(A) Dataset 2. (B) Dataset 3. (C) Dataset 4.

(D) Dataset 5. (E) Dataset 6. (F) Dataset 7.

Figure 4.4: Real, original datasets in conical shape. Of each 3D image, slice 150/300 of size 300 × 300 is shown. All the datasets are sliced in the same way as in Figure 4.2.


4.2 Preprocessing: noise reduction filters

A number of image processing techniques have been developed to reduce noise and image artifacts to facilitate diagnosing medical images. In order to diminish the effects of noise and artifacts on the quality of the image as discussed in Section 2.1, noise reduction filters can be applied. One type of noise is speckle noise which causes a granular pattern in the image, and has an adverse effect on the contrast and resolution [63–65]. Therefore, speckle noise should be eliminated before applying other image processing techniques such as segmentation. Overall, the purpose of the denoising techniques is to decrease the noise in the (ultrasound) images while minimizing the loss of original features [66]. More specifically, a filter to reduce speckle noise has three main goals [65]. Firstly, noise in uniform regions needs to be eliminated. Secondly, the purpose is to preserve and enhance edges and image features. Thirdly, the filter should provide a better visual appearance. However, a balance needs to be made among these requirements, because it is impossible to completely reduce the speckle noise while still maintaining a good visual appearance [65]. In Section 4.6.1, an experiment and the evaluation metrics for the effects of the filters are explained. This study investigates denoising filters that have already been applied by other researchers on ultrasound images, namely the Gaussian filter [67, 68], median filter [64, 67], anisotropic diffusion filter [63, 64], and curvature flow filter [69]. These filters are sequentially discussed in Section 4.2.1, Section 4.2.2, Section 4.2.3, and Section 4.2.4, respectively.

4.2.1 Gaussian filter

The Gaussian filter smooths the images by convolving them with a Gaussian kernel [67, 68]. Convolution is a neighbourhood operator that is linear shift-invariant (LSI), implying that the filter behaves the same way over the entire image. The filter is applied by sliding a kernel over the image. By multiplying the kernel value and the underlying pixel value for each of the pixels in the kernel, and then adding all these numbers together, the output image is computed [68]. The size of the input and output image remains the same. However, the edges in the image are not preserved, since the filter does not consider the difference in pixel intensities [67]. The Gaussian function in one dimension x is defined as:

G(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma^2}},    (4.1)

where σ is the standard deviation of the Gaussian distribution. The amount of smoothing is related to the standard deviation, i.e. the larger σ, the more the image is smoothed [67]. The 3D Gaussian distribution has the form N · exp(−(x² + y² + z²)/(2σ²)), where the normalisation constant N depends on the number of variables n, i.e. N = 1 / (σ^n (2π)^(n/2)). Hence, for a Gaussian filter applied to a 3D image (n = 3), the normalisation constant becomes N = 1 / (σ³ (2π)^(3/2)). This ensures that the kernel of the 3D Gaussian filter has the form:

G(x, y, z) = \frac{1}{\sigma^3 (2\pi)^{3/2}} \, e^{-\frac{x^2 + y^2 + z^2}{2\sigma^2}},    (4.2)

where x, y and z are the Cartesian coordinates, and σ is the standard deviation, for which it is assumed that σ_x = σ_y = σ_z ≡ σ.

4.2.2 Median filter

The median filter is a non-linear spatial filter that smooths the image by replacing the value of every pixel with the median value of its neighbours [64, 67]. A median filter can be used to reduce speckle noise due to its robustness and edge-preserving characteristics [64]. The median filter for a 3D image can be expressed as:

\hat{f}(x, y, z) = \operatorname{median}_{(k, l, m) \in S_{x,y,z}} \{ g(k, l, m) \},    (4.3)

where \hat{f}(x, y, z) represents the filtered pixel at location (x, y, z) and g(k, l, m) represents a noisy pixel in the neighbourhood area S_{x,y,z}. The neighbourhood of surrounding pixels is defined by the radius r [69].
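A corresponding SimpleITK call is sketched below (a minimal illustration; the file name and radius are placeholder assumptions).

    import SimpleITK as sitk

    image = sitk.ReadImage("dataset1_cropped.dcm")   # illustrative path

    # Median filter with radius r = 1 per dimension: every voxel is replaced by
    # the median of its 3x3x3 neighbourhood S_xyz, as in Eq. 4.3.
    denoised = sitk.Median(image, [1, 1, 1])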

4.2.3 Anisotropic diffusion filter

The anisotropic diffusion filter reduces noise in images without blurring the edges. The filter iteratively diffuses images by solving non-linear partial differential equations (PDE) based on the classical heat equation [64]. The more iterations performed, the more diffused the image will become. The partial differential equation with an initial condition which the anisotropic diffusion filter uses to modify the noisy image can be expressed as:

\begin{cases} \frac{\partial I}{\partial t} = \operatorname{div}\big( c(\lVert \nabla I \rVert) \cdot \nabla I \big) \\ I(x, y, z, 0) = g(x, y, z) \end{cases} \qquad (4.4)

where \operatorname{div} represents the divergence operator, c(\lVert \nabla I \rVert) the diffusion coefficient, \nabla the gradient operator on image I, t the artificial time parameter, and g(x, y, z) the noisy image [63, 64, 70]. The solution of the above equation produces I(x, y, z, t), which is the filtered image at time t. The time step is stable when t \leq 0.5/2^N, where N is the dimensionality of the image. Thus, valid time steps for a 2D image are below 0.125. For a 3D image, time steps below 0.0625 are stable.

Moreover, Perona and Malik [71] developed the Perona-Malik model that solves the problem of edge smoothing by introducing the anisotropic diffusion coefficient c(\lVert \nabla I \rVert), which contains the diffusion or flow constant \kappa [64, 70]. The behaviour of the filter (i.e. the amount of diffusion) is influenced by \kappa, the edge magnitude or so-called smoothing parameter, which controls the conduction as a function of the gradient [64, 70]. If \kappa is low, small intensity gradients are able to block the diffusion across step edges. If \kappa is large, the diffusion can overcome small intensity gradient barriers, reducing the influence of intensity gradients on diffusion [64]. All in all, the filter encourages diffusion (hence smoothing) within regions with similar pixel values and prohibits it across strong edges. Hence, the edges can be preserved while removing noise from the image.
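A minimal sketch with SimpleITK, assuming illustrative parameter values; the time step is chosen below the 3D stability limit of 0.0625 derived above:

    import SimpleITK as sitk

    # The filter requires a floating-point image, hence the cast.
    image = sitk.Cast(sitk.ReadImage("dataset1.mhd"), sitk.sitkFloat32)
    diffused = sitk.GradientAnisotropicDiffusion(
        image,
        timeStep=0.05,             # below the 3D stability limit t <= 0.5 / 2^3
        conductanceParameter=3.0,  # the edge magnitude parameter kappa
        numberOfIterations=5,      # more iterations yield a more diffused image
    )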

4.2.4 Curvature flow filter

The curvature flow filter denoises an image using curvature driven flow. Curvature is a measure of how fast a contour is changing direction [69]. The image is smoothed by an amount proportional to the local curvature of the image intensity (i.e. the brightness). The iterative equation of image I is:

I_{i+1}(x, y, z) = I_i(x, y, z) + \Delta t \cdot C_i(I_i(x, y, z)), \qquad (4.5)

where the coordinates (x, y, z) are the pixel indices along the dimensions, i indicates the number of iterations, \Delta t is the time step, and C_i is the curvature force along the surface normal [69]. In a similar way as the anisotropic diffusion filter, the filter uses a level set formulation in which the iso-intensity contours in an image are viewed as level sets, where pixels of a particular intensity form one level set. The level set function is then evolved under the control of a diffusion equation where the speed is proportional to the curvature of the contour:

\frac{\partial I}{\partial t} = \kappa\, |\nabla I|, \qquad (4.6)

where \kappa is the curvature of the iso-contours in the image, which regulates the diffusion, and \nabla I is the gradient of the image [69]. The advantage of this approach is that sharp boundaries are preserved, with smoothing occurring only within a region.
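A minimal sketch with SimpleITK, again with illustrative parameter values:

    import SimpleITK as sitk

    # Curvature flow smoothing; the time step and the number of iterations
    # control how much the iso-intensity contours are smoothed.
    image = sitk.Cast(sitk.ReadImage("dataset1.mhd"), sitk.sitkFloat32)
    smoothed = sitk.CurvatureFlow(image, timeStep=0.05, numberOfIterations=5)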


4.3 Heuristic segmentation models

Many researchers have utilised heuristic segmentation techniques to delineate regions from ultrasound images in order to analyse fetal characteristics. In this thesis, the watershed segmentation algorithm is implemented in a semi-automatic segmentation model and a fully automatic segmentation model for 3D ultrasound images. In the former, the user manually selects pixel points in the image, which are defined as the (initial) seed points or so-called user-defined markers, for indicating the connected component and labelling the segmentation regions. In the latter, the model needs no user intervention; instead, algorithms are used to define the specific regions. One advantage of the watershed algorithm over the other methods mentioned in Section 3.1 is that this region-based segmentation method is capable of segmenting specific (fetal) anatomical structures. Therefore, the algorithm is widely used in medical image processing [39].

The watershed algorithm separates different objects in a greyscale image based on the local topography. The brightness of each pixel represents the height in the topographic map, where a high intensity denotes hills and a low intensity denotes valleys. Each isolated valley (local minimum) is filled with differently coloured water. As the water rises, water from different valleys starts to merge. Based on the specified water level of the MorphologicalWatershed, the pixels in the image are labelled, as illustrated in the sketch below. This results in image segmentation.
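A minimal sketch, under the assumption that SimpleITK's MorphologicalWatershed filter is meant (the level values are illustrative):

    import SimpleITK as sitk

    image = sitk.Cast(sitk.ReadImage("dataset1.mhd"), sitk.sitkFloat32)
    # A higher water level merges shallow valleys before labelling,
    # producing fewer (coarser) regions; a lower level keeps more regions.
    coarse = sitk.MorphologicalWatershed(image, level=10.0, markWatershedLine=False)
    fine = sitk.MorphologicalWatershed(image, level=1.0, markWatershedLine=False)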

Figure 4.5: Flowchart of the watershed segmentation models. The start of the flowchart is presented in blue with the input image, the processes are shown in green rounded rectangles, and the resulting mask is illustrated in a red parallelogram. The data flow is illustrated with the solid line. The semi-automatic watershed segmentation model works through the use of seed points specified by the user. The fully automatic watershed segmentation model performs these operations with algorithms. Therefore, the overall flowchart of both models is the same, except the way in which the connected component is defined (see Fig. 4.6E, and Fig. 4.7E), and how the resulting mask is composed (see Fig. 4.6K, and Fig. 4.7K).


In Figure 4.5, the flowchart for both the semi-automatic and the fully automatic watershed segmentation model is shown. Overall, both models perform the same steps throughout the process, except for the parts in which either seed points are specified by the user (i.e. the semi-automatic model), or an algorithm is applied (i.e. the fully automatic model). Each model takes as input either the 3D volume of the original ultrasound image or the smoothed image. As discussed in Section 4.2, the smoothed image is created with the filter that best enhances the ultrasound image by reducing the noise and potential artifacts. The evaluation methods for the models are described in Section 4.6.2, and the robustness of the models becomes clear in Section 5.2. In Section 4.3.1, the semi-automatic watershed segmentation model is explained. Subsequently, the fully automatic model is elaborated in Section 4.3.2.

4.3.1 Semi-automatic watershed segmentation

The developed semi-automatic watershed segmentation model aims to isolate the fetus from the image, similar to the ground truth of the expert. In this method, the user specifies the seed points for each dataset beforehand. By sliding through the volume in 2D slices, the user can define the seed points and manually enter these in the code. The overall procedure of this model is described in Figure 4.5, and the intermediate results of dataset1 are shown in Figure 4.6.

The first step in this process is to compute the gradient magnitude (Fig. 4.6B) of the input image (Fig. 4.6A) by convolution with the first derivative of a Gaussian. After that, the MorphologicalWatershed is applied (Fig. 4.6C) to the gradient magnitude of the input image, since these values indicate characteristic regions. The component seed point is specified by the user in a dark region of the embryonic fluid in the input image (Fig. 4.6D). All the pixels that are present in this region of the watershed segmentation are not part of the connected component (Fig. 4.6E), which is shown in green over the input image. A binary image (Fig. 4.6F) is created from this segmented region, in which possible holes in the foreground (i.e. the white region) have been filled. Subsequently, a distance map is computed (Fig. 4.6G). The signed Maurer distance map calculates the Euclidean distance transform of the binary image in linear time; the inside is considered as having negative distances, and the outside as having positive distances. Furthermore, another watershed segmentation (Fig. 4.6H) is applied to show the specific regions of the distance map. Based on both the connected component and the watershed segmentation (Fig. 4.6E and 4.6H, respectively), a labelled mask is created in which the bright pixel values of the connected component are classified into regions specified by the watershed segmentation. These labelled regions are visualised in different colours over the input image (Fig. 4.6I). Then, the user manually defines the label seed points, which


Figure 4.6: Intermediate results of the semi-automatic watershed segmentation. Slice 76/151 of dataset1 with an image size of 141 × 115 is shown. In order of the methodology: (A) the input image which is the original ultrasound image or the smoothed image, (B) the gradient magnitude, (C) the morphological watershed, (D) the seed point in the input image, (E) the connected component, (F) the binary image, (G) the distance map, (H) the morphological watershed, (I) the labelled mask, (J) the defined regions, and (K) the resulting mask are shown. Seed points are illustrated as red dots in the images.

select the regions that are part of the fetus. The defined seed points and corresponding regions are shown in different colours over the input image (Fig. 4.6J). The resulting mask is created by merging all the defined regions into a binary image (Fig. 4.6K). In this way, over-segmentation is resolved and a fetal mask is created. A sketch of these steps is given below.
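A minimal sketch of the main steps with SimpleITK, assuming that library is used; the seed coordinates, water levels and fetus labels below are illustrative placeholders for the user-defined values, not the thesis's actual parameters:

    import SimpleITK as sitk

    image = sitk.Cast(sitk.ReadImage("dataset1.mhd"), sitk.sitkFloat32)

    # Gradient magnitude via convolution with the derivative of a Gaussian
    # (Fig. 4.6B), followed by the first watershed segmentation (Fig. 4.6C).
    grad = sitk.GradientMagnitudeRecursiveGaussian(image, sigma=1.0)
    ws = sitk.MorphologicalWatershed(grad, level=4.0, markWatershedLine=False)

    # User-defined seed point in the dark embryonic fluid (Fig. 4.6D); everything
    # outside that watershed region forms the connected component (Fig. 4.6E).
    seed = (70, 60, 76)
    component = ws != ws[seed]

    binary = sitk.BinaryFillhole(component)                       # Fig. 4.6F
    dist = sitk.SignedMaurerDistanceMap(binary, insideIsPositive=False,
                                        squaredDistance=False)    # Fig. 4.6G
    ws2 = sitk.MorphologicalWatershed(dist, level=1.0, markWatershedLine=False)
    labelled = sitk.Mask(ws2, binary)                             # Fig. 4.6I

    # User-defined label seed points select the fetal regions (Fig. 4.6J);
    # merging them yields the binary fetus mask (Fig. 4.6K).
    fetus_labels = [3, 5]
    mask = labelled == fetus_labels[0]
    for label in fetus_labels[1:]:
        mask = mask | (labelled == label)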

4.3.2 Fully automatic watershed segmentation

The difference from the previous segmentation model is that no seed points are specified by the user. Instead, the model makes use of an algorithm that automatically determines the seed points to be used. For example, Xian et al. also developed their own seed point generator for a fully automatic segmentation [24]. Besides that, the model automatically defines the regions to be segmented. An overview of the procedure of the fully automatic watershed segmentation model developed in this research is shown in Figure 4.5.


The corresponding intermediate results of dataset1 are shown in Figure 4.7. The beginning of this model is the same as that of the semi-automatic segmentation model. First, the gradient magnitude (Fig. 4.7B) of the input image (Fig. 4.7A) is computed. Additionally, the morphological watershed segmentation is applied to the gradient magnitude image (Fig. 4.7C). A seed point generator is applied to the input image (Fig. 4.7D) in order to obtain the connected component (Fig. 4.7E). This procedure is shown in Algorithm 1. To begin with, a random seed point is picked in the middle 2D slice in the z-direction of the input image (i.e. in coronal view in Figure 4.7). When the value of this pixel and its corresponding 8-connected pixel neighbours satisfies the specified threshold, a valid seed point has been found. Two assumptions are made: the fetus lies in the middle of the image, and the fetus is always surrounded by a significant amount of embryonic fluid. This dark region in the image has a low pixel intensity. Therefore, the threshold has been set to 10, which indicates that the mean pixel intensity


Figure 4.7: Intermediate results of the fully automatic watershed segmentation. Slice 76/151 of dataset1 with an image size of 141 × 115 is shown. In order of the methodology: (A) the input image which is the original ultrasound image or the smoothed image, (B) the gradient magnitude, (C) the morphological watershed, (D) the seed point in the input image, (E) the connected component, (F) the binary image, (G) the distance map, (H) the morphological watershed, (I) the labelled mask, (J) the defined regions, and (K) the resulting mask are shown. Seed points are illustrated as red dots in the images.


Algorithm 1: Seed point generator

    Result: seed point for generating the connected component
    while seedfound == False do
        compute a possible seed point;
        calculate the mean pixel intensity of the seed point and its neighbours;
        if value < threshold then
            seedfound = True;
        end
    end

of the specified pixel and its neighbours should be lower than this value. Thus, the resulting seed point is most likely black, which is necessary to define the connected component. This procedure is loosely derived from the hit-or-miss Monte Carlo method, which counts the number of ‘hits’ that fall in the region to be evaluated. In this research, it is only considered whether the ‘dart’, i.e. the random seed point, is a ‘hit’ or a ‘miss’. If the seed point is a ‘hit’, the process of the model continues with this seed point. Otherwise, the algorithm keeps searching for a seed point that satisfies the conditions. Additionally, by implementing a counter in random.seed(count * 3), the exact same simulation can be repeated, as the same seed points will be found each time the simulation is started. A sketch of this generator is given after the next paragraph.

Furthermore, the binary image (Fig. 4.7F), distance map (Fig. 4.7G), watershed segmentation (Fig. 4.7H), and the labelled mask (Fig. 4.7I) are applied in the same way as in the semi-automatic watershed segmentation model. In order to resolve the over-segmentation, specific regions need to be merged so that only the fetus remains in the mask. Therefore, the assumption has been made in the fully automatic watershed segmentation model that the fetus does not touch the borders of the volume. Thus, when a pixel of a labelled region lies at the border of the image, this label is removed from the list of labels which generates the fetus mask. As a consequence, the remaining labels indicate the pixels which are used in the fetus mask (Fig. 4.7J). The resulting mask is then composed of these regions (Fig. 4.7K).
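A minimal Python sketch of the seed point generator of Algorithm 1, assuming the volume is available as a NumPy array; the helper name and the 8-neighbourhood layout are illustrative rather than the exact thesis code:

    import random
    import numpy as np

    def generate_seed(volume: np.ndarray, threshold: float = 10.0, count: int = 0):
        """Pick a random pixel in the middle z-slice whose mean intensity,
        together with its 8-connected neighbours, is below the threshold
        (i.e. a dark embryonic-fluid region)."""
        random.seed(count * 3)              # makes the seed search repeatable
        z = volume.shape[0] // 2            # middle 2D slice in the z-direction
        mid = volume[z]
        while True:                         # keep 'throwing darts' until a hit
            y = random.randrange(1, mid.shape[0] - 1)
            x = random.randrange(1, mid.shape[1] - 1)
            window = mid[y - 1:y + 2, x - 1:x + 2]  # pixel plus its 8 neighbours
            if window.mean() < threshold:           # a 'hit': dark enough
                return (x, y, z)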

4.4 Deep learning segmentation model: U-net

Several deep learning techniques currently exist to analyse medical ultrasound images [49]. The research in this thesis examines the U-net to segment the fetus from ultrasound images. Originally, Ronneberger et al. developed the U-net for biomedical image segmentation [30]. A major advantage of the U-net is that the network operates with a small training set of images, and is therefore well suited to the obtained ultrasound images. In the current study, four distinctive implementations of the network are investigated, in which two features differ: the input images are either the original ultrasound images or the smoothed images, and the activation functions of the network vary.
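As a general illustration of the architecture, a minimal 2D U-net sketch in Keras follows; the framework, depth, filter counts and image size are assumptions for illustration and do not reproduce the exact network of this thesis:

    from tensorflow.keras import layers, Model

    def conv_block(x, filters):
        # Two 3x3 convolutions with ReLu activations, as in the original U-net.
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        return x

    def build_unet(input_shape=(128, 128, 1)):
        inputs = layers.Input(input_shape)
        # Contracting path: convolutions followed by downsampling.
        c1 = conv_block(inputs, 16)
        p1 = layers.MaxPooling2D()(c1)
        c2 = conv_block(p1, 32)
        p2 = layers.MaxPooling2D()(c2)
        # Bottleneck.
        b = conv_block(p2, 64)
        # Expanding path: upsampling plus skip connections from the contracting path.
        u2 = layers.Conv2DTranspose(32, 2, strides=2, padding="same")(b)
        c3 = conv_block(layers.concatenate([u2, c2]), 32)
        u1 = layers.Conv2DTranspose(16, 2, strides=2, padding="same")(c3)
        c4 = conv_block(layers.concatenate([u1, c1]), 16)
        # One output channel with a sigmoid yields a per-pixel fetus probability.
        outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
        return Model(inputs, outputs)

    model = build_unet()
    model.compile(optimizer="adam", loss="binary_crossentropy")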
