Common carotid segmentation in 18F‐sodium fluoride PET/CT scans: Head‐to‐head comparison of artificial intelligence‐based and manual method

Abstract Background Carotid atherosclerosis is a major cause of stroke, traditionally diagnosed late. Positron emission tomography/computed tomography (PET/CT) with 18F‐sodium fluoride (NaF) detects arterial wall micro‐calcification long before macro‐calcification becomes detectable by ultrasound, CT or magnetic resonance imaging. However, manual PET/CT processing is time‐consuming and requires experience. We compared a convolutional neural network (CNN) approach with manual segmentation of the common carotids. Methods CNN‐based and manual segmentations in NaF‐PET/CT scans of 29 healthy volunteers and 20 angina pectoris patients were compared for segmented volume (Vol) and mean, maximal, and total standardized uptake values (SUVmean, SUVmax, and SUVtotal). SUVmean was the average SUV of all voxels within the volume of interest (VOI), SUVmax the highest SUV of all voxels in the VOI, and SUVtotal the SUVmean multiplied by the Vol of the VOI. Intra‐ and interobserver variability with manual segmentation was examined in 25 randomly selected scans. Results Bias for Vol, SUVmean, SUVmax, and SUVtotal was 1.33 ± 2.06, −0.01 ± 0.05, 0.09 ± 0.48, and 1.18 ± 1.99 in the left and 1.89 ± 1.5, −0.07 ± 0.12, 0.05 ± 0.47, and 1.61 ± 1.47, respectively, in the right common carotid artery. Manual segmentation typically lasted 20 min versus 1 min with the CNN‐based approach. Mean Vol deviation at repeat manual segmentation was 14% and 27% in the left and right common carotids, respectively. Conclusions CNN‐based segmentation was much faster and provided SUVmean values virtually identical to manually obtained ones, suggesting CNN‐based analysis as a promising substitute for slow and cumbersome manual processing.


| INTRODUCTION
Atherosclerosis is the origin of major cardiovascular and cerebrovascular diseases (Lorenz et al., 2006; World Health Organization, 2019), which are the number one cause of mortality worldwide despite advancements in diagnostic and therapeutic measures (Barquera et al., 2015; World Health Organization, 2019). Atherosclerotic changes develop in childhood, but rarely cause symptoms until adulthood, in men from age 40−45, in women with a 10-year delay (Enos et al., 1955; Holman et al., 1958; McGill, 1968). Carotid artery disease is a major cause of stroke, accounting for about 20% of all cases. Carotid artery disease can cause a stroke or transient ischemic attack (TIA) in three major ways: (a) a plaque narrows and ultimately blocks a carotid artery completely (total occlusion); (b) plaque rupture damages the lining of the artery with clot formation and finally thrombosis; (c) an embolus breaks off from the plaque and passes with the blood to the brain, where it blocks a brain blood vessel. All three cause an interruption in the blood flow to the brain and can result in symptoms of stroke or TIA (Chambless et al., 2000; Derlin et al., 2011; Wu et al., 2017). Carotid artery disease may be present without symptoms and is usually diagnosed in connection with a stroke or TIA, the identical symptoms of which include weakness in the face or arms and speech difficulties. In patients with carotid artery disease, atherosclerosis may also develop in other arteries throughout the body (Lorenz et al., 2006). After the emergence of symptoms, atherosclerotic changes may be detected in plaque form with or without calcification, primarily using structural imaging modalities, such as conventional X-rays, ultrasound, CT and magnetic resonance imaging (Høilund-Carlsen et al., 2021).
These diagnostic modalities are seldom used routinely in individuals with asymptomatic atherosclerosis and often fail to detect atherosclerotic plaque unless the tissue change is relatively macroscopic (Prabhakaran et al., 2007).
Positron emission tomography (PET) allows detection of atherosclerosis by tracking microscopic tissue changes before conventional imaging modalities can detect them (McKenney-Drake et al., 2018; Raynor et al., 2016). For example, 18F-sodium fluoride (NaF) maps microcalcification, a crucial feature of atherosclerosis (Derlin et al., 2011; Høilund-Carlsen et al., 2020; Sorci et al., 2020). However, PET imaging also has limitations: analysing PET scans is relatively time-consuming, depending on the target organ. Artificial intelligence (AI) in the form of image analysis models may overcome this limitation, as observed in other diseases including cancer (Lindgren Belal et al., 2019; Dou et al., 2017). More specifically, large computational models called convolutional neural networks (CNNs) are time-efficient and successful approaches for automated volumetric CT scan segmentation (Mortensen et al., 2019; Polymeri et al., 2020).
In this study, we aimed to design and test an AI-based model to segment the common carotid arteries and to examine whether it could segment faster than the manual approach while providing comparable data for tracer uptake. If so, the AI-based approach could, as support or replacement, help increase the routine clinical use of NaF-PET for assessment of carotid atherosclerotic burden.

| Study design
CNNs were trained earlier to segment the carotids automatically. A single image analyser performed manual segmentation of the heart and aorta in 49 participants, primarily included as a part of the 'Cardiovascular Molecular Calcification Assessed by 18F-NaF PET/CT' (CAMONA) study (Blomberg et al., 2014, 2015, 2017). The carotids were segmented so as to encompass the artery wall and the inner blood pool. The accuracy of the automated segmentations was assessed by comparison with measurements obtained by manual segmentation in the same 49 subjects. We examined intra- and interoperator variability with the manual approach by repeated manual segmentation in 25 randomly selected scans performed by the same operator and by two independent operators, respectively. The random subjects were selected using RandList software. The CNN-based segmentation procedure has an inherent 100% repeatability (Trägårdh et al.).

| Study population
The CAMONA study, conducted in 2012-2014, included 89 healthy individuals with low cardiovascular disease risk recruited from the blood bank of Odense University Hospital or via local advertisement (Blomberg et al., 2017). Individuals were considered healthy if they had no history of malignant diseases, immunodeficiency syndromes, autoimmune diseases, alcohol or substance abuse or cardiovascular diseases. They were preselected by age and gender to guarantee a balanced inclusion of both genders in the age groups 20-29, 30-39, 40-49, 50-59, and 60 years or older. In addition, 50 patients with suspected angina pectoris referred to the Department of Cardiology at Odense University Hospital for coronary angiography were included as the angina pectoris group. All original 89 + 50 subjects were invited to have a 2-year follow-up NaF-PET/computed tomography (PET/CT) scan. However, despite direct inquiries, only 29 healthy controls and 20 patients responded. Their basic NaF-PET/CT scans constitute the material for the current assessment of the performance of CNN-based segmentation.

| Image analysis
For quantitative manual and automated analysis of the carotids, the Research Consortium for Medical Image Analysis (RECOMIA, https://www.recomia.org/) platform was used. The carotids were segmented from the origin (brachiocephalic artery for the right and arch of the aorta for the left carotid artery) to the end of the bifurcation. VOIs were formed by stacking manually defined regions of interest (ROIs) covering the whole carotid arteries in the CT images of each participant.
The manually defined ROIs contained the carotid arteries (artery wall and inner blood pool) and excluded the vertebral bones and their uptake halo. Quantitative assessment was done by determining the segmented VOI volume (Vol) in ml and generating standardized uptake values (SUVs) for NaF uptake (in g/ml) in each VOI. SUVmean was the average SUV of all voxels within the VOI, SUVmax the highest SUV of all voxels in the VOI, and SUVtotal the SUVmean multiplied by the Vol of the VOI.
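The four measures defined above can be illustrated with a minimal sketch. It assumes the SUVs of all voxels inside one VOI are already available as a flat list and that the voxel volume in ml is known; the function name `suv_metrics` and the example numbers are hypothetical and are not taken from the RECOMIA platform.

```python
from statistics import mean


def suv_metrics(suv_values, voxel_volume_ml):
    """Summarise one VOI from the SUVs of its voxels (hypothetical helper).

    suv_values      -- SUV of every voxel inside the VOI
    voxel_volume_ml -- volume of a single voxel in ml (assumed known)
    """
    if not suv_values:
        raise ValueError("VOI contains no voxels")
    vol_ml = len(suv_values) * voxel_volume_ml  # Vol: voxel count times voxel volume
    suv_mean = mean(suv_values)                 # SUVmean: average SUV over all voxels
    suv_max = max(suv_values)                   # SUVmax: highest single-voxel SUV
    suv_total = suv_mean * vol_ml               # SUVtotal: SUVmean multiplied by Vol
    return {"Vol": vol_ml, "SUVmean": suv_mean,
            "SUVmax": suv_max, "SUVtotal": suv_total}


# Illustrative numbers only: four voxels of 0.5 ml each.
example = suv_metrics([1.0, 2.0, 3.0, 2.0], voxel_volume_ml=0.5)
```

Because SUVtotal is the product of SUVmean and Vol, any difference in segmented volume between two methods carries over directly into SUVtotal, even when SUVmean agrees closely.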

| CNN segmentations
For the automatic segmentation, a fully convolutional CNN with the same structure as the 3D U-Net (Çiçek et al., 2016) was trained. The 3D U-Net is designed to have a large receptive field while still being able to use high-resolution information. This is achieved by using max-pooling downsampling, upconvolution upsampling and skip connections to process the input image on four different resolutions.
The CNN takes as input a volume of 100 × 100 × 100 voxels, where each voxel has a size of 3.0 × 1.37 × 1.37 mm. For this input size, the CNN outputs an estimated class probability for each voxel of a 12 × 12 × 12 volume at the centre of the input patch; in this case the classes are carotid left, carotid right and background.
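The relation between the 100-voxel input patch and the 12-voxel central output can be made concrete with a small sketch. The margin of 44 voxels per side follows from the stated sizes; the tiling with a stride equal to the output size, and the padding implied by negative indices, are assumptions made for illustration only, since the text does not describe how the patches are laid out over a whole scan.

```python
import math

IN_SIZE = 100   # input patch edge length in voxels (from the text)
OUT_SIZE = 12   # output patch edge length in voxels (from the text)
MARGIN = (IN_SIZE - OUT_SIZE) // 2  # context voxels on each side of the output


def tile_starts(axis_len, out_size=OUT_SIZE):
    """Start indices of output tiles along one axis.

    Assumption: non-overlapping tiles with stride == out_size.
    """
    return [i * out_size for i in range(math.ceil(axis_len / out_size))]


def input_window(out_start, margin=MARGIN, in_size=IN_SIZE):
    """Half-open input index range [lo, hi) needed to predict the output
    tile starting at out_start. Indices below 0 or beyond the scan would
    have to be padded (assumption)."""
    lo = out_start - margin
    return lo, lo + in_size


# Physical extent of one input patch with 3.0 x 1.37 x 1.37 mm voxels.
patch_extent_mm = (IN_SIZE * 3.0, IN_SIZE * 1.37, IN_SIZE * 1.37)
```

Under these assumptions, each prediction for a 12-voxel cube draws on 44 voxels of surrounding context in every direction, which is the practical meaning of the large receptive field.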
For training, a set of 50 CT scans with annotations of the carotids from an external database was used. This data set was created during a previous project (Trägårdh et al., 2020) and the scans were separate from the 49 NaF-PET/CT scans used for the main study. These 50 scans were then divided into a training set of 40 scans and a validation set of 10 scans. The loss function used was categorical cross-entropy and the optimization was done using the Adam method (Kingma & Ba, 2014) with Nesterov momentum.
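As an illustration of the loss named above, the following is a minimal sketch of categorical cross-entropy averaged over voxels for the three classes. It uses plain Python rather than a deep learning framework; the class ordering, the function name and the clipping constant `eps` are assumptions and do not reflect the actual training code.

```python
import math

# Assumed class ordering, for illustration only.
CLASSES = ("background", "carotid_left", "carotid_right")


def categorical_cross_entropy(probs, labels, eps=1e-12):
    """Mean categorical cross-entropy over voxels.

    probs  -- one probability vector per voxel (length == len(CLASSES))
    labels -- index of the true class for each voxel
    eps    -- lower clip to avoid log(0); an implementation detail
              assumed here, not taken from the source
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("at least one voxel is required")
    total = 0.0
    for p, y in zip(probs, labels):
        # Only the probability assigned to the true class contributes.
        total -= math.log(max(p[y], eps))
    return total / len(labels)
```

A perfect prediction gives a loss of 0, while a uniform prediction over the three classes gives ln 3, roughly 1.10, which is a useful reference value when reading training curves.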
To produce the automatic carotid segmentations, the trained CNN was applied to the whole CT scan, resulting in an initial segmentation. For postprocessing, the largest almost connected component was extracted. 'Almost' refers to the fact that a distance of 20 mm between small segmentation components is allowed for them to still count as connected. To avoid areas of high activity originating from surrounding bones, which may strongly influence the SUV statistics, SUV leakage removal was performed. In detail, areas with SUV above a threshold (two standard deviations (SD) above the mean SUV in the carotids) and in which the closest activation maximum was located in the bones were removed from the segmentation. The segmentation of the bones was done using an additional segmentation tool available on the RECOMIA platform (Trägårdh et al., 2020).
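The 'almost connected' rule can be sketched as follows. This is a hedged illustration and not the RECOMIA implementation: it assumes the ordinary connected components have already been extracted as lists of (z, y, x) voxel indices, interprets the 20 mm tolerance as the smallest voxel-to-voxel distance between two components, and reuses the CNN voxel spacing as the default. All names are hypothetical, and the brute-force distance search is only practical for small examples.

```python
import math
from itertools import combinations


def min_gap_mm(comp_a, comp_b, spacing):
    """Smallest Euclidean distance in mm between any voxel of comp_a and
    any voxel of comp_b (brute force)."""
    return min(
        math.sqrt(sum(((a - b) * s) ** 2 for a, b, s in zip(va, vb, spacing)))
        for va in comp_a
        for vb in comp_b
    )


def largest_almost_connected(components, spacing=(3.0, 1.37, 1.37),
                             max_gap_mm=20.0):
    """Merge components whose gap is at most max_gap_mm and return the
    voxels of the largest merged group."""
    parent = list(range(len(components)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(components)), 2):
        if min_gap_mm(components[i], components[j], spacing) <= max_gap_mm:
            parent[find(i)] = find(j)

    groups = {}
    for idx, comp in enumerate(components):
        groups.setdefault(find(idx), []).extend(comp)
    return max(groups.values(), key=len) if groups else []
```

Because merging is transitive, a chain of small fragments each within 20 mm of the next ends up in one group, which matches the intent of keeping a vessel that the initial segmentation has broken into pieces.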

| Statistical analysis
Frequency (percentage) and mean ± SD were used to express descriptive statistics. Bland-Altman plots were used to assess the agreement between variables in pairwise segmentations (Carkeet, 2015;Gerke, 2020). The mean differences (bias) and the upper and lower Limits of Agreement (LoA) were calculated for the two methods. The Sørensen−Dice coefficient (SDC) was calculated to gauge the similarity of CNN and manual segmentations in the common carotids segmentation (Zijdenbos et al., 1994).
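The two agreement measures can be sketched in a few lines. The sketch assumes the conventional Bland-Altman limits of bias ± 1.96 SD of the paired differences and a difference direction of first method minus second method; neither detail is stated explicitly in the text. The Sørensen−Dice coefficient is computed on sets of voxel indices, and both function names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Assumptions: differences are method_a - method_b and the limits of
    agreement are bias +/- 1.96 * sample SD of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def dice(voxels_a, voxels_b):
    """Sorensen-Dice coefficient 2|A and B| / (|A| + |B|) of two
    segmentations given as collections of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))
```

Bias and LoA describe agreement in the derived measurements (Vol and SUVs), whereas the SDC describes spatial overlap of the segmentations themselves, so the two can diverge: SUVmean may agree well even when overlap is moderate.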

| RESULTS
The mean age (±SD) of the subjects was 52 ± 12 years, ranging from 21 to 75. Twenty-six (53%) were male. The mean height and weight of the subjects were 173.1 ± 9.1 cm and 82 ± 20.5 kg. An example of CNN versus manual segmentation is shown in Figure 1. Differences between segmentations of the left and right common carotids are given in Table 2 and Table 3, respectively. The mean Vol deviation at repeat manual segmentation was 14% and 27%, respectively, in the left and right common carotids. The mean SDC values for the left and right common carotids are shown in Table 4; the CNN versus manual SDC and the interobserver SDC were not statistically different in the left (p = 0.66) and right (p = 0.59) common carotids.
FIGURE 1 A three-dimensional reconstruction of manual (a) and CNN-based (b) common carotid segmentation in the same patient (right common carotid in light green and left common carotid in blue).

| Strengths and weaknesses of the study
An important practical limitation of quantitative PET studies is the manual or semiautomatic segmentation of the scans. These are time-consuming processes requiring an experienced image analyser to define the VOIs in the scans, which alone may sometimes last half an hour or more when it comes to carotid artery segmentation. This is a major issue in many clinical and research studies and one of the reasons why we decided to turn to AI and deep learning, computer-based tools that have improved several aspects of diagnosis in the medical field. For example, nuclear cardiology has used AI to facilitate attenuation correction (Irkle et al., 2015) or automate myocardial perfusion reports (Kashiwazaki et al., 2018). The most important strength of this CNN-based model was its ability to segment the common carotids comparably to manual segmentation, which is difficult even for trained image analysers.
Noncontrast CT used in hybrid imaging modalities, such as NaF-PET/CT, is not optimal for studying cardiovascular structures since distinguishing different anatomic structures is difficult in the absence of intravenous contrast. Moreover, common carotids are anatomic structures prone to relatively large interindividual variation. Against this background, the ability of this CNN-based model to distinguish the common carotids from neighbouring similar structures such as the jugular vein, lymph nodes and muscles was quite satisfactory.
The main limitation of the CNN-based segmentation was some inaccuracy of the segmented VOIs due to variation in the vascular system, especially the right common carotid, the origin of which from the brachiocephalic artery is rather difficult to identify. Manual segmentation of such variations could be challenging as well (Ntaios et al., 2021). The CNN-based model is much faster, but we cannot document that it is also more accurate than the manual one, since there is no infallible reference to compare with. We can only point to its superior reproducibility, observer-independence and apparently also relative independence of PET/CT scanner type and make (Boellaard et al., 2015;Boellaard et al., 2019;Hagiwara et al., 2020).
We expect that it is a matter of time before CNN-based segmentation will outperform the manual segmentation, for the simple reason that it will continuously learn and improve as more and more scans of patients with diverse disorders and variable anatomical structures have been examined for training purposes.
Finally, the proximity of high uptake structures, such as the sternum or vertebral bones, is another challenging factor, which we tried to correct for in different ways with the two methods. Also at this point, it is expected that the CNN-based methodology will take the lead based on multiple upcoming training examples and a neverending apprenticeship.

| Possible mechanisms and implications
The most probable explanation for the difference between CNN-based and manual segmentation is the similarity of density between

TABLE 3 Differences between two right common carotid segmentations: Intraoperator, Interoperator and manual-CNN variability

| Unanswered questions and future research
The present work was mainly a feasibility study examining whether the CNN approach can segment the carotids from the non-contrast CT part of an ordinary PET/CT scan and yield SUVmean measures of NaF uptake comparable to those obtained manually. That this can be done in 1 min is major progress that opens the way for routine application.
However, to what degree it will impact clinically, only time and prospective interventional and longitudinal studies can show. It depends largely on whether arterial NaF uptake is a precursor of macrocalcification that is detectable by ultrasound and CT, as certain animal and human studies indicate (Høilund-Carlsen et al., 2020; McKenney-Drake et al., 2018). If so, it is foreseeable that the method will be applied in patients with suspected stroke/TIA, probably looking for NaF uptake not only in the carotids but in the entire preceding part of the arterial system.
It is unknown how well the AI-based approach will work in patients with major anatomical variations and to what degree it can result in reliable estimates of changes over time or due to intervention. The CNN-based model presented here is preliminary and probably the first to demonstrate feasibility of AI-based common carotid segmentation. The presented results were acquired after training on a very limited amount of learning material, a circumstance which gives reason to believe that the AI-based approach will after further training in more extreme cardiovascular cases gradually become the mainstay of image analysis in patients with suspected or known atherosclerotic disease.

| CONCLUSION
The new CNN-based model for automated segmentation of common carotids was fast and reproduced values for common carotid NaF uptake that were comparable to those acquired by manual segmentation. With increased ongoing learning we expect that the CNN-based processing of NaF-PET/CT scans will be a valuable time-saving addition to routine assessment of the atherosclerotic burden in the common carotids and other major arteries.

ACKNOWLEDGEMENT
This study was partly funded by the Odense University Hospital and the University of Southern Denmark, Odense, Denmark.