Objective image quality assessment in X-ray breast imaging
Breast cancer is the most common type of cancer in women. In the Netherlands, breast cancer screening has been implemented for women between 50 and 75 years old. Participating women get a mammogram consisting of four digital mammography (DM) images every two years. These mammograms are reviewed by two radiologists independently. On the images, breast cancer - among other manifestations - can appear as low-contrast soft-tissue lesions or very small calcifications. Depiction of these structures is technically challenging for the DM systems. Therefore, strict quality control (QC) procedures have been established to ensure a high physico-technical quality of the equipment used in the Dutch breast cancer screening program.
DM images undergo extensive vendor-specific post-processing to make them suitable for display on a diagnostic monitor. Physicists, however, measure image quality on unprocessed images, so the post-processing is not taken into account in image quality assessments. This has become problematic in recent years due to the development of more advanced image processing techniques.
The post-processed images are the ones reviewed by the radiologists, who must perform the clinical task of breast cancer detection based on this visual information. For a given case, the radiological task performance (detection or characterization of lesions) then becomes a measure of image quality. In addition, it is essential to assess clinical image quality in order to determine whether it is sufficient for the intended clinical task.
Currently, there is no method to assess clinical image quality with respect to the performance of the radiologist. The standard method to evaluate the quality of processed images is with human observers. However, this approach is very time-consuming and expensive. As a possible solution, we are investigating the use of mathematical model observers (MOs) in place of human readers, in combination with images of anthropomorphic breast phantoms that serve as a stand-in for the patient. MOs are mathematical algorithms that are able to interpret the statistical information of the images and can provide estimates of image quality in terms of the detectability of certain signals. However, the introduction of MOs for image quality assessment and quality control procedures is not trivial. The methodology has to be efficient and robust, and the MOs need to predict human detection performance.
To achieve this, in Chapter 1 we propose optimized QC procedures using MOs and anthropomorphic breast phantoms. This is the first step towards introducing MOs for the assessment of the quality of DM images containing calcification-like signals. MOs, and in particular the non-prewhitening with eye filter (NPWE) MO, require a template of the signal that needs to be detected. Since we are dealing with acquired DM images, as opposed to simulated ones, we used acquired images of calcification-like signals to construct this template. This required a number of X-ray exposures of the calcification-like signals, so we investigated how many exposures are needed to construct the template and how that number affects the MO detection scores. In a QC test, for practical reasons, the number of exposures needs to be as low as possible. This proved achievable: a single exposure of the calcification-like signals was found to be sufficient to construct the template used by the MO.
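As an illustration of how an NPWE template can be built from acquired images and then used to score detectability, a minimal sketch follows. It is not the implementation used in this thesis: the eye filter form E(f) = f·exp(-f/f_peak), the frequency units (cycles per pixel), the value of `peak_freq`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def eye_filter(shape, peak_freq=0.1):
    """Radial band-pass filter E(f) = f * exp(-f / peak_freq).

    Assumed form; frequencies are in cycles/pixel and the filter
    peaks at f = peak_freq. E(0) = 0, so the DC term is removed.
    """
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    f = np.hypot(fy, fx)
    return f * np.exp(-f / peak_freq)

def npwe_template(signal_rois, background_rois, peak_freq=0.1):
    """Build an NPWE template from acquired ROIs.

    signal_rois, background_rois: arrays of shape (n_exposures, H, W).
    The expected signal is estimated as the difference of the mean ROIs,
    then filtered with the squared eye filter in the frequency domain.
    """
    expected_signal = signal_rois.mean(axis=0) - background_rois.mean(axis=0)
    e = eye_filter(expected_signal.shape, peak_freq)
    return np.real(np.fft.ifft2(np.fft.fft2(expected_signal) * e**2))

def npwe_dprime(template, present, absent):
    """Detectability index from template responses on two image sets."""
    t_present = (present * template).sum(axis=(1, 2))
    t_absent = (absent * template).sum(axis=(1, 2))
    pooled_var = 0.5 * (t_present.var(ddof=1) + t_absent.var(ddof=1))
    return float((t_present.mean() - t_absent.mean()) / np.sqrt(pooled_var))
```

Passing a different number of exposures in `signal_rois` and comparing the resulting detectability values is one way to probe how the template estimate depends on the number of exposures.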
In Chapter 2 and Chapter 3 of this thesis, the visibility of a calcification-like signal was assessed by human observers and by two MOs, the NPWE and the channelized Hotelling observer (CHO), respectively. The NPWE MO used the signal template obtained with the methodology described in Chapter 1. We used DM images of a prototype anthropomorphic breast phantom that was 3D printed based on a patient’s breast CT image. Calcification-like disks were embedded within the phantom. DM images of the phantom with these disks were acquired on two clinical DM systems from different vendors. The detection of the calcification-like signals in DM images with and without vendor-specific post-processing was assessed by human observers and by both MOs. Human detection performance was found to be strongly correlated with that of the MOs in both unprocessed and processed images. MOs are therefore able to predict human detection performance for calcification-like signals in both image types, which suggests that they could be used to score images for image quality assessment.
However, MOs, and in particular the CHO, involve a large number of input parameters in the mechanism that extracts the meaningful image features. Chapter 4 evaluated a method to optimize the parameters of the difference-of-Gaussian (DOG) channel set. For a range of parameter values, the effect on the correlation between CHO and human observer performance was determined. The parameters that maximize this correlation were found to depend on the type of signal and background in the images being evaluated. The parameters are therefore not fixed values but need to be adjusted to the images under test.
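To make the role of the DOG channel parameters concrete, the following sketch builds a small DOG channel set and computes a CHO detectability index. It is an illustrative sketch under stated assumptions, not the method evaluated in Chapter 4: the parameter names (`sigma0`, `alpha`, `q`, `n_channels`), their default values, and the function names are hypothetical, frequencies are in cycles per pixel, and the signal is assumed to be centered in the region of interest.

```python
import numpy as np

def dog_channels(size, n_channels=3, sigma0=0.05, alpha=2.0, q=1.67):
    """Return a (size*size, n_channels) matrix of spatial DOG channels.

    Channel j has the radial frequency profile
        exp(-0.5 * (rho / (q * s_j))**2) - exp(-0.5 * (rho / s_j)**2),
    with s_j = sigma0 * alpha**j. Defaults are illustrative only.
    """
    f = np.fft.fftfreq(size)
    rho = np.hypot(f[:, None], f[None, :])
    columns = []
    for j in range(n_channels):
        s_j = sigma0 * alpha**j
        profile = np.exp(-0.5 * (rho / (q * s_j))**2) - np.exp(-0.5 * (rho / s_j)**2)
        spatial = np.real(np.fft.ifft2(profile))
        # Shift so the channel is centered on the ROI, then flatten.
        columns.append(np.fft.fftshift(spatial).ravel())
    return np.stack(columns, axis=1)

def cho_dprime(present, absent, channels):
    """CHO detectability from signal-present and signal-absent ROIs.

    Uses the same images to train and test (resubstitution), which is
    positively biased for small samples; a real study would separate them.
    """
    v_present = present.reshape(len(present), -1) @ channels
    v_absent = absent.reshape(len(absent), -1) @ channels
    delta = v_present.mean(axis=0) - v_absent.mean(axis=0)
    cov = 0.5 * (np.atleast_2d(np.cov(v_present, rowvar=False))
                 + np.atleast_2d(np.cov(v_absent, rowvar=False)))
    template = np.linalg.solve(cov, delta)  # Hotelling template in channel space
    return float(np.sqrt(delta @ template))
```

Repeating `cho_dprime` over a grid of channel parameters, and comparing each result with human observer scores for the same images, is the kind of sweep that a parameter optimization of this type would involve.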
DM has a major limitation: tissue superposition, which can cause the projection of normal breast tissue to mimic or mask a cancer. This limitation led to the development of digital breast tomosynthesis (DBT). In DBT, instead of a single 2D image, a stack of images rendering the whole breast volume is acquired, ameliorating the issue of tissue superposition. Before MOs are introduced for DBT image quality evaluation, their ability to predict the performance of human readers needs to be evaluated. However, it is not known how human readers perceive DBT images when detecting a lesion: by integrating the 3D image stack as one object, or by separately processing the collection of 2D slices. Chapter 5 investigated how DBT images are perceived by the human visual system. Simulated 3D DBT images from a small-scale breast tissue model were generated. The effect of viewing the 3D image stack in ciné mode versus viewing only the 2D central slice was investigated for the detection of a spherical and a capsule-like lesion. The scores of 11 human readers showed that there was no benefit in showing the whole DBT stack of slices; in fact, readers performed worse and their reading times were longer than in the 2D central-slice viewing mode. These findings will aid the further development of MOs for DBT image quality assessment. From the clinical perspective, there appears to be no inherent visual perception benefit when viewing the entire DBT stack. If, in the future, an ‘ideal’ synthetic 2D image is developed (one generated such that it contains all diagnostic information found in the DBT stack, in a detectable manner), then this image could potentially be sufficient for detecting low-contrast lesions in DBT images.