Radiology
Published online before print October 24, 2002, 10.1148/radiol.2253011582
(Radiology 2002;225:907-916.)
© RSNA, 2002


Technical Developments

Breast MR Imaging in Women at Increased Lifetime Risk of Breast Cancer: Clinical System for Computerized Assessment of Breast Lesions—Initial Results1

Kenneth G. A. Gilhuijs, PhD, Eline E. Deurloo, MD, Sara H. Muller, PhD, Johannes L. Peterse, MD and Leo J. Schultze Kool, MD, PhD

1 From the Departments of Radiology (K.G.A.G., E.E.D., S.H.M., L.J.S.K.) and Pathology (J.L.P.), The Netherlands Cancer Institute, Antoni van Leeuwenhoek Hospital, Plesmanlaan 121, 1066 CX Amsterdam, the Netherlands. From the 2000 RSNA scientific assembly. Received September 25, 2001; revision requested December 10; revision received March 1, 2002; accepted April 2. Address correspondence to K.G.A.G.


    ABSTRACT
The authors developed a clinical system for computerized delineation, rating, and classification of breast lesions depicted in contrast material–enhanced magnetic resonance images obtained in women with increased lifetime risk of breast cancer. Initial results showed negative predictive values above 98% at 50% positive predictive value with negligible interoperator differences. The system demonstrated potential to help exclude malignancy with high confidence and reproducibility with a positive predictive value that is acceptable in screening.


Index terms: Breast, MR, 00.121412, 00.12143 • Breast neoplasms, diagnosis, 00.30, 00.311, 00.32 • Breast neoplasms, MR, 00.121412, 00.12143 • Cancer screening • Genes and genetics


    INTRODUCTION
Extensive breast cancer screening programs are currently in effect worldwide. Women with a proven genetic predisposition or a strong family history of breast cancer are at increased risk at a much younger age than is the general population (1,2), and their lifetime risk can reach 90%. However, approximately 40%–50% of the screening mammograms obtained in premenopausal women may not provide sufficient diagnostic information to exclude the presence of malignancies owing to the presence of dense fibroglandular tissue (3–5). Contrast material–enhanced magnetic resonance (MR) imaging of the breast is reported to be the most sensitive modality for invasive breast cancer, with sensitivities well above 90% (6), but it has demonstrated lower and variable specificities (20%–90%) (6–9). Possible lack of specificity is of particular concern in the screening of premenopausal women with increased lifetime risk of breast cancer, in whom the prevalence of malignancy is low (1%–2%) (10) but that of benign enhancement is considerably higher (>20%) (11,12). Because of the high risk, high certainty of exclusion of malignant disease (ie, high negative predictive value [NPV]) is desirable, but a high biopsy rate of benign lesions (ie, low positive predictive value [PPV]) is not desirable because of the high psychologic stress and physical scarring for the patient and the difficulty of performing a biopsy of small lesions that are visible only at MR imaging. The effectiveness of contrast-enhanced MR imaging for the screening of asymptomatic disease in women at increased lifetime risk of breast cancer is being investigated currently in various multiinstitutional trials.

Differences in image interpretation guidelines contribute to the different specificities reported (9,13). Various investigators have focused on objective and quantitative rules to standardize the interpretation of contrast-enhanced MR images on the basis of temporal or morphologic characteristics of contrast material uptake (14–18). To date, most methods are based on the rating of features such as spiculation, washout, and peripheral enhancement by radiologists, followed by the merging of these ratings with an automated classifier (17–20). With this approach, the classification is objective, but careful construction of rating guidelines (21) may be necessary to avoid substantial inter- and intraobserver variations in ratings. Unlike in computerized analysis of mammograms (22–27), only a few investigators (28–30) have pursued both automated rating and automated classification of features to optimize the objectivity and the consistency of interpretation of contrast-enhanced MR images.

The goals of this study were directed toward optimization of the efficacy of MR screening programs for asymptomatic disease in women with increased lifetime risk of breast cancer. The first goal was to train and validate a lesion analysis system to distinguish accurately and consistently between malignant and benign lesions. The second goal of this study was to select a screening-specific operating point at which to attain a previously selected combination of accuracy in exclusion of malignancy of a lesion and in avoidance of biopsy of benign lesions.


    Materials and Methods
Image Database
The current study was performed retrospectively with data that were obtained after the institutional review board gave its approval and the patients provided written informed consent and with data that were acquired for accepted clinical indications. A training database was constructed from 80 breast lesions that were smaller than 4 cm3 (in 66 women [mean age, 47 years; age range, 20–84 years]), were visible at contrast-enhanced MR imaging, and were reported in our clinic. The database consisted of 40 benign lesions (in 31 women) and 40 malignant lesions (in 35 women) that were added consecutively in our clinic between January 1999 and October 2000. Of the 31 women with 40 benign lesions, 23 had one lesion, seven had two lesions, and one had three lesions. Of the 35 women with 40 malignant lesions, 30 had one lesion and five had two lesions.

Indications for examination included findings at contrast-enhanced MR breast screening (in 30 women with 39 benign lesions and one woman with one malignant lesion), a question of extent and multifocality of the lesion before breast-conserving surgery (in 32 women with 37 malignant lesions), or an unclear diagnosis based on findings in conventional mammograms or ultrasonographic (US) images (in one woman with one benign lesion and two women with two malignant lesions). In this study, a minimum risk assessment of 15% was used for MR screening of women at increased lifetime risk for breast cancer. Breast lesions from populations other than a screening population were included in the current study to allow initial training of our system with sufficient characteristics of malignant lesions despite the low prevalence of malignancy in our screening population.

All premenopausal women underwent imaging between the 5th and 15th day of their menstrual cycle, and they underwent repeat imaging if large motion artifacts were apparent. The distribution of lesion size, as estimated from the MR images, is shown in Figure 1.



Figure 1. Bar graph indicates the size of lesions included in this study. cc = cm3.

 
Histopathologic proof was available for all the malignant lesions and for 16 of the 40 benign lesions (Table 1). Of the remaining 24 lesions, 19 showed transient enhancement (12,31,32) for which further work-up was recommended but that was no longer visible at follow-up. The remaining five lesions were benign on the basis of unchanged appearance (two with 2 years of follow-up; two with 3 years; and one with 4 years). Eleven of the 35 patients with 40 malignant lesions had proven BRCA1/2 germ line mutations (n = 1) or at least first-degree familial involvement (n = 10). All 31 patients with 40 benign lesions had proven genetic predisposition (n = 6) or familial involvement (n = 25).



 
TABLE 1. Findings in Benign and Malignant Lesions

 
With use of clinical reporting guidelines from an ongoing multiinstitutional trial, 23 biopsy procedures were performed in 14 of the 39 screening-detected benign lesions in our training database. The biopsy procedures included US-guided fine-needle aspiration, core, and excisional biopsies. By including findings at fine-needle aspiration, core, or excisional biopsy of a benign lesion as false-positive and by taking the estimated prevalence of malignancy into account, the clinical PPV in our screening population was estimated to be 20% (assuming 100% sensitivity).

MR Imaging Technique
MR imaging was performed with a 1.5-T system (Magnetom; Siemens Medical Systems, Erlangen, Germany) with fast low-angle shot, or FLASH, three-dimensional MR imaging with the patient prone and both breasts in a double-breast array coil. One series of precontrast MR images was obtained before power injection (4 mL/sec) of 0.1 mmol of gadoteridol (Prohance; Bracco-Byk Gulden, Konstanz, Germany) per kilogram of body weight and was followed by acquisition of five series of postcontrast images at intervals of approximately 90 seconds. Acquisition of the first series of postcontrast MR images was initiated at 45 seconds after the start of contrast material injection. The following MR imaging parameters were used: T1-weighted sequence, repetition time msec/echo time msec of 8.1/4.0, reconstructed in-plane matrix of 256 x 256 pixels, isotropic in-plane resolution of 1.2 x 1.2 mm2 or 1.4 x 1.4 mm2, section thickness of 1.6 or 1.4 mm, and no fat suppression. Data for all images were transferred to the lesion analysis system by using the digital imaging and communications in medicine, or DICOM, protocol.

Lesion Analysis System
The lesions were analyzed retrospectively by a medical researcher (E.E.D.) and an imaging physicist (K.G.A.G.), who independently used the lesion analysis workstation developed at our institution. This system is an extension of collaborative work between three different institutions in Europe and the United States (29,33). The operators of the workstation were blinded to each other’s results and histologic outcome.

The reviewers each performed the following steps: Each lesion was localized by using the interactive detection function of the system and was verified on the basis of radiologists’ reports. The detection function offers fast exploration of the MR data by means of linked cursors in three reconstructed views (sagittal, transverse, and coronal) and with two types of processing (subtraction and washout) simultaneously (Fig 2).



 
Figure 2. Processed contrast-enhanced MR images: interactive lesion detection. Top row: Uptake images. Bottom row: Washout images. From left to right: Sagittal, transverse, and coronal reconstructions of the processed MR images. A lesion (infiltrating ductal carcinoma) with uptake and corresponding washout is selected at the center of the crosshairs for further analysis. Movement of any of the six crosshairs results in instant updating of all reconstructed views to maintain correspondence in three directions and in two types of processing simultaneously.

 
Each lesion was delineated (segmented) automatically in and across sections (in three dimensions) after manual indication of a seed point in the suspicious lesion (33). The result of the segmentation was depicted by the system to verify proper coverage of the enhanced volume. When necessary, the location of the seed point was adjusted to achieve better coverage. To quantify variations in results that were due to discrepancies in the selected seed point, segmentation of each lesion was included in the task list for both reviewers.

Morphologic and temporal features (Table 2) were automatically rated with the system by using a region of interest with dimensions that were identical to those of the segmented lesion or were derived from it. Each feature was determined by means of analysis in three dimensions rather than in two dimensions; consequently, no "representative" section had to be chosen. In general, lesions with inhomogeneous uptake and spiculated boundaries have low values for smoothness of uptake. Lesions with well-defined margins have large mean values for margin sharpness, and lesions with partially well-defined margins have high variation in margin sharpness values. Each morphologic feature was derived from the first subtraction images (postcontrast MR images 1 minus the precontrast MR images) and from the subtraction images at the frame where the feature had the largest value (maximum across subtractions). Computation of the morphologic features has been described in detail (29). The temporal features of washout and signal-enhancing ratio have also been described (16,20).



 
TABLE 2. Features Rated Automatically with Computerized Analysis of Contrast-enhanced MR Images by Using Semiautomatically Generated Regions of Interest

 
First goal: training and validation of the lesion analysis system.—The lesion analysis system was trained on the temporal and morphologic features of the lesions included in the training database to provide objective estimates of the likelihood of malignancy for new lesions.

Statistical analysis was performed (SPSS, version 9.0; SPSS, Chicago, Ill). The feature values were first tested for deviations from the normal distribution by using Q-Q plots and Kolmogorov-Smirnov tests. Log transformations were applied when necessary to improve the approximation to a normal distribution and to avoid problems with subsequent statistical analysis. In an exploratory analysis, statistically significant indicators of malignancy were identified by means of the Wilks λ test. A P value of less than .05 was considered to indicate a significant ability to differentiate between benign and malignant lesion types.

Prospective performance of the lesion analysis system was estimated by means of leave-one-out cross validation, which involves training the system on all lesions but one, testing on the one lesion, and repeating the procedure until all lesions have been tested once. In addition to this commonly used technique, a conservative splitting validation was used, which involves random splitting of the database into an independent training set and a test set of equal size.
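The leave-one-out procedure described above can be sketched in a few lines. This is only an illustration: the feature values are hypothetical (not study data), and a toy nearest-class-mean rule stands in for the linear discriminant analysis used in the study. The names `nearest_mean_classifier` and `leave_one_out` are introduced here for the example.

```python
from statistics import mean

def nearest_mean_classifier(train):
    """Fit a toy 1-D classifier that labels a value by the closer class mean."""
    mean_benign = mean(x for x, y in train if y == 0)
    mean_malignant = mean(x for x, y in train if y == 1)
    return lambda x: 1 if abs(x - mean_malignant) < abs(x - mean_benign) else 0

def leave_one_out(samples, fit):
    """Train on all samples but one, test on the held-out one, repeat for each."""
    predictions = []
    for i, (x, _) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]   # all lesions but one
        predictions.append(fit(train)(x))       # test on the held-out lesion
    return predictions

# Hypothetical feature values (assumption): 0 = benign, 1 = malignant.
samples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
predictions = leave_one_out(samples, nearest_mean_classifier)
accuracy = mean(int(p == y) for p, (_, y) in zip(predictions, samples))
```

Because every lesion is tested by a model that never saw it, the held-out predictions can be pooled into a single ROC analysis. If features are selected stepwise, the selection would have to be repeated inside each fold to keep the estimate unbiased.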

Linear discriminant analysis and stepwise selection (entry and removal threshold probabilities of F set to 0.10 and 0.15, respectively) were used to obtain a subset of features that makes a significant contribution to the classification into benign and malignant lesion types. All features listed in Table 2 were included in this selection process. Possible correlation between features was taken into account at this stage, but interaction terms were not entered because we expected the power of our statistics to be compromised in an attempt to avoid selection of terms that are predictive by coincidence. The likelihood of malignancy of each lesion was quantified by means of the Bayesian posterior probability (34), which was implemented in our lesion analysis system.
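As a rough illustration of a Bayesian posterior probability for a two-class problem, the sketch below applies Bayes' rule to a single discriminant score with Gaussian class-conditional densities of equal spread, which is the usual linear discriminant assumption. The class means, the spread, and the equal priors are hypothetical values chosen for the example, and `posterior_malignant` is not a function of the authors' system.

```python
from statistics import NormalDist

def posterior_malignant(score, benign, malignant, prior_malignant=0.5):
    """Bayes' rule for two classes with Gaussian class-conditional densities."""
    weighted_m = malignant.pdf(score) * prior_malignant
    weighted_b = benign.pdf(score) * (1.0 - prior_malignant)
    return weighted_m / (weighted_m + weighted_b)

# Hypothetical score distributions with a shared standard deviation (assumption).
benign = NormalDist(mu=-1.0, sigma=1.0)
malignant = NormalDist(mu=1.0, sigma=1.0)

p_midway = posterior_malignant(0.0, benign, malignant)  # equally far from both means
p_high = posterior_malignant(2.0, benign, malignant)    # well inside the malignant range
```

A score midway between the class means gives a posterior of one half under equal priors, and the posterior rises toward 1 as the score moves into the malignant range.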

The statistical significance of the difference in results obtained by the two operators (ie, the interobserver variation) was assessed by using a paired-samples t test of the likelihood-of-malignancy values for each lesion. In addition, the statistical significance of the difference in the true-positive fraction (TPF) at a selected false-positive fraction (FPF) was evaluated by means of receiver operating characteristic (ROC) analysis combined with univariate z score tests (35,36). The ROC software was provided by the University of Chicago (37,38).

Second goal: selection of screening-specific operating point.—A clinically relevant value of likelihood of malignancy was derived to serve as a threshold below which the system advises follow-up (benign lesion) or otherwise immediate biopsy (malignant lesion). In the following description, this threshold value will be referred to as the "operating point."

Parameterized ROC analysis (35) of the likelihood-of-malignancy values was performed to determine the sensitivity, or TPF, of the system for specificity, or 1 - FPF, values between 0% and 100%. Proper parameterization of each ROC curve was verified by means of visual comparison of the fitted curve with the unfitted data. Each point on the curve corresponds to a unique operating point. However, compared with the balanced training set used in the current study, the prevalence of benign lesions is much higher than that of malignant lesions in the actual screening population. To determine a clinically relevant operating point, each pair of (TPF, FPF) coordinates on the ROC curve was transformed to the corresponding pair of PPV (the fraction of lesions that is correctly assumed to be malignant) and NPV (the fraction of lesions that is correctly assumed to be benign) coordinates. The expected prevalence of malignancy (PM) in the screening population was taken into account in this transformation: PPV = TPF x PM/[(TPF x PM) + (1 - PM) x FPF], and NPV = [(1 - FPF) x (1 - PM)]/[(1 - FPF) x (1 - PM) + (1 - TPF) x PM].
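The transformation from an ROC point to predictive values can be written directly from the two equations above. The sketch below is a minimal illustration rather than the spreadsheet used in the study; the function name `predictive_values` and the example FPF of 7.5% are assumptions, while the TPF of 85.7% and the 8% prevalence among suspicious lesions are values quoted elsewhere in this article.

```python
def predictive_values(tpf, fpf, pm):
    """Convert an ROC point (TPF, FPF) to (PPV, NPV) at prevalence pm.

    All arguments are fractions between 0 and 1. The point TPF = FPF = 0
    (nothing called malignant) and the point TPF = FPF = 1 (everything
    called malignant) leave one of the two values undefined.
    """
    ppv = tpf * pm / (tpf * pm + (1.0 - pm) * fpf)
    npv = (1.0 - fpf) * (1.0 - pm) / ((1.0 - fpf) * (1.0 - pm) + (1.0 - tpf) * pm)
    return ppv, npv

# TPF and pm are taken from the article; the FPF is a hypothetical input.
ppv, npv = predictive_values(tpf=0.857, fpf=0.075, pm=0.08)
```

With these inputs the PPV comes out close to 50% and the NPV close to 98.7%. Applying the same function to every point of the fitted ROC curve traces out the predictive curve.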

The resulting (PPV, NPV) curve will be referred to as the "predictive curve." The (TPF, FPF) coordinates of the ROC 95% confidence band points—which are given by the software (37)—were also converted by using the equations for PPV and NPV to produce the confidence band for the predictive curve. Calculations were performed by implementing the equations in a spreadsheet program (Excel 2000; Microsoft, Redmond, Wash).

Because the analysis system is applied to only MR images in which enhancement of lesions is visible (images without enhancement do not raise suspicion), the estimation of prevalence of malignancy was also limited to the subset of enhancing lesions. The resulting PPV is similar to that for the entire population because the majority of images that show no enhancement indicate accurately the absence of disease (ie, a true benign finding), and true benign findings do not affect the TPF and the PPV. For similar reasons, however, the NPV for the entire population is higher than that for the subset of suspicious lesions, and the corresponding FPF is lower. The distinction between these two estimates of NPV will be indicated clearly in the remainder of this article by using the labels "subset of suspicious lesions" or "total population." The reason we refer to both estimates is that the use of the lesion analysis system is limited to the subset of suspicious lesions, whereas clinical effect is typically assessed in the total population.

The prevalence of malignant lesions in the total screening population (2%) was estimated preliminarily from our screening data and from data in the literature (10). The prevalence of benign lesions in the total population (23%) was estimated retrospectively from clinical screening data at our hospital. Consequently, the prevalence of malignancy among suspicious lesions was set to 8% (2% malignant among the 25% of the population with an enhancing lesion). The predictive curve was used to select an operating point that corresponds to a clinically acceptable PPV where the NPV is at least 98% in the subset of suspicious lesions (ie, less than 2% chance of misinterpreting an enhancing malignant lesion as a benign lesion in the current screening round). The minimally accepted PPV in screening may vary in different regions. In this study, a value of 50% was used as a guideline (ie, one of two biopsies is of a benign lesion). This guideline is in agreement with the assessment by other authors (39) and is comparable to that at mammography screening for women aged at least 50 years in the Netherlands (40–42).
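The arithmetic behind the 8% figure, and the difference between the NPV in the subset of suspicious lesions and in the total population, can be checked with a short sketch. It assumes that examinations without enhancement are all true-negative findings and that every malignancy enhances; the function names and the example operating point (TPF 85.7% from the Results, FPF 7.5% hypothetical) are chosen for illustration only.

```python
def suspicious_prevalence(p_malignant, p_benign_enhancing):
    """Prevalence of malignancy among enhancing (suspicious) lesions."""
    return p_malignant / (p_malignant + p_benign_enhancing)

def npv_subset(tpf, fpf, p_malignant, p_benign_enhancing):
    """NPV when only enhancing lesions are counted."""
    true_neg = (1.0 - fpf) * p_benign_enhancing
    false_neg = (1.0 - tpf) * p_malignant
    return true_neg / (true_neg + false_neg)

def npv_total(tpf, fpf, p_malignant, p_benign_enhancing):
    """NPV when non-enhancing examinations also count as true negatives."""
    p_normal = 1.0 - p_malignant - p_benign_enhancing   # assumption: no disease
    true_neg = p_normal + (1.0 - fpf) * p_benign_enhancing
    false_neg = (1.0 - tpf) * p_malignant
    return true_neg / (true_neg + false_neg)

pm_subset = suspicious_prevalence(0.02, 0.23)           # 2% of 25%
subset = npv_subset(0.857, 0.075, 0.02, 0.23)
total = npv_total(0.857, 0.075, 0.02, 0.23)
```

The number of false-negative findings is the same in both views; only the pool of true-negative findings grows when normal examinations are included, which is why the total-population NPV is the higher of the two.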


    Results
First Goal
Training and validation of the lesion analysis system.—With the lesion analysis system, results obtained by the medical researcher (E.E.D.) were used as the reference standard in the following validation. Lesions with better average definition of the margin were found to show higher speed of uptake and larger variation in definition along the margins (P < .001). This trend was significant in both the benign and malignant subgroups. No significant correlation was found between the feature values and the size of the lesion.

The efficacy of each computer-rated feature, as shown in Table 3, was ranked from high to low ability to differentiate between benign and malignant lesions. Positive correlation indicates that higher values correspond with a higher likelihood of malignancy, whereas negative correlation indicates that higher values correspond with a lower likelihood of malignancy.



 
TABLE 3. Area under ROC Curve and Significance of Discrimination between Benign and Malignant Lesions on the Basis of Computer-rated Features

 
In accordance with findings in other studies (8,15–17,19,20), washout and signal-enhancing ratio were strong indicators of malignancy. They are followed by the morphologic features of smoothness of uptake and mean margin sharpness.

Stepwise selection resulted in the combination of one temporal and three morphologic features: washout, smoothness of uptake (maximum across subtractions), mean margin sharpness (first subtraction), and variation in margin sharpness (maximum across subtractions) (P < .001 for the combined features). Signal-enhancing ratio was excluded because of its strong correlation with washout. With cross validation and stepwise feature selection, the estimated prospective performance of the system was 0.95 ± 0.02 (area under the ROC curve [Az] ± 1 SD). This Az measure indicates the relative area under the ROC curve in Figure 3. A perfect performance without false-positive and false-negative findings would yield a relative area under the ROC curve of 1.0. Although variation in margin sharpness (maximum across subtractions) is a poor indicator of malignancy by itself (Table 3), it provides a statistically significant contribution in combination with the other selected features. On average, the margins of the malignant lesions were not only found to be more sharply delineated than those of the benign lesions, but they also showed less variation in sharpness.



 
Figure 3. Graph of the ROC curve shows the prospective estimate of the TPF of the system at various FPF values in the task of discriminating between benign and malignant lesions. The curve was derived from leave-one-out validation of the training set of suspicious lesions. The fitted binormal ROC parameters a and b are 2.02 and 0.66, respectively.
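The binormal parameters quoted in the caption are enough to reproduce the fitted curve. In the standard binormal model, TPF = Φ(a + b·Φ⁻¹(FPF)) and the area under the curve is Az = Φ(a/√(1 + b²)), where Φ is the standard normal cumulative distribution function. The sketch below assumes this conventional parameterization applies to the reported a and b; the function names and the example FPF of 7.5% are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def binormal_tpf(fpf, a, b):
    """TPF on a binormal ROC curve at a given FPF (0 < fpf < 1)."""
    return STD_NORMAL.cdf(a + b * STD_NORMAL.inv_cdf(fpf))

def binormal_az(a, b):
    """Area under a binormal ROC curve."""
    return STD_NORMAL.cdf(a / sqrt(1.0 + b * b))

az = binormal_az(2.02, 0.66)          # a and b from the Figure 3 caption
tpf = binormal_tpf(0.075, 2.02, 0.66) # hypothetical FPF of 7.5%
```

Under these assumptions the area evaluates to roughly 0.95, in line with the reported Az, and the TPF at an FPF of 7.5% is roughly 86%.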

 
Eight of the 40 benign lesions exhibited positive washout. These findings include fibroadenoma (in five of 15 lesions), transient enhancement (in two of 19 lesions), and intramammary lymph node enhancement (in one of one lesion). Conversely, 11 of the 40 malignant lesions exhibited negative washout (ie, continuous wash-in). These findings include infiltrating ductal carcinoma (in eight of 33 lesions), infiltrating lobular carcinoma (in two of two lesions), and ductal carcinoma in situ (in one of four lesions). However, most benign (seven of eight) and malignant (nine of 11) lesions with deviating washout behavior were properly characterized on the basis of their morphologic properties.

Second Goal
Selection of screening-specific operating point. Figure 4 illustrates that the maximum PPV of the system is estimated to remain greater than 50% at an NPV of at least 98% for prevalence values roughly between 0.5% and 10% (in the total screening population). The peak performance of the system is estimated to occur at a prevalence of approximately 1%. The predictive curve at the expected prevalence in our screening population (2%) is depicted in Figure 5. The estimated prospective performance at the operating point of PPV = 50% is summarized in Table 4. We emphasize that these are first estimates of prospective performance derived from the tuning process of the lesion analysis system.



 
Figure 4. Graph shows estimates of the largest feasible PPV with the system at NPV of at least 98% for various values of prevalence of malignancy (PM) (top axis) in the total population (ie, including normal images) and (bottom axis) in the population of suspicious lesions only. The maximum (PPV = 87%) is at a prevalence of 1.2% (total population).

 


 
Figure 5. Graph shows the estimate of the predictive curve of the lesion analysis system at 2% prevalence of malignancy in the total screening population (8% in the subset of suspicious lesions). Details at the operating point (arrow) are listed in Table 4.

 


 
TABLE 4. Estimates of Prospective Performance of Lesion Analysis System to Discriminate between Benign and Malignant Lesions in Screening Population of Women at Increased Lifetime Risk

 
Results in the current study indicate the feasibility of achieving a PPV of 50% at an NPV of 98% in the subset of suspicious lesions and at an NPV of 99% in the total population. These results are roughly in agreement with those obtained from four independent splitting validations. At 98% NPV in the subset of suspicious lesions, the following averages were derived from the four individual curves: PPV = 47.3% ± 6.9 [mean ± 1 SD], TPF = 78.4% ± 0.5, and FPF = 7.9% ± 2.1. The splitting validations are considered to be conservative estimators of prospective performance because only half the total number of benign and malignant cases is used during training, which results in fewer selected features.

At the operating point of PPV = 50% (TPF = 85.7%), the false-negative findings were infiltrating ductal carcinomas (lesion volumes of 0.2, 0.3, 0.5, and 2.9 cm3). None of these misclassified lesions were associated with a genetic predisposition or familial involvement. The false-positive findings were transient enhancement (lesion volumes of 0.5 and 1.4 cm3) and fibroadenoma (lesion volume of 0.5 cm3). No systematic difference was found between the values for likelihood of malignancy obtained by the first and second operators (P > .6). In fact, the differences in computed likelihood were less than 5% for 72 of the 80 lesions. The differences were not caused by discrepancies in interpretation of the likelihood of malignancy by the two operators but by variations in selection of the seed point during segmentation of the lesions. The effect of these variations is small. At the selected operating point, only one of the 80 lesions was characterized differently, which resulted in a difference in TPF between the two operators that was less than 1.5% (not significant [P > .9]).


    Discussion
A computerized analysis system was developed to assist identification of subgroups of lesions with characteristics that can be correlated accurately with a low likelihood of malignancy. The system was made to help optimize the consistency of differential diagnosis of breast lesions at contrast-enhanced MR imaging and to allow selection of operating points at prior estimates of NPV and PPV. Such methods may contribute substantially to the benefit of MR screening programs for asymptomatic women at increased lifetime risk.

It is always possible to achieve a PPV equal to the prevalence of malignancy in the population because if all lesions are called malignant, then both the TPF and the FPF equal 1 and the PPV becomes equal to the prevalence. Consequently, in a symptomatic population associated with high prevalence of malignancy, the follow-up of suspicious findings by means of biopsy is more easily accepted than it is in an asymptomatic population associated with low prevalence. For symptomatic indications, computerized analysis may still help achievement of consensus in triple assessment (ie, physical examination, imaging, and pathologic examination), but more benefit is expected in screening, where the annual prevalence of malignancy is considerably lower than the desirable PPV. A first estimate of PPV obtained in our clinic for asymptomatic screening is 20%. The lesion analysis system has been tuned to achieve the desirable 50% PPV guideline at high NPV in a consistent and reproducible manner. Although the 50% PPV may not be representative for clinics in all regions, the system is easily tunable to other PPVs along an estimated curve of performance.

MR Screening
Initial literature reports on the screening of asymptomatic women at increased lifetime risk indicate detection of breast tumors in a substantially earlier stage of malignancy when the women are under surveillance rather than when they are referred because of symptoms (10). In 109 asymptomatic women at high risk, Tilanus-Linthorst and colleagues (43) report 100% sensitivity (three of three tumors), 94% specificity (100 of 106 women), and no false-negative findings with a follow-up of 1 year (ie, NPV = 100%, PPV = 33%). In 105 asymptomatic women, Kuhl and colleagues (12) report 100% sensitivity (nine of nine tumors), 95% specificity (91 of 96 women), and no false-negative findings with follow-up of at least 1 year (ie, NPV = 100%, PPV = 64%). However, this relatively large PPV is difficult to compare with findings in other studies because the false-positive findings were taken from a smaller population (n = 105) than were the true-positive findings (n = 192) to guarantee at least 1 year follow-up of benign findings. In addition, the prevalence of malignancy (approximately 5% [nine of 192 women]) was found to be somewhat higher than that reported in other studies and that observed in our hospital (approximately 2%).
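The predictive values quoted for the two screening studies follow from the reported counts. The short sketch below recomputes them, taking the number of false-positive findings as the difference between the women without cancer and those correctly classified (106 - 100 and 96 - 91); that reading of the counts is an assumption of this example.

```python
def ppv_from_counts(true_pos, false_pos):
    """Fraction of positive calls that are truly malignant."""
    return true_pos / (true_pos + false_pos)

def specificity_from_counts(true_neg, false_pos):
    """Fraction of women without cancer who are correctly classified."""
    return true_neg / (true_neg + false_pos)

# Tilanus-Linthorst and colleagues (43): 3 tumors detected, 100 of 106 negatives correct.
ppv_43 = ppv_from_counts(true_pos=3, false_pos=106 - 100)
spec_43 = specificity_from_counts(true_neg=100, false_pos=106 - 100)

# Kuhl and colleagues (12): 9 tumors detected, 91 of 96 negatives correct.
ppv_12 = ppv_from_counts(true_pos=9, false_pos=96 - 91)
spec_12 = specificity_from_counts(true_neg=91, false_pos=96 - 91)
```

Both PPVs round to the percentages quoted in the text (33% and 64%), and the specificities round to 94% and 95%.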

Differences in sensitivity and specificity reported in the examination of the asymptomatic population compared with that in the symptomatic population may be attributed to the small number of screening cancers accumulated thus far. In addition, differences in specificity may be caused by differences in the prevalence of normal findings (no enhancement) in the symptomatic and asymptomatic groups because normal findings are typically interpreted as true-negative findings until proven otherwise at follow-up. Other differences may be caused by discrepancies in follow-up guidelines for suspicious findings and differences in image interpretation (9,12). Nonetheless, first results in the screening of asymptomatic women at high risk indicate high certainty of excluding malignant disease. At a PPV of much less than 50%, however, fewer biopsies of benign lesions are desired without compromising the NPV.

Features
The margin of malignant lesions was found to be sharper (ie, better defined) than that of benign lesions (P < .01); this observation may seem inconsistent with observations of suspicious masses in mammograms. However, margins depicted in mammograms reflect differences in tissue density, whereas margins depicted in contrast-enhanced MR images reflect differences in uptake of contrast material. The higher speed of uptake and peripheral enhancement (44) of malignant lesions may explain the superior sharpness of their margins compared with that of benign lesions. In the present study, the mean margin sharpness in the first subtraction image appeared to be a better indicator of malignancy than was the mean margin sharpness maximized across the subtraction images of all time frames. Apparently, the difference in definition between benign and malignant margins decreases over time.

The temporal features were computed in a region of interest that was limited to the 80% of voxels in the segmented lesion with the largest feature values. Smaller percentage values had no obvious effect on the performance of the system. The 80% value was chosen to reduce possible influence of the partial volume effect while maintaining a sufficient number of voxels in smaller lesions.
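The selection of the region of interest can be sketched as follows. This is a minimal Python illustration under stated assumptions: the per-voxel feature values are given as a flat list, the function names are hypothetical, and averaging over the retained voxels is our assumption about how a single feature value might be obtained.

```python
import math


def select_roi(values, fraction=0.8):
    """Keep the given fraction of voxels with the largest feature values."""
    if not values:
        raise ValueError("lesion contains no voxels")
    # Always keep at least one voxel so that very small lesions stay usable.
    keep = max(1, math.ceil(fraction * len(values)))
    return sorted(values, reverse=True)[:keep]


def roi_mean(values, fraction=0.8):
    """Average the feature over the retained voxels (assumed aggregation)."""
    roi = select_roi(values, fraction)
    return sum(roi) / len(roi)


# Toy feature values for a 10-voxel lesion (not data from the study).
voxels = [0.1, 0.9, 0.8, 0.2, 0.7, 0.6, 0.5, 0.4, 0.3, 1.0]
roi = select_roi(voxels)
feature = roi_mean(voxels)
```

Discarding the weakest 20% of voxels removes those most likely to be diluted by surrounding tissue, which is the partial volume argument made above.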

The features selected in a pilot study that was performed with a much smaller independent database (29) are in agreement with a subset of the features selected in the current study. In the pilot study, smoothness of uptake and variation in margin sharpness resulted in an area under the ROC curve of 0.96. In the current study, temporal features were added, a larger independent set of lesions from a different imager and clinic were used, and semiautomated segmentation was used rather than manual segmentation. Consequently, two additional features were selected, which resulted in an area under the ROC curve of 0.95. These observations are an encouraging indication of the generalizing behavior of the computerized analysis.

Performance and Limitations
To achieve highly reproducible results, semiautomated segmentation of lesions has been combined with computerized rating and classification of features. The results indicate a clinically acceptable PPV (approximately 50%) at an NPV of at least 99% in the total screening population. Moreover, variations in results obtained by two independent operators of the system were found to be negligibly small.

The lesion analysis system is intended to be a tool that will provide radiologists with objective and consistent guidelines to attain preselected PPVs and NPVs derived from statistics in past cases. Detection of suspicious lesions is accomplished exclusively by radiologists, although the interactive detection function of the system is currently in clinical use as an add-on feature to the existing image reading system. The efficacy of contrast-enhanced MR imaging in the detection of lesions in a screening population lies outside the scope of our current study, but it is one of the main focus points of multiinstitutional trials on MR screening. The decision to focus on computerized characterization before computerized detection follows from the known high sensitivity of contrast-enhanced MR imaging for invasive lesions. Because of the limited sensitivity of contrast-enhanced MR imaging to depict ductal carcinoma in situ (45,46), it is especially important that any enhancement related to ductal carcinoma in situ is interpreted accurately and consistently.

Cross validation of the system resulted in correct labeling of all cases of ductal carcinoma in situ in our database as malignant disease, although there were only a limited number of such cases. Because automated rating and classification are currently tested independently from the determinations of the radiologists in our clinic, the interaction between the radiologists and the system has not yet been examined. We will investigate this topic in the future.

The current study was limited because most benign lesions were found at screening of asymptomatic women at increased lifetime risk, whereas the majority of malignant lesions were found in symptomatic patients. In this preliminary study, we combined the two populations to allow initial training of the system. Because of the low prevalence of malignancy in the screening population, approximately 2,000 asymptomatic women would need to be screened to allow detection of 40 malignant lesions. Provision of comparably sized sets of benign and malignant lesions is crucial to train the system to recognize the typical differences in characteristics. We emphasize that the combining of populations requires careful analysis and interpretation of data, while taking the limitations into account.
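The figure of approximately 2,000 women follows from the prevalence of roughly 2% mentioned earlier in the Discussion. A minimal Python sketch of the arithmetic, assuming that every malignant lesion present is detected and that each affected woman contributes one lesion (both simplifications are ours):

```python
import math


def women_to_screen(target_lesions, prevalence):
    """Expected number of women to screen to find the target number of lesions."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    return math.ceil(target_lesions / prevalence)


# 40 malignant lesions at an assumed prevalence of 2%.
needed = women_to_screen(40, 0.02)
print(needed)
```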

First, the distribution of sizes of benign and malignant lesions in the asymptomatic screening population is currently unknown and may be different from that in a symptomatic population. For example, the differences in size between the benign and malignant lesions in our database may be due in part to the fact that most malignant lesions were symptomatic (and were therefore more likely to be detected at a larger size), but they may also be a result of the inherently large difference in growth rates between benign and malignant lesions, as would also be true in an exclusively screening population. Given this uncertainty, size was not used as a feature in the current study, nor were any features used that were significantly correlated with size.

Second, differences may exist in MR image characteristics between sporadic cancers and those associated with a genetic predisposition. Differences in histologic phenotype, such as pushing margins, have been reported in cancers associated with BRCA1/2 (47), but the statistical significance of these observations on contrast-enhanced MR screening images is yet to be determined in larger sets of malignant tumors from both sporadic and high-risk populations. It is possible that pushing margins will improve the depiction of margin sharpness and decrease the variation in sharpness along the margin. If so, identification of these cancers by means of the analysis system will be facilitated, but such an effect can be assessed only in actual prospective application. We could not find an indication for increased failure of the system to characterize the 11 malignant lesions in our database that were associated with genetic predisposition or familial involvement.

Third, with a combined symptomatic and asymptomatic population, the PPV and NPV for an exclusive screening population cannot be determined in a straightforward manner. By transforming the ROC curve to a predictive curve based on a priori estimates of prevalence of malignancy in a screening population, the PPV and NPV in a screening population could be estimated from our training database. Although the actual prevalence may vary slightly in prospective application, the effects of such variation are not expected to result in large differences in performance of the system. Prospective estimates of PPV remained greater than 50% for a wide range of prevalence values.
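The transformation from an ROC operating point to predictive values can be sketched with Bayes' rule. The Python code below is an illustration only: the function names and the example operating point are hypothetical, and this section does not specify the exact procedure used in the study.

```python
def predictive_values(tpf, fpf, prevalence):
    """Convert one ROC operating point (TPF, FPF) into (PPV, NPV)."""
    true_pos = tpf * prevalence
    false_pos = fpf * (1 - prevalence)
    true_neg = (1 - fpf) * (1 - prevalence)
    false_neg = (1 - tpf) * prevalence
    # Guard against empty categories at the extreme ends of the curve.
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg > 0 else float("nan")
    return ppv, npv


def predictive_curve(roc_points, prevalence):
    """Map a list of (FPF, TPF) pairs to a list of (PPV, NPV) pairs."""
    return [predictive_values(tpf, fpf, prevalence) for fpf, tpf in roc_points]


# Hypothetical operating point, evaluated over a range of assumed prevalences.
for prevalence in (0.01, 0.02, 0.05):
    ppv, npv = predictive_values(tpf=0.80, fpf=0.01, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Because the prevalence enters only as a weighting of the two ROC axes, the same ROC curve can be re-expressed for any assumed screening prevalence without retraining the classifier.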

To estimate the prospective performance of the system on the basis of the training population, a common test (cross validation) was used, and results were corroborated with conservative splitting validations. In addition, a simple classification method (linear-discriminant analysis) was used to minimize the risk of obtaining results that were specific for only the training population in our study. Nonetheless, prospective testing of the system on a larger set of multiinstitutional screening images remains necessary to validate its clinical performance and to fine-tune selection of the operating point. Currently, all lesions that are visible on MR images obtained in our clinic are added to the lesion analysis system.
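The combination of leave-one-out cross validation with linear-discriminant analysis can be sketched as follows. This is a minimal Python illustration with two toy features and invented values; the study used more features, and the function names, the data, and the equal-prior midpoint threshold are all assumptions of this sketch.

```python
def mean_vec(rows):
    """Per-feature mean of a list of two-feature samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def fit_lda(benign, malignant):
    """Fit a two-class linear discriminant from the pooled within-class scatter."""
    mb, mm = mean_vec(benign), mean_vec(malignant)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, mu in ((benign, mb), (malignant, mm)):
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [mm[0] - mb[0], mm[1] - mb[1]]
    # w = S^-1 (mean_malignant - mean_benign); the scale of S does not
    # change the sign of the score, so no normalization is applied.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (s[0][0] * diff[1] - s[1][0] * diff[0]) / det]
    mid = [(mb[0] + mm[0]) / 2, (mb[1] + mm[1]) / 2]
    return w, -(w[0] * mid[0] + w[1] * mid[1])


def score(model, x):
    """Positive scores lean malignant, negative scores lean benign."""
    w, bias = model
    return w[0] * x[0] + w[1] * x[1] + bias


def leave_one_out_accuracy(benign, malignant):
    """Train on all lesions but one, test on the one left out, and repeat."""
    data = [(x, 0) for x in benign] + [(x, 1) for x in malignant]
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        model = fit_lda([r for r, y in train if y == 0],
                        [r for r, y in train if y == 1])
        correct += int((score(model, x) > 0) == bool(label))
    return correct / len(data)


# Toy feature pairs (not data from the study).
benign = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2), (1.8, 2.0)]
malignant = [(4.0, 5.0), (4.5, 5.5), (5.0, 4.8), (4.2, 5.2), (4.8, 5.1)]
loo_accuracy = leave_one_out_accuracy(benign, malignant)
```

Each lesion is classified by a model that never saw it during training, which is what makes the resulting estimate a proxy for prospective performance; the linear decision boundary keeps the number of fitted parameters small.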

Another topic of future investigation concerns the effect of the imaging technique. In the current study, a standard fast low-angle shot three-dimensional acquisition technique was chosen that balances temporal and spatial resolution at a voxel size of approximately 0.003 cm3. The size of the training population did not allow breakdown of the performance into multiple categories of lesion volume without the loss of statistical power. It is likely, however, that differentiation between very small benign and malignant lesions is more challenging than that between larger lesions. One way to compensate for this effect is to allow the analysis system to automatically adjust the operating point for small lesions to a different (but known a priori) trade-off between NPV and PPV. Automatic switching between operating points is feasible because the volume of the lesion is known after the segmentation process is complete.
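Automatic switching between operating points could look like the following Python sketch. Only the voxel volume is taken from the text; the size cutoff, the two thresholds, the direction of the adjustment, and all names are hypothetical values chosen for illustration.

```python
VOXEL_VOLUME_CM3 = 0.003  # approximate voxel size quoted in the text

# Hypothetical values, for illustration only.
SMALL_LESION_CM3 = 0.5    # cutoff below which a lesion counts as small
THRESHOLD_DEFAULT = 0.30  # operating point for larger lesions
THRESHOLD_SMALL = 0.20    # more cautious operating point for small lesions


def lesion_volume_cm3(n_voxels):
    """Volume of a segmented lesion from its voxel count."""
    return n_voxels * VOXEL_VOLUME_CM3


def select_threshold(n_voxels):
    """Choose the classifier threshold from the segmented lesion volume."""
    if lesion_volume_cm3(n_voxels) < SMALL_LESION_CM3:
        return THRESHOLD_SMALL
    return THRESHOLD_DEFAULT


def classify(classifier_score, n_voxels):
    """Label a lesion using the operating point that matches its size."""
    if classifier_score >= select_threshold(n_voxels):
        return "suspicious"
    return "probably benign"


# The same classifier score leads to different labels for different sizes.
print(classify(0.25, n_voxels=100))   # volume of about 0.3 cm3
print(classify(0.25, n_voxels=1000))  # volume of about 3 cm3
```

Each threshold would correspond to a trade-off between NPV and PPV that is known a priori, so the switch changes only which pre-established operating point is applied.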

In conclusion, the lesion analysis system has been tested successfully in a clinical environment. The combination of computer-rated washout, smoothness of uptake, mean margin sharpness, and variation of margin sharpness contributes significantly to the classification of lesions as benign or malignant. In the current study, first estimates of prospective performance indicate that the system is capable of excluding malignant disease with high confidence and reproducibility, at a clinically acceptable PPV in a screening setting. The ability to reproducibly select operating points on the basis of prior estimates of NPV and PPV may contribute substantially to the success of contrast-enhanced MR screening programs for asymptomatic women at increased lifetime risk, but prospective validation in an exclusive screening population remains necessary.


    ACKNOWLEDGMENTS
 
The authors thank Guus Hart, MSc, for proofing the statistical techniques; Angelique Schlief for transferring the screening images and extracting prevalence data; Maryellen L. Giger, PhD, for discussing experimental computer-aided diagnosis design; and Harry Bartelink, MD, PhD, for proofreading the manuscript.


    FOOTNOTES
 
Abbreviations: FPF = false-positive fraction, NPV = negative predictive value, PPV = positive predictive value, ROC = receiver operating characteristic, TPF = true-positive fraction

Author contributions: Guarantors of integrity of entire study, K.G.A.G., E.E.D., L.J.S.K.; study concepts, K.G.A.G., S.H.M., E.E.D.; study design, K.G.A.G., E.E.D.; literature research, K.G.A.G., E.E.D.; clinical studies, E.E.D., S.H.M.; experimental studies, K.G.A.G., S.H.M.; data acquisition, E.E.D., S.H.M.; data analysis/interpretation, K.G.A.G., J.L.P., L.J.S.K.; statistical analysis, K.G.A.G.; manuscript preparation, K.G.A.G., E.E.D., S.H.M.; manuscript definition of intellectual content, K.G.A.G., J.L.P., L.J.S.K., E.E.D.; manuscript editing and revision/review, K.G.A.G., S.H.M., E.E.D.; manuscript final version approval, all authors.


    REFERENCES

  1. Claus EB, Risch N, Thompson WD. Genetic analysis of breast cancer in the cancer and steroid hormone study. Am J Hum Genet 1991; 48:232-242.
  2. Easton DF, Bishop DT, Ford D, Crockford GP. Genetic linkage analysis in familial breast and ovarian cancer: results from 214 families. The Breast Cancer Linkage Consortium. Am J Hum Genet 1993; 52:678-701.
  3. Stomper PC, D’Souza DJ, DiNitto PA, Arredondo MA. Analysis of parenchymal density on mammograms in 1353 women 25–79 years old. AJR Am J Roentgenol 1996; 167:1261-1265.
  4. Lehman CD, White E, Peacock S, Drucker MJ, Urban N. Effect of age and breast density on screening mammograms with false-positive findings. AJR Am J Roentgenol 1999; 173:1651-1655.
  5. Zonderland HM, Coerkamp EG, Hermans J, van de Vijver MJ, van Voorthuisen AE. Diagnosis of breast cancer: contribution of US as an adjunct to mammography. Radiology 1999; 213:413-422.
  6. Kelcz F, Santyr G. Gadolinium-enhanced breast MRI. Crit Rev Diagn Imaging 1995; 36:287-338.
  7. Obdeijn IM, Kuijpers TJ, van Dijk P, Wiggers T, Oudkerk M. MR lesion detection in a breast cancer population. J Magn Reson Imaging 1996; 6:849-854.
  8. Bone B, Pentek Z, Perbeck L, Veress B. Diagnostic accuracy of mammography and contrast-enhanced MR imaging in 238 histologically verified breast lesions. Acta Radiol 1997; 38:489-496.
  9. Heywang-Kobrunner SH, Viehweg P, Heinig A, Kuchler C. Contrast-enhanced MRI of the breast: accuracy, value, controversies, solutions. Eur J Radiol 1997; 24:94-108.
  10. Tilanus-Linthorst MM, Bartels CC, Obdeijn AI, Oudkerk M. Earlier detection of breast cancer by surveillance of women at familial risk. Eur J Cancer 2000; 36:514-519.
  11. Brown J, Smith RC, Lee CH. Incidental enhancing lesions found on MR imaging of the breast. AJR Am J Roentgenol 2001; 176:1249-1254.
  12. Kuhl CK, Schmutzler RK, Leutner CC, et al. Breast MR imaging screening in 192 women proved or suspected to be carriers of a breast cancer susceptibility gene: preliminary results. Radiology 2000; 215:267-279.
  13. Viehweg P, Paprosch I, Strassinopoulou M, Heywang-Kobrunner SH. Contrast-enhanced magnetic resonance imaging of the breast: interpretation guidelines. Top Magn Reson Imaging 1998; 9:17-43.
  14. Boetes C, Barentsz JO, Mus RD, et al. MR characterization of suspicious breast lesions with a gadolinium-enhanced TurboFLASH subtraction technique. Radiology 1994; 193:777-781.
  15. Mussurakis S, Buckley DL, Drew PJ, et al. Dynamic MR imaging of the breast combined with analysis of contrast agent kinetics in the differentiation of primary breast tumours. Clin Radiol 1997; 52:516-526.
  16. Sherif H, Mahfouz AE, Oellinger H, et al. Peripheral washout sign on contrast-enhanced MR images of the breast. Radiology 1997; 205:209-213.
  17. Nunes LW, Schnall MD, Orel SG, et al. Breast MR imaging: interpretation model. Radiology 1997; 202:833-841.
  18. Nunes LW, Schnall MD, Orel SG. Update of breast MR imaging architectural interpretation model. Radiology 2001; 219:484-494.
  19. Ikeda O, Yamashita Y, Morishita S, et al. Characterization of breast masses by dynamic enhanced MR imaging: a logistic regression analysis. Acta Radiol 1999; 40:585-592.
  20. Kinkel K, Helbich TH, Esserman LJ, et al. Dynamic high-spatial-resolution MR imaging of suspicious breast lesions: diagnostic criteria and interobserver variability. AJR Am J Roentgenol 2000; 175:35-43.
  21. Ikeda DM, Hylton NM, Kinkel K, et al. Development, standardization, and testing of a lexicon for reporting contrast-enhanced breast magnetic resonance imaging studies. J Magn Reson Imaging 2001; 13:889-895.
  22. Kegelmeyer WP, Jr, Pruneda JM, Bourland PD, Hillis A, Riggs MW, Nipper ML. Computer-aided mammographic screening for spiculated lesions. Radiology 1994; 191:331-337.
  23. Veldkamp WJ, Karssemeijer N, Otten JD, Hendriks JH. Automated classification of clustered microcalcifications into malignant and benign types. Med Phys 2000; 27:2600-2608.
  24. Huo Z, Giger ML, Vyborny CJ, et al. Analysis of spiculation in the computerized classification of mammographic masses. Med Phys 1995; 22:1569-1579.
  25. Rangayyan RM, El-Faramawy N, Desautels JEL, Alim OA. Discrimination between benign and malignant breast tumors using a region-based measure of edge profile acutance. In: Proceedings of the 3rd International Workshop on Digital Mammography. Amsterdam, the Netherlands: Elsevier Science, 1996; 213-218.
  26. Chan HP, Sahiner B, Helvie MA, et al. Improvement of radiologists’ characterization of mammographic masses by using computer-aided diagnosis: an ROC study. Radiology 1999; 212:817-827.
  27. Jiang Y, Nishikawa RM, Schmidt RA, Metz CE, Giger ML, Doi K. Improving breast cancer diagnosis with computer-aided diagnosis. Acad Radiol 1999; 6:22-33.
  28. Sinha S, Lucas-Quesada FA, DeBruhl ND, et al. Multifeature analysis of Gd-enhanced MR images of breast lesions. J Magn Reson Imaging 1997; 7:1016-1026.
  29. Gilhuijs KG, Giger ML, Bick U. Computerized analysis of breast lesions in three dimensions using dynamic magnetic-resonance imaging. Med Phys 1998; 25:1647-1654.
  30. Penn AI, Bolinger L, Schnall MD, Loew MH. Discrimination of MR images of breast masses with fractal-interpolation function models. Acad Radiol 1999; 6:156-163.
  31. Friedrich M. MRI of the breast: state of the art. Eur Radiol 1998; 8:707-725.
  32. Kuhl CK. MRI of breast tumors. Eur Radiol 2000; 10:46-58.
  33. Gilhuijs KG, Giger ML, Bick U. A method for computerized assessment of tumor extent in contrast-enhanced MR images of the breast. In: Doi K, MacMahon H, Giger ML, Hoffmann KR, eds. Computer-aided diagnosis in medical imaging. Amsterdam, the Netherlands: Elsevier Science, 1999; 305-310.
  34. Johnson RA, Wichern DW. Discrimination and classification. In: Conmy SR, eds. Applied multivariate statistical analysis. 3rd ed. Upper Saddle River, NJ: Prentice-Hall, 1992; 494-502.
  35. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986; 21:720-733.
  36. Metz CE, Wang P, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Deconink F, eds. Information processing in medical imaging. The Hague, the Netherlands: Nijhoff, 1984; 432-445.
  37. Shen J, Herman B, Wang P, Kronman HB, Dorfman DD, Metz CE. LABROC1 program for the IBM PC. Available at: http://www-radiology.uchicago.edu/cgi-bin/software.cgi. Accessed April 24, 2001.
  38. Shen J, Herman B, Kronman HB, Wang P, Dorfman DD, Metz CE. CLABROC program IBM-PC. Version 1.2.1. Available at: http://www-radiology.uchicago.edu/cgi-bin/software.cgi. Accessed April 24, 2001.
  39. Heywang-Kobrunner SH, Bick U, Bradley WG, Jr, et al. International investigation of breast MRI: results of a multicentre study (11 sites) concerning diagnostic parameters for contrast-enhanced MRI based on 519 histopathologically correlated lesions. Eur Radiol 2001; 11:531-546.
  40. Peeters PH, Verbeek AL, Hendriks JH, van Bon MJ. Screening for breast cancer in Nijmegen: report of 6 screening rounds, 1975–1986. Int J Cancer 1989; 43:226-230.
  41. Otten JD, van Dijck JA, Peer PG, et al. Long term breast cancer screening in Nijmegen, the Netherlands: the nine rounds from 1975–92. J Epidemiol Community Health 1996; 50:353-358.
  42. Fracheboud J, de Koning HJ, Beemsterboer PM, et al. Nation-wide breast cancer screening in the Netherlands: results of initial and subsequent screening 1990–1995. National Evaluation Team for Breast Cancer Screening. Int J Cancer 1998; 75:694-698.
  43. Tilanus-Linthorst MM, Obdeijn IM, Bartels KC, de Koning HJ, Oudkerk M. First experiences in screening women at high risk for breast cancer with MR imaging. Breast Cancer Res Treat 2000; 63:53-60.
  44. Mussurakis S, Gibbs P, Horsman A. Peripheral enhancement and spatial contrast uptake heterogeneity of primary breast tumours: quantitative assessment with dynamic MRI. J Comput Assist Tomogr 1998; 22:35-46.
  45. Gilles R, Zafrani B, Guinebretiere JM, et al. Ductal carcinoma in situ: MR imaging-histopathologic correlation. Radiology 1995; 196:415-419.
  46. Westerhof JP, Fischer U, Moritz JD, Oestmann JW. MR imaging of mammographically detected clustered microcalcifications: is there any value? Radiology 1998; 207:675-681.
  47. Armes JE, Egan AJ, Southey MC, et al. The histologic phenotypes of breast carcinoma occurring before age 40 years in women with and without BRCA1 or BRCA2 germline mutations: a population-based study. Cancer 1998; 83:2335-2345.



This article has been cited by other articles:

S. Schrading and C. K. Kuhl. Mammographic, US, and MR Imaging Phenotypes of Familial Breast Cancer. Radiology, January 1, 2008; 246(1): 58-70.

T. C. Williams, W. B. DeMartini, S. C. Partridge, S. Peacock, and C. D. Lehman. Breast MR Imaging: Computer-aided Evaluation Program for Discriminating Benign from Malignant Lesions. Radiology, July 1, 2007; 244(1): 94-103.

E. Yeh, P. Slanetz, D. B. Kopans, E. Rafferty, D. Georgian-Smith, L. Moy, E. Halpern, R. Moore, I. Kuter, and A. Taghian. Prospective Comparison of Mammography, Sonography, and MRI in Patients Undergoing Neoadjuvant Chemotherapy for Palpable Breast Cancer. Am. J. Roentgenol., March 1, 2005; 184(3): 868-877.

E. E. Deurloo, S. H. Muller, J. L. Peterse, A. P. E. Besnard, and K. G. A. Gilhuijs. Clinically and Mammographically Occult Breast Lesions on MR Images: Potential Effect of Computerized Assessment on Clinical Reading. Radiology, March 1, 2005; 234(3): 693-701.
