Breast Imaging
1 From the Department of Radiology, Addenbrooke's Hospital, Cambridge, England (R.M.L.W.); Cancer Research-UK Genetic Epidemiology Unit, Cambridge, England (D.T., D.F.E.); Study Coordinating Office, Section of Magnetic Resonance, Institute of Cancer Research, Royal Marsden Hospital, Downs Rd, Sutton Surrey SM2 5PT, England (L.J.P., R.H., M.O.L.); Department of Radiology, University of Aberdeen, Aberdeen, Scotland (F.J.G.); Breakthrough Breast Cancer Research Centre, Institute of Cancer Research, Royal Marsden Hospital, London, England (S.R.L.); and Mount Vernon Hospital, Northwood, Middlesex, England (A.R.P.). The complete list of MARIBS group members and their affiliations is listed in Appendix E1 (radiology.rsnajnls.org/cgi/content/full/2393042007/DC1). Received November 25, 2004; revision requested January 27, 2005; revision received May 5; accepted June 3; final version accepted August 1. Supported by a project grant from the United Kingdom Medical Research Council. D.F.E. and D.T. supported by Cancer Research U.K. Genetic Epidemiology Unit. Address correspondence to M.O.L. (e-mail: martin{at}icr.ac.uk).
| ABSTRACT |
|---|
Materials and Methods: All participating patients provided written informed consent. Ethics committee approval was obtained. The results of 1541 contrast material-enhanced breast MR imaging examinations were analyzed; 1441 screening examinations were performed in 638 women aged 24-51 years at high risk for breast cancer, and 100 examinations were performed in 100 women aged 23-81 years. Lesion analysis was performed in 991 breasts, which were divided into design (491 breasts) and testing (500 breasts) sets. The reference standard was histologic analysis of biopsy samples, fine-needle aspiration cytology, or minimal follow-up of 24 months. The scoring system involved the use of five features: morphology (MOR), pattern of enhancement (POE), percentage of maximal focal enhancement (PMFE), maximal signal intensity-time ratio (MITR), and pattern of contrast material washout (POCW). The system was evaluated by means of (a) assessment of interreader agreement, as expressed in κ statistics, for 315 breasts in which both readers analyzed the same lesion, (b) assessment of the diagnostic accuracy of the scored components with receiver operating characteristic curve analysis, and (c) logistic regression analysis to determine which components of the scoring system were critical to the final score. A new simplified scoring system developed with the design set was applied to the testing set.
Results: There was moderate reader agreement regarding overall lesion outcome (ie, malignant, suspicious, or benign) (κ = 0.58) and less agreement regarding the scored components. The area under the receiver operating characteristic curve (AUC) for the overall lesion score, 0.88, was higher than the AUC for any one component. The components MOR, POE, and POCW yielded the best overall result. PMFE and MITR did not contribute to diagnostic utility. Applying a simplified scoring system to the testing set yielded a nonsignificantly (P = .2) higher AUC than did applying the original scoring system (sensitivity, 84%; specificity, 86.0%).
Conclusion: Good diagnostic accuracy can be achieved by using simple qualitative descriptors of lesion enhancement, including POCW. In the context of screening, quantitative enhancement parameters appear to be less useful for lesion characterization.
Supplemental material: radiology.rsnajnls.org/cgi/content/full/2393042007/DC1
© RSNA, 2006
| INTRODUCTION |
|---|
When the study protocol was written in 1996, the scoring system was devised by using the literature of that period as the basis (2). It should be noted that the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) lexicon (3) was developed after our study had commenced (4). The enhancement characteristics of individual breast lesions were scored in five categories, and the aggregated score was used to classify the lesions as malignant, suspicious, or benign. The details of the scoring system are given in Appendix E2 (Fig E2, radiology.rsnajnls.org/cgi/content/full/2393042007/DC1).
The recall rate for MR imaging in the high-risk population has been analyzed previously and was 10% (5). The background for the scoring decisions is described in the published protocol (2). The same features were subsequently used to create the BI-RADS breast MR imaging lexicon (3,4). The system for scoring was based on three morphologic features: lesion shape, uniformity or heterogeneity of the contrast material uptake, and shape of the contrast material washout curve in the dynamic examination. Two numerical features also were included: percentage of maximal focal enhancement (PMFE) and maximal signal intensity-time ratio (MITR); definitions of these features are given in the worksheets in Appendix E2 (radiology.rsnajnls.org/cgi/content/full/2393042007/DC1). Thus, an interpretation based on integrated morphologic and quantitative pharmacokinetic features was made to maximize specificity while preserving clinically acceptable sensitivity. The cutoff points chosen in the scoring system were to be validated, as they are in the present analysis, by using a symptomatic cohort of patients.
An analysis to study reader performance by using the scoring system and the overall conclusions was performed previously (6). The sensitivity and specificity of our scoring system have already been published (6) and are based on analyses of the results of 1441 examinations performed in women at high risk for breast cancer and the results of 100 examinations performed in women treated at symptom clinics. Briefly, the sensitivity of the double reading of mammograms was 91% (95% confidence interval [CI]: 83%, 96%), and the specificity was 81% (95% CI: 79%, 83%). The purpose of our current study was to evaluate prospectively the accuracy of a lesion classification system designed for use in the United Kingdom MARIBS study.
| MATERIALS AND METHODS |
|---|
Study Population
All participating patients provided written informed consent. The study received ethical approval from the North Thames Multicenter research ethics committee and the local research ethical committees of the 22 participating centers.
The data analyzed comprised those from 1541 breast MR imaging examinations performed in 738 women who were examined as part of the MARIBS study between August 6, 1997 (start of the study), and February 25, 2003. Each imaging study was read prospectively by two independent radiologists. A total of 44 radiologists (including R.M.L.W., F.J.G., and A.R.P.) from 22 centers in the United Kingdom participated in the study; a median of 36.5 studies (range, 1-509) were read per radiologist. The reference standard was either the abnormal finding in any case in which cytologic or histologic results were available or a verified negative result of a minimum follow-up of 24 months. The women at high risk for breast cancer (n = 638) were monitored until the end of the study (March 31, 2004) at genetics clinics. Twenty-six breast cancers were detected in this high-risk group during the study period. Of 352 women who underwent one subsequent annual screening with MR imaging and/or mammography (32 women underwent MR imaging only, nine underwent mammography only, 311 underwent both examinations), five had cancer that was detected at screening with contrast material-enhanced MR imaging. Thirty-six women underwent two subsequent annual screenings with MR imaging and/or mammography (three women underwent MR imaging only, 33 underwent both examinations), and no cancers were detected.
Of the women who did not undergo subsequent imaging or who withdrew from the study, 67 had 1-2 years of follow-up and 157 had more than 2 years of follow-up. Most of these women will have been offered annual mammographic examinations outside the context of this study during the follow-up period. Two interval cancers arose in this group: one at 10 months and one at 27 months. No other breast cancers were reported.
The majority of the examinations (n = 1441) were performed in a screening cohort of 638 women aged 24-51 years (mean age, 40.5 years) who were at high genetic risk for breast cancer; this group formed the high-risk cohort. Eligibility for this cohort was based on the following: carrying of a known BRCA1, BRCA2, or p53 mutation; being a first-degree relative of a known carrier of either mutation; or having a family history suggestive of a 50% or higher probability of being a carrier of either mutation (7). Symptomatic women in the high-risk cohort were excluded. The remaining 100 examinations were performed in 100 women recruited at the start of the study (not on the basis of family history) who had an indeterminate lesion at mammography and/or ultrasonography or a possible breast cancer for which a histologic diagnosis would be obtained at excision biopsy, core-needle biopsy, or fine-needle aspiration cytology and were attending a symptomatic breast treatment clinic. These women were aged 23-81 years (mean, 49.2 years) and are herein referred to as the symptomatic cohort. Additional details about these patient cohorts have been published (6).
The pathologists (including S.R.L.) from all 22 centers either participated in the United Kingdom Breast Screening Program or analyzed specimens and participated in the United Kingdom Pathology and Cytology Quality Assurance Program. A pathologist (S.R.L.) reviewed the pathology reports to classify lesions as benign or malignant (see Appendix E3 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]). At the time of this writing, it had been more than 24 months since the last MR imaging examination described herein was performed; thus, there has been adequate time for false-negative findings to emerge. There were 91 histologically proved cancers and 86 biopsy-proved histologically benign lesions.
For 307 lesions analyzed at additional examinations, the methods of validation were as follows: Of the benign lesions, 126 were diagnosed with further imaging; 36, with fine-needle aspiration cytology only; 23, with core-needle biopsy (five with MR imaging guidance); 12, with core-needle biopsy and surgical biopsy; and 19, with surgical biopsy. Of the malignant lesions, 46 were diagnosed with core-needle biopsy and surgery (19 of these lesions were analyzed at fine-needle aspiration cytology also), and 45 were diagnosed with surgery (20 of these lesions were analyzed at fine-needle aspiration cytology also).
MR Imaging Protocols
The MR imaging screening examination (protocol A) consisted of high-spatial-resolution (512 x 256 matrix) T1-weighted sequences performed before and after contrast medium injection, with two three-dimensional coronal acquisitions of lower spatial resolution (256 x 128 matrix) performed before the intravenous bolus injection of 0.2 mmol of gadopentetate dimeglumine (Magnevist; Schering, Berlin, Germany) per kilogram of body weight and four to six acquisitions performed immediately after the injection (Appendix E4 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]) (2). An optional additional fat-suppressed T1-weighted high-spatial-resolution sequence performed approximately 8 minutes after the injection also was allowed. This combination of sequences enabled analysis of the signal intensity-time characteristics of any region in the imaging volume of either breast and morphologic assessment of high-spatial-resolution images.
Patients who were recalled because of an indeterminate MR imaging study that was deemed suspicious (Appendix E2 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]) underwent either a high-temporal-resolution examination with 0.1 mmol/kg gadopentetate dimeglumine (protocol B, Appendix E4 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]), with attention focused on the area of the breast where the abnormality was detected at the initial MR imaging screening examination (by preference, the protocol recommends that this second MR imaging examination be performed in the sagittal plane, but the radiologist may opt to use the coronal plane), or a repeat of the initial protocol A MR imaging screening examination with 0.2 mmol/kg gadopentetate dimeglumine. Either alternative examination was to be performed during a phase of the patient's menstrual cycle different from that during which the initial screening examination was performed. The supervising radiologist chose the diagnostic pathway, but in general, diffuse or multiple abnormalities were addressed by repeating protocol A and focal lesions were addressed by performing the high-temporal-resolution examination, which covered only the portion of the breast specified by the supervising radiologist.
Benign lesions were sampled at biopsy, where possible, rather than managed with surveillance, but the protocol allowed for a repeat protocol A examination at 6 months (8). Full details of the MR imaging protocol are given in Appendix E4 (radiology.rsnajnls.org/cgi/content/full/2393042007/DC1) and have been described previously (2).
Scoring System
Radiologists read the initial MR studies while blinded to other MR imaging, clinical, and mammographic findings. The scoring system used was based on five morphologic and dynamic contrast material uptake characteristics and derived from the literature (9) that was available before the start of the study. Findings were recorded prospectively on standard worksheets (Appendix E2 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]), which were developed to ensure consistency in the choice and analysis of regions of interest. Although a reading radiologist may have analyzed multiple regions of interest in a breast, only the highest scoring lesion was included in the analysis. Since this was a multicenter study, the data are those from examinations performed at two field strengths (1.0 and 1.5 T) on MR imaging units from four different manufacturers (GE Medical Systems, Milwaukee, Wis; Siemens, Erlangen, Germany; Philips Medical Systems, Eindhoven, the Netherlands; Picker, Cleveland, Ohio).
The two elements of the scoring system that were intended to reflect the morphologic characteristics of the lesion were morphology (MOR), that is, whether the lesion was lobulated or well defined, poorly defined, or spiculated, and pattern of enhancement (POE), that is, whether the enhancement was centrifugal, absent, minimal, or homogeneous; heterogeneous; or ring like. The pattern of contrast material washout (POCW) was a qualitative variable that indicated whether the signal intensity showed (a) a monotonic increase with time, (b) an increase followed by a plateau, or (c) an increase followed immediately by a decrease (ie, washout). The two quantitative variables were PMFE and MITR, the calculations for which are given in Figure E1 (radiology.rsnajnls.org/cgi/content/full/2393042007/DC1). The weightings of some variables were chosen on the basis of findings from the literature, and the weightings of other variables were obtained from a center in the United Kingdom and subsequently published (10). The total score for the five weighted variables was used to classify each lesion as malignant, suspicious, or benign.
Statistical Analyses
From a total of 1541 double-read MR imaging examinations performed in 738 women, the maximal number of breasts in which a lesion could have been analyzed was 3082, in 2091 of which a lesion was not analyzed by either reader. When a reader analyzed more than one lesion in a single breast, only the highest scoring lesion was considered in the analysis. For 550 breasts, just one reader analyzed a lesion, and for 126 breasts, both readers analyzed different lesions. Thus, in 315 breasts, both readers chose the same lesion to analyze.
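The breast counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative and not part of the study.

```python
# Breast-level counts as reported in the text (two breasts per examination).
examinations = 1541
total_breasts = 2 * examinations

no_lesion_analyzed = 2091      # neither reader analyzed a lesion
one_reader_only = 550          # exactly one reader analyzed a lesion
both_different_lesions = 126   # both readers analyzed different lesions
both_same_lesion = 315         # both readers analyzed the same lesion

# Breasts in which at least one reader analyzed a lesion
breasts_with_lesion = one_reader_only + both_different_lesions + both_same_lesion
# Breasts in which both readers analyzed a lesion (same or different)
both_readers = both_different_lesions + both_same_lesion

print(total_breasts)        # 3082
print(breasts_with_lesion)  # 991
print(both_readers)         # 441
```

The 991 breasts are those later divided into the design and testing sets, and the 441 breasts are those for which one of two readings had to be selected at random.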
The levels of between-reader agreement across different elements of the scoring system were evaluated by using κ statistics and on the basis of the 315 MR imaging examinations at which the same lesion was judged by both readers to be the highest scoring (11). Weighted κ values were used for outcomes with more than two levels, with the assumption of evenly spaced categories (12): A κ value of less than or equal to 0.20 indicated poor agreement; 0.21-0.40, fair agreement; 0.41-0.60, moderate agreement; 0.61-0.80, substantial agreement; and greater than 0.80, almost perfect agreement.
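As an illustration of the agreement measure described above, the following is a minimal sketch of a weighted κ with linear weights, which is one common way to encode evenly spaced categories. It is not the Stata routine used in the study. The function names and the integer coding of categories (0, 1, 2, ...) are assumptions made for the example.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted Cohen kappa for ordinal codes 0..n_categories-1.

    With n_categories == 2 this reduces to the unweighted kappa.
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    span = n_categories - 1

    def weight(i, j):
        # Agreement weight: 1 on the diagonal, falling linearly with distance
        return 1.0 - abs(i - j) / span

    pairs = Counter(zip(ratings_a, ratings_b))
    marg_a = Counter(ratings_a)
    marg_b = Counter(ratings_b)

    observed = sum(weight(i, j) * c for (i, j), c in pairs.items()) / n
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map a kappa value to the descriptive bands quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

For example, `agreement_label(0.58)` returns `"moderate"`, which matches the description of the overall lesion outcome in the abstract.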
For the analyses aimed at improving the scoring system, it was necessary to divide the data into a design set, which was used to design the new scoring system, and a testing set, with which the new system would be tested. From the complete set of 991 breasts in which a lesion was analyzed by one or both readers, 500 breasts were randomly sampled to create the testing set and the remaining 491 breasts served as the design set. The 114 lesion-containing breasts of the 100 women in the symptomatic cohort were evenly distributed between the two groups. To avoid the nonindependence of two readings of the same lesion, only one reading was used for each breast. When both readers analyzed a lesion in the same breast (441 breasts), the reader whose results were used was selected randomly.
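The sampling described above might be sketched as follows. The text does not say how the even distribution of the symptomatic breasts was achieved; the stratified shuffle below is one assumption that reproduces the stated set sizes. All identifiers are hypothetical.

```python
import random


def split_design_testing(screening_ids, symptomatic_ids, n_testing, seed=0):
    """Randomly split breasts into design and testing sets.

    Assumption: half of the symptomatic breasts go to each set, and the
    rest of the testing set is filled from the screening breasts.
    """
    rng = random.Random(seed)
    screening = list(screening_ids)
    symptomatic = list(symptomatic_ids)
    rng.shuffle(screening)
    rng.shuffle(symptomatic)

    half = len(symptomatic) // 2
    n_screening_testing = n_testing - half
    testing = symptomatic[:half] + screening[:n_screening_testing]
    design = symptomatic[half:] + screening[n_screening_testing:]
    return design, testing


def choose_reading(readings, rng):
    """Pick one reading at random when a breast has readings from both readers."""
    return rng.choice(readings)


# Counts from the text: 877 screening and 114 symptomatic lesion-containing breasts.
screening = [("screening", i) for i in range(877)]
symptomatic = [("symptomatic", i) for i in range(114)]
design, testing = split_design_testing(screening, symptomatic, n_testing=500)
print(len(design), len(testing))  # 491 500
```

Fixing the seed makes the split reproducible, which matters when a design set and a testing set must stay disjoint across repeated analyses.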
Both continuous variables, PMFE and MITR, were separated into three categories for the purposes of the scoring system in the main MARIBS screening study, according to cutoff points chosen by using the best information available before the start of the study. The results suggested that this may not have been the most appropriate way of dividing the data, so the 33.3rd and 66.7th percentiles of the PMFE and MITR distributions in the design set were used to select more informative cutoff points and to investigate whether it would be more suitable to use MR unit-specific cutoff points.
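A percentile-based choice of cutoff points can be sketched with the Python standard library. This is an illustration only: the study used Stata, whose percentile interpolation may differ slightly from `statistics.quantiles`. The function names and the `(unit, value)` record layout are assumptions.

```python
from collections import defaultdict
from statistics import quantiles


def tertile_cutoffs(values):
    """Return the 33.3rd and 66.7th percentile cut points of `values`."""
    lower, upper = quantiles(values, n=3)
    return lower, upper


def cutoffs_by_unit(records):
    """Compute tertile cut points separately for each MR unit.

    `records` is an iterable of (mr_unit_name, value) pairs.
    """
    grouped = defaultdict(list)
    for unit, value in records:
        grouped[unit].append(value)
    return {unit: tertile_cutoffs(vals) for unit, vals in grouped.items()}
```

Comparing the per-unit cut points returned by `cutoffs_by_unit` with the pooled cut points from `tertile_cutoffs` is one simple way to judge whether unit-specific categories are needed.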
With the original scoring system, only the absolute value of the MITR was used: $[(SI_{\mathrm{post}} - SI_{\mathrm{pre}}) \cdot 100]/T$, where $SI_{\mathrm{post}}$ is the maximal signal intensity after the contrast material administration, $SI_{\mathrm{pre}}$ is the average signal intensity before the contrast material administration, and $T$ is the time between the contrast material administration and the time at which the maximal signal intensity after contrast material administration was achieved. As an alternative approach, we also considered whether the normalized MITR, defined as $[(SI_{\mathrm{post}} - SI_{\mathrm{pre}}) \cdot 100]/(T \cdot SI_{\mathrm{pre}})$, was a more useful measure.
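The two definitions can be written as short functions. This is a sketch under stated assumptions: the function and parameter names are illustrative, and the unit of T is not specified in this section, so the example values are arbitrary.

```python
def mitr(si_post, si_pre, t):
    """Absolute MITR: [(SI_post - SI_pre) * 100] / T."""
    if t <= 0:
        raise ValueError("T must be positive")
    return (si_post - si_pre) * 100.0 / t


def normalized_mitr(si_post, si_pre, t):
    """Normalized MITR: [(SI_post - SI_pre) * 100] / (T * SI_pre)."""
    if t <= 0:
        raise ValueError("T must be positive")
    if si_pre <= 0:
        raise ValueError("SI_pre must be positive")
    return (si_post - si_pre) * 100.0 / (t * si_pre)


# Arbitrary illustrative values: the same lesion on two units whose
# signal intensity scales differ by a factor of 2.
print(mitr(500, 200, 60), mitr(1000, 400, 60))                        # 500.0 1000.0
print(normalized_mitr(500, 200, 60), normalized_mitr(1000, 400, 60))  # 2.5 2.5
```

The absolute value doubles when the intensity scale doubles, whereas the normalized value is unchanged, because dividing by the precontrast intensity cancels any multiplicative scale factor.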
Multiple logistic regression analysis with backward stepwise selection was used to determine the optimal combination of variables needed to predict a malignant lesion, and a new simplified scoring system was devised on the basis of these results. The sandwich variance estimator was used to allow for the nonindependence of examinations performed in the same woman (when lesions were analyzed either in both breasts or at more than one annual examination) (13). Statistical significance was set at the 5% level.
The discriminatory abilities of the different scoring system elements, of the new simplified scoring system, and of the existing system were evaluated by comparing the areas under the nonparametric receiver operating characteristic (ROC) curves (14). The asymptotic 95% CIs assumed a normal distribution for the area under the curve, with standard errors derived from the DeLong et al method (14). The performance of the new simplified scoring system was evaluated in the testing set of 500 breasts. The 95% CIs for sensitivity and specificity in the detection of malignancy were exact and based on the binomial distribution. All analyses were performed by using Stata, version 8.2, software (StataCorp, College Station, Tex).
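The nonparametric area under the ROC curve and a standard error of the kind attributed to DeLong et al can be sketched in plain Python as follows. This covers only a single curve, not the paired comparison of two curves, and it is not the Stata implementation used in the study. The function names and example scores are hypothetical.

```python
from math import sqrt


def _psi(x, y):
    """Mann-Whitney kernel: 1 if x > y, 0.5 on ties, 0 otherwise."""
    if x > y:
        return 1.0
    if x == y:
        return 0.5
    return 0.0


def roc_auc_with_ci(malignant_scores, benign_scores, z=1.96):
    """Return (AUC, standard error, asymptotic normal CI) for one score."""
    m, n = len(malignant_scores), len(benign_scores)
    if m < 2 or n < 2:
        raise ValueError("need at least two lesions in each group")
    # Structural components: per-lesion average of the kernel
    v10 = [sum(_psi(x, y) for y in benign_scores) / n for x in malignant_scores]
    v01 = [sum(_psi(x, y) for x in malignant_scores) / m for y in benign_scores]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    se = sqrt(s10 / m + s01 / n)
    # The normal-approximation interval is not truncated to [0, 1]
    return auc, se, (auc - z * se, auc + z * se)


# Hypothetical total scores for four malignant and four benign lesions
auc, se, ci = roc_auc_with_ci([5, 6, 7, 4], [1, 2, 3, 4])
print(auc)  # 0.96875
```

The area equals the proportion of malignant-benign pairs in which the malignant lesion has the higher score, with ties counted as one half.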
For κ statistic agreement analysis, we used only the 315 breasts for which both readers chose the same highest scoring lesion. To redefine the scoring system, we used only one reading for each breast. If only one reader analyzed a lesion in a particular breast, then that reading was used. If both readers analyzed lesions (regardless of whether they both chose the same highest scoring lesion), the lesion used in the analysis was randomly chosen from the two highest scoring lesions.
| RESULTS |
|---|
The κ statistic for interreader agreement regarding the overall lesion outcome (ie, malignant, suspicious, or benign) was a moderate 0.56 (95% CI: 0.47, 0.65), or 0.58 (weighted) if the three outcome categories were considered separately. The levels of agreement between readers 1 and 2 regarding the morphologic parameters (κ = 0.51 for MOR, κ = 0.49 for POE) and the POCW (κ = 0.53) were similar; agreement was slightly higher regarding the categorized MITR with use of the three original categories (κ = 0.65). Reader agreement regarding the categorized PMFE with use of the three original categories was very poor (κ = 0.12), but the choice of cutoff points led to all but 15 of the 630 readings scoring four points for this element of the scoring system.
Figure 1a shows the PMFE values calculated for the 491 breasts in the design set according to MR unit. By chance, none of the three examinations performed with the Picker machine was included in the design set. The distribution of values appeared to be broadly similar among the three MR units; therefore, it seemed reasonable to use the same categories for all machines. The values in the overall 33.3rd and 66.7th percentiles were 158% and 263%, respectively; thus, after rounding, the PMFE values for the three proposed new scoring categories were less than 160%, greater than or equal to 160% and less than 260%, and greater than or equal to 260%. These are very different from the original values: less than 40%, 40%-60%, and greater than 60%, respectively. Changing the categories in this way substantially improved interreader agreement (weighted κ = 0.49).
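The effect of moving the PMFE cutoff points can be illustrated with a small categorization helper. The category indices 0, 1, and 2 are placeholders and are not the points awarded by the scoring system. Treating a value of exactly 60% as the top original category is an assumption, because the text does not define that boundary.

```python
ORIGINAL_PMFE_CUTOFFS = (40.0, 60.0)    # percent, from the original scoring system
PROPOSED_PMFE_CUTOFFS = (160.0, 260.0)  # percent, rounded design-set tertiles


def pmfe_category(pmfe_percent, cutoffs):
    """Return 0, 1, or 2 for the low, middle, or high PMFE category.

    Lower bounds are inclusive: a value equal to a cutoff falls in the
    higher category (an assumption for the original cutoff points).
    """
    low, high = cutoffs
    if pmfe_percent < low:
        return 0
    if pmfe_percent < high:
        return 1
    return 2


# The design-set tertile values quoted in the text, 158% and 263%
for value in (158.0, 263.0):
    print(value,
          pmfe_category(value, ORIGINAL_PMFE_CUTOFFS),
          pmfe_category(value, PROPOSED_PMFE_CUTOFFS))
# 158.0 2 0
# 263.0 2 2
```

With the original cutoff points, both tertile values fall in the highest category, which is consistent with the report that nearly all readings received the same score for this element.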
... κ statistic for agreement regarding this variable was reduced (weighted κ = 0.56).
... κ statistic of 0.49.
In the design set, the median size for the 478 lesions of known size was 7 mm in diameter (range, 1-63 mm; interquartile range, 5-12 mm). With small lesions defined as those 10 mm or less in diameter, there were 346 small lesions, three of which were malignant, and 132 larger lesions, 39 of which were malignant.
Test Performance
The diagnostic accuracies of each element of the scoring system are illustrated by the ROC curves in Figure 2, which are based on the design set. The total score was more useful for distinguishing between malignant and nonmalignant lesions than any individual component of the scoring system. However, the area under the ROC curve for the total score was not significantly greater than that for the POE curve (P = .1); this result suggests that the POE was almost as useful a predictor of malignancy as the total score. The categorical PMFE and MITR values and the total score illustrated in Figure 2 are based on the new category cutoff points. Using the original category cutoff points did not cause a significant reduction in the area under the curve for the total score (area for original total score, 0.88; 95% CI: 0.83, 0.94; P = .9 for difference in area with use of old vs new category cutoff points) or the MITR (area for original value, 0.56; 95% CI: 0.49, 0.63; P = .1). However, use of the new definition caused a significant increase in the area under the curve for the PMFE (area for original value, 0.53; 95% CI: 0.52, 0.54; P = .03). The normalized MITR was a significantly better predictor than the absolute MITR (P = .005).
The new scoring system was tested on the 500 breasts in the testing set, and the area under the ROC curve for this set was very close to that for the design set. The new scoring system performed slightly better than the original scoring system in the testing set (Fig 3), although not significantly so (P = .2). However, the small gain in diagnostic ability was accompanied by a substantial reduction in computational complexity, since the new scoring system did not require the calculation of PMFE or MITR values. When the lesions that were assigned total scores of 5 or higher were described as malignant or suspicious, the sensitivity in the testing set was 84% (95% CI: 69.9%, 93.4%) and the specificity was 86.0% (95% CI: 82.4%, 89.0%) (Table 2). With use of the 315 lesions that were analyzed by both readers, the κ statistic for this definition of malignant or suspicious lesion was moderate (0.60; 95% CI: 0.50, 0.69).
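Exact binomial confidence intervals of the kind quoted for sensitivity and specificity are commonly computed with the Clopper-Pearson method. The sketch below uses that method, solved by bisection with the standard library only, on the assumption that it corresponds to the exact intervals described in the methods. The counts in the example are hypothetical and are not the study data.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(successes, n, alpha=0.05):
    """Clopper-Pearson (exact) two-sided confidence interval for a proportion."""
    if not 0 <= successes <= n or n == 0:
        raise ValueError("need 0 <= successes <= n and n > 0")

    def bisect(below_root):
        # `below_root(p)` is True while p lies below the bound being sought
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if below_root(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    if successes == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= successes | p) = alpha / 2, increasing in p
        lower = bisect(lambda p: 1.0 - binom_cdf(successes - 1, n, p) < alpha / 2)
    if successes == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= successes | p) = alpha / 2, decreasing in p
        upper = bisect(lambda p: binom_cdf(successes, n, p) > alpha / 2)
    return lower, upper


# Hypothetical example: 42 true-positive findings among 50 malignant lesions
lower, upper = exact_binomial_ci(42, 50)
print(f"sensitivity 84% (95% CI: {lower:.1%}, {upper:.1%})")
```

Exact intervals are asymmetric around the point estimate and never extend beyond 0% or 100%, which is why they are often preferred for proportions near the ends of the range.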
| DISCUSSION |
|---|
An important consideration in evaluating the suitability of breast MR imaging for screening is the repeatability of the observations that lead to the diagnosis. In the current study, almost all levels of interreader agreement regarding the individual elements of the scoring system could be described as moderate (10), and even the new scoring system yielded a κ statistic of 0.60. It must be recognized that diagnoses made with breast MR imaging are sufficiently imprecise such that the allocation of diagnostic categories involves the use of terms with which only moderate agreement can be achieved, and this fact applies to both descriptive and calculated numerical observations and detailed and aggregated features. There is clearly a subjective element to the described MR imaging screening technique despite attempts to minimize it. The double reading of all breast MR imaging studies was the practice throughout the MARIBS study.
The analyses related to lesion size are interesting. We found the POE to be a more useful discriminator of lesions smaller than 10 mm in diameter. It is interesting that the frequency of small lesions that were analyzed but found to be cancers was low. This finding is not reflected in a paucity of small cancers detected with MR imaging in our main study.
By chance, our case material included no histologic lesions with atypia. Histologic atypias have been a source of false-positive findings (in other studies) that was not represented in our present analyses (15). With the new simplified scoring system, integer scores were assigned for spiculated MOR, heterogeneous POE, ringlike POE, and POCW types 2 and 3, with a summed score higher than 5 indicating a malignant or suspicious lesion.
With the new simplified scoring system, the sensitivities in both the design set and the testing set were within the range of the previously published single-reader sensitivities in the main MARIBS study (80% for reader 1, 89% for reader 2), but the specificities in both sets were lower than the 88% specificity previously reported for both reader 1 and reader 2. However, this was more a consequence of the different data sets than a consequence of the behavior of the scoring system itself, as can be seen from the comparisons between the new simplified and original scoring systems. Because the design and testing sets described herein included only breasts with lesions that radiologists had decided to analyze, the study was necessarily enriched for interesting observations, as opposed to the main study, which included also all the breasts that did not have analyzed lesions.
At the design stage of the study protocol, there was debate within the study advisory group (members listed in Appendix E1 [radiology.rsnajnls.org/cgi/content/full/2393042007/DC1]) and in the literature regarding the precise method of calculating the parameter that included both the peak signal intensity and the time to achieve it. The MITR was chosen because it was the preferred measure reported in the literature at the time. The normalized MITR takes into account the signal intensity before the contrast material injection and thus includes adjustments for MR unit-related factors, and the use of this parameter eliminates the problem of a lack of uniform signal intensity related to the site within the magnet. We found that the normalized MITR is a more informative measure than the absolute MITR and has the advantage of having the same distribution on all MR unit types. Nevertheless, it does not appear to be an important independent predictor of malignancy. Interestingly, the variables that required the most computational effort on the part of radiologists (PMFE, MITR, and normalized MITR) apparently were also the least diagnostically useful. The proposed new scoring system should therefore be quicker, easier to perform, and less prone to calculation errors while maintaining sensitivity in lesion detection and characterization.
The described analyses were performed after the full publication of the American College of Radiology BI-RADS breast MR imaging lexicon (3), which is designed to be a common format for reporting breast MR imaging data. The introductory statement in the published document recognizes the existing diversity of interpretations and the need for a more uniform approach. A comparison of the conclusions of our scoring system analysis with the data cited in the first (2003) edition of the published BI-RADS lexicon revealed that our study and the extensive work on the design and testing of the MR lexicon in BI-RADS yield similar conclusions. Our scoring system included a very simple descriptive categorization that was intended to yield a simple integer score. Since our radiologists were extensively involved in mammography, it is likely that the morphologic features familiar to them from mammography guided the use of the simple MOR score requested on our forms. POE and washout characteristics are an essential part of the BI-RADS lexicon report. Our study results show that these are important breast lesion discriminators. On the other hand, numerical scores and calculations of the contrast dynamics are discarded in the BI-RADS lexicon, and the associated text recognizes the complexity of the literature and the local importance of these features in some medical centers. Our results therefore independently support the approach used by the authors of the BI-RADS lexicon.
Our study had limitations. Our initial plan was to use the 100 images from the symptomatic cohort to design an improved scoring system, which would then be tested in the early cases in the screening study. This proved to be impractical, however, largely because the age distribution of the recruited symptomatic cohort was not representative of the age distribution of the screening cohort. There also may have been cancers with different morphologic characteristics. The revised scoring system could not be based purely on data from the screening cohort because the small number of malignant cases yielded little information about the sensitivity of the system. Thus, the material that we used was a combination of lesions from screening cases (n = 877) and lesions from symptomatic cases (n = 114).
While it was often the case that a radiologist identified and scored multiple lesions in the same breast, the analysis described herein was based on only the highest scoring lesions. This was reasonable for the main analysis, since in practice the maximal score determines whether a woman will be recalled for further examination. However, in the present study, this meant that the κ statistics for interreader agreement could be based on only those breasts in which both readers identified the same lesion as the highest scoring or the only lesion in the given breast. Thus, it is possible that the reported κ values were biased in favor of the more unambiguously high-scoring lesions.
In conclusion, the scoring system designed at the outset of the MARIBS study was validated and effective. However, it could be improved in minor ways, namely, by changing the weightings and the cutoff points. The quantitative parameters PMFE and MITR were shown to be less useful compared with three qualitative descriptors: MOR, POE, and POCW. The kinetic calculation shown to be the most discriminative was the normalized MITR, which can contribute to an effective lesion-scoring system. Results of the current analyses therefore suggest that a simple integer-based calculation of lesion morphologic features, heterogeneity, and kinetic curve shapes can be used effectively in the context of breast MR imaging screening.
| FOOTNOTES |
|---|
Abbreviations: BI-RADS = Breast Imaging Reporting and Data System; CI = confidence interval; MARIBS = Magnetic Resonance Imaging in Breast Screening; MITR = maximal signal intensity-time ratio; MOR = morphology; PMFE = percentage of maximal focal enhancement; POCW = pattern of contrast material washout; POE = pattern of enhancement; ROC = receiver operating characteristic
Author contributions: Guarantor of integrity of entire study, M.O.L.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; manuscript final version approval, all authors; literature research, R.M.L.W., L.J.P.; clinical studies, R.M.L.W., F.J.G., A.R.P.; statistical analysis, D.T., D.F.E.; and manuscript editing, R.M.L.W., D.T., L.J.P., F.J.G., A.R.P.
Authors stated no financial relationship to disclose.