Letters to the Editor
κ Values as Additional Indicators to Measure Observer Agreement [letter]
Department of Diagnostic Radiology, Yan Chai Hospital, 711 Yan Chai Street, Tsuen Wan, Hong Kong Special Administrative Region, China*
Kelvin K. W. Yau, PhD,
Department of Management Sciences, City University of Hong Kong, Hong Kong Special Administrative Region, China
Bernard P. L. Chan, MRCP
Department of Medicine, National University Hospital, Singapore
e-mail: kfmakhk@netvigator.com
Editor:
We read with much interest the article by Drs Kundel and Polansky in the August 2003 issue of Radiology (1), in which the κ statistic was used to measure observer agreement in various screening and diagnostic imaging studies. As highlighted, in the presence of low disease prevalence, the κ statistic should be interpreted with caution (2), and indexes of positive and negative agreement should also be reported (3). In addition, readers should be aware of the effect that bias between observers has on κ values, where bias effect refers to the difference between observers in their assessment of the frequency of occurrence, and prevalence effect refers to the difference between the probabilities of the "positive" and "negative" categories (4). In general, κ values increase with increased bias, and low prevalence results in a decrease in κ values. In most cases, bias is not a major problem. However, low prevalence results in a substantial reduction in κ values, which can be misleading.
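The two effects can be illustrated with a minimal sketch in Python. It assumes the usual definition of Cohen's κ for a 2 x 2 table and takes the prevalence index as (a - d)/N and the bias index as (b - c)/N; the letter does not spell out these formulas, so the cell labels a, b, c, d and all three example tables below are hypothetical. Each table has the same observed agreement of 0.8.

```python
from fractions import Fraction


def kappa_with_indexes(a, b, c, d):
    """Cohen's kappa plus prevalence and bias indexes for a 2 x 2 table.

    Hypothetical cell labels: a = both observers positive,
    d = both observers negative, b and c = the two kinds of disagreement.
    """
    n = a + b + c + d
    p_o = Fraction(a + d, n)  # observed proportion of agreement
    # Chance-expected agreement from the two observers' marginal totals.
    p_e = Fraction((a + b) * (a + c) + (c + d) * (b + d), n * n)
    kappa = (p_o - p_e) / (1 - p_e)
    prevalence_index = Fraction(a - d, n)  # imbalance of positive vs negative
    bias_index = Fraction(b - c, n)        # disagreement on how often "positive" occurs
    return kappa, prevalence_index, bias_index


# Three hypothetical tables, all with observed agreement 0.8.
tables = {
    "balanced": (40, 10, 10, 40),
    "biased": (40, 20, 0, 40),
    "low prevalence": (10, 10, 10, 70),
}

for name, cells in tables.items():
    kappa, pi, bi = kappa_with_indexes(*cells)
    print(f"{name:15s} kappa={float(kappa):.3f} PI={float(pi):+.2f} BI={float(bi):+.2f}")
```

Under these assumptions κ is 0.600 for the balanced table, rises slightly to 0.615 when bias is introduced, and falls to 0.375 when the positive category becomes rare, although the observers agree on 80% of cases in every table.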
A specific concern with the use of κ values lies in the maximum attainable value and, hence, in the underestimation of the strength of agreement across the ranges of κ values defined by Landis and Koch (5). As an illustration, consider the Table, which is a hypothetical 2 x 2 table.
[Table: hypothetical 2 x 2 table]
The κ value is found to be 0.375, so no matter how good the agreement between the observers, the strength of agreement is always "fair" according to the classification of Landis and Koch (5).
Our recent work in a computed tomographic study (6) reveals a similar problem caused by low disease prevalence when measuring rater agreement. Prevalence-adjusted bias-adjusted κ values could be useful additional indicators for measuring observer agreement (4). This is an adjusted measure that aims to alleviate the effect of bias and prevalence on κ values. In particular, this measure can be shown to relate directly to the observed proportion of agreement, with a maximum attainable value always equal to +1. The adjusted κ value (prevalence-adjusted bias-adjusted κ value) in the previous example now becomes 0.6, which indicates "moderate" agreement according to the Landis and Koch classification (5).
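The arithmetic behind these two figures can be sketched in Python as follows. The cell counts (10, 10, 10, 70) are an assumption: they are one hypothetical table that reproduces the quoted values of 0.375 and 0.6, not necessarily the counts in the Table. The sketch also assumes the common formulation of the prevalence-adjusted bias-adjusted κ as 2 x (observed agreement) - 1, the usual positive and negative agreement indexes, and the Landis and Koch bands as commonly quoted.

```python
from fractions import Fraction


def agreement_measures(a, b, c, d):
    """Agreement measures for a 2 x 2 table of two observers' ratings.

    Hypothetical cell labels: a = both positive, d = both negative,
    b and c = the two kinds of disagreement.
    """
    n = a + b + c + d
    p_o = Fraction(a + d, n)  # observed proportion of agreement
    p_e = Fraction((a + b) * (a + c) + (c + d) * (b + d), n * n)  # chance agreement
    return {
        "kappa": (p_o - p_e) / (1 - p_e),
        "pabak": 2 * p_o - 1,  # equals +1 when the observers always agree
        "positive_agreement": Fraction(2 * a, 2 * a + b + c),
        "negative_agreement": Fraction(2 * d, 2 * d + b + c),
    }


def landis_koch(value):
    """Strength-of-agreement label, using the commonly quoted bands."""
    if value < 0:
        return "poor"
    bands = [
        (Fraction(20, 100), "slight"),
        (Fraction(40, 100), "fair"),
        (Fraction(60, 100), "moderate"),
        (Fraction(80, 100), "substantial"),
    ]
    for upper, label in bands:
        if value <= upper:
            return label
    return "almost perfect"


# Assumed counts, chosen only because they reproduce the quoted values.
result = agreement_measures(10, 10, 10, 70)
for name, value in result.items():
    print(f"{name}: {float(value):.3f}")
print(landis_koch(result["kappa"]), "->", landis_koch(result["pabak"]))
```

Exact fractions are used so that a value sitting on a band boundary, such as 0.6, is classified without floating-point rounding surprises. For these assumed counts, positive agreement is 0.5 and negative agreement is 0.875, which shows where the disagreement is concentrated.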
Our experience has shown that the use of prevalence-adjusted bias-adjusted κ values as additional agreement measures is especially important for screening and diagnostic studies in the community setting, where control for prevalence cannot be achieved easily. This is well exemplified by a previous study (7) on the interobserver reliability of chest radiographs in community-acquired pneumonia. The authors asserted that the κ statistics in their study were lowered artifactually because of the presence of either high-prevalence findings (such as presence of infiltrate, alveolar vs interstitial) or low-prevalence findings (such as adenopathy or postobstructive pneumonia). This penalty imposed by the κ coefficient because of an "unbalanced" challenge population was discussed thoroughly by Cicchetti and Feinstein (3).
We appreciate the authors' efforts in reviewing this very important and useful statistical concept in diagnostic imaging studies.
REFERENCES