Inter-rater Reliability - Limits of Agreement

Limits of Agreement

Another approach to agreement, useful when there are only two raters and the scale is continuous, is to calculate the difference between the two raters' observations for each rated item. The mean of these differences is termed the bias, and the reference interval (mean ± 1.96 × standard deviation of the differences) is termed the limits of agreement. The limits of agreement provide insight into how much random variation may be influencing the ratings. If the raters tend to agree, the differences between their observations will be near zero. If one rater is usually higher or lower than the other by a consistent amount, the bias (mean of differences) will be different from zero. If the raters tend to disagree, but without a consistent pattern of one rating higher than the other, the mean will be near zero while the limits of agreement will be wide. Confidence limits (usually 95%) can be calculated for both the bias and each of the limits of agreement.
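The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `limits_of_agreement`, its parameters, and the example ratings are hypothetical. The standard errors use the large-sample approximations commonly attributed to Bland and Altman (sd/√n for the bias and √(3·sd²/n) for each limit), which is an assumption layered on top of the text; a confidence interval would multiply these by a suitable t quantile.

```python
import math
import statistics


def limits_of_agreement(rater_a, rater_b, z=1.96):
    """Return bias, limits of agreement, and approximate standard errors.

    rater_a and rater_b are paired ratings of the same items
    (hypothetical inputs for illustration).
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same items")
    n = len(rater_a)
    if n < 2:
        raise ValueError("at least two paired observations are required")

    # Difference for each rated item (rater A minus rater B).
    diffs = [a - b for a, b in zip(rater_a, rater_b)]

    bias = statistics.mean(diffs)   # mean of the differences
    sd = statistics.stdev(diffs)    # sample standard deviation (n - 1)

    return {
        "bias": bias,
        "lower": bias - z * sd,
        "upper": bias + z * sd,
        # Assumed large-sample approximations for the standard errors.
        "se_bias": sd / math.sqrt(n),
        "se_limit": math.sqrt(3 * sd ** 2 / n),
    }


# Made-up example ratings, not data from any study.
result = limits_of_agreement([10, 12, 14, 16], [9, 12, 13, 14])
print(result)
```

A bias well away from zero suggests one rater is systematically higher; wide limits with a bias near zero suggest inconsistent disagreement.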

Bland and Altman have expanded on this idea by graphing the difference for each item, together with the mean difference and the limits of agreement, on the vertical axis against the average of the two ratings on the horizontal axis. The resulting Bland–Altman plot demonstrates not only the overall degree of agreement, but also whether the agreement is related to the underlying value of the item. For instance, two raters might agree closely in estimating the size of small items, but disagree about larger items.
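The coordinates of such a plot are straightforward to compute. The sketch below builds the (average, difference) pairs and adds one crude numerical check of whether disagreement grows with the size of the item: the least-squares slope of the absolute difference on the average. That slope is an illustrative heuristic assumed here, not something prescribed by the text, and the function names and example data are hypothetical.

```python
import statistics


def bland_altman_points(rater_a, rater_b):
    """Return (average, difference) pairs, one per rated item."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same items")
    return [((a + b) / 2, a - b) for a, b in zip(rater_a, rater_b)]


def abs_difference_slope(points):
    """Least-squares slope of |difference| regressed on the average.

    A clearly positive slope hints that the raters disagree more on
    larger items. This is a rough heuristic for illustration only.
    """
    xs = [avg for avg, _ in points]
    ys = [abs(diff) for _, diff in points]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("averages are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up ratings in which disagreement grows with item size.
points = bland_altman_points([1, 2, 3, 4], [1, 2, 4, 6])
slope = abs_difference_slope(points)
print(points)
print(slope)
```

To draw the plot itself, the first element of each pair goes on the horizontal axis and the second on the vertical axis, with horizontal lines at the bias and at each limit of agreement.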

When comparing two methods of measurement, it is of interest not only to estimate the bias and limits of agreement between the two methods (inter-rater agreement), but also to assess these characteristics for each method within itself (intra-rater agreement). The agreement between two methods may well be poor simply because one of the methods has wide limits of agreement while the other has narrow ones. In this case the method with the narrow limits of agreement would be superior from a statistical point of view, although practical or other considerations might change this assessment. What constitutes narrow or wide limits of agreement, or large or small bias, is a matter of practical judgment in each case.
