When to Use Scott’s π or Krippendorff's α, If Ever?

that Krippendorff's $c_a$ is positively correlated with *N*: given a distribution, a bigger *N* leads to a higher $c_a$! $c_a$ is the bar that $a_r$ must pass to produce a good-looking α (cf. Paradox 7). A higher $c_a$ means less chance for α to look good. But why? A bigger *N* means more cases coded, hence higher replicability. How can α be a general indicator of reliability when it systematically punishes replicability?
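The dependence of α's chance agreement on *N* can be checked numerically. The sketch below is only an illustration under stated assumptions, not the paper's own computation: it assumes two coders, two categories, no missing data, and the usual nominal-scale expected disagreement for α, and it takes chance agreement to be one minus that expected disagreement. The function name `alpha_chance_agreement` and the parameter `share` are hypothetical.

```python
from fractions import Fraction

def alpha_chance_agreement(n_cases, share=Fraction(1, 2)):
    """Chance agreement implied by Krippendorff's alpha for binary nominal data.

    Assumes two coders, so there are 2 * n_cases pooled values, of which
    the fraction `share` falls in the first category. Names are illustrative.
    """
    n = 2 * n_cases                      # pooled number of coded values
    n1 = share * n                       # values in category 1
    n0 = n - n1                          # values in category 0
    expected_disagreement = 2 * n0 * n1 / (n * (n - 1))
    return 1 - expected_disagreement

# Same 50/50 distribution, increasing number of cases.
for n_cases in (4, 10, 100, 1000):
    print(n_cases, float(alpha_chance_agreement(n_cases)))
```

With an even split, the value is 3/7 at *N*=4 and 9/19 at *N*=10, and it approaches 0.5 from below as *N* grows, so the same agreement rate yields a lower α in a larger sample.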

-------------------------------

Table 2 and Figure 4 about here

-------------------------------


**Paradox 18**: *Totally random coding not totally unreliable?* Suppose two coders code four cases completely by flipping coins. The coins behave exactly as probability theory says is most likely: head-head, tail-tail, head-tail, and tail-head, with $a_r$=0.5, *N*=4. As one might expect, most of the reliability indicators, including Scott's π and Cohen's κ, are exactly 0.00. Krippendorff's α, however, stands out at 0.125. It is not a spectacular number, but it is still much higher than zero. Why? How can a completely random result from a completely random process be anything but totally unreliable?
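The arithmetic behind the coin-flip case can be reproduced with a short sketch. It assumes two coders, nominal categories, and no missing data, and it uses the usual chance-agreement definitions for the three indices; the function name `agreement_indices` and the `"H"`/`"T"` labels are illustrative, not from the original.

```python
from collections import Counter

def agreement_indices(coder1, coder2):
    """Return (agreement rate, Scott's pi, Cohen's kappa, Krippendorff's alpha)."""
    n_cases = len(coder1)
    a_r = sum(x == y for x, y in zip(coder1, coder2)) / n_cases

    c1, c2 = Counter(coder1), Counter(coder2)
    pooled = c1 + c2                     # both coders' values together
    n = 2 * n_cases

    # Scott's pi: chance agreement from pooled proportions (with replacement).
    ca_pi = sum((k / n) ** 2 for k in pooled.values())
    # Cohen's kappa: chance agreement from each coder's own proportions.
    ca_kappa = sum(c1[v] * c2[v] for v in pooled) / n_cases ** 2
    # Krippendorff's alpha: pooled values drawn without replacement.
    ca_alpha = sum(k * (k - 1) for k in pooled.values()) / (n * (n - 1))

    def index(chance):
        return (a_r - chance) / (1 - chance)

    return a_r, index(ca_pi), index(ca_kappa), index(ca_alpha)

# head-head, tail-tail, head-tail, tail-head
coder1 = ["H", "T", "H", "T"]
coder2 = ["H", "T", "T", "H"]
a_r, pi, kappa, alpha = agreement_indices(coder1, coder2)
print(a_r, pi, kappa, round(alpha, 3))
```

Under these assumptions π and κ both use a chance agreement of 0.5, while α uses 3/7 because it samples the eight pooled values without replacement. That small-sample adjustment is what lifts α to 0.125.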

Further, this α=0.125, from $a_r$=.5, *N*=4 and totally random coding, is better than the α=0.095 from two Krippendorff examples, each having $a_r$=.6, *N*=10 and honest coding (Krippendorff, 1980, pp. 133-135; 2007, pp. 2-3). Again, how can more and better agreement be less reliable?
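The comparison can be made concrete with a minimal sketch for the binary, two-coder case with no missing data. The text above gives only the agreement rate and *N* for the ten-case examples, so the pooled split used here (6 ones among 20 values) is a hypothetical assumption, chosen because it is consistent with the reported agreement rate of .6 and α=0.095. The function name `binary_alpha` is likewise illustrative.

```python
def binary_alpha(n_cases, n_disagree, n_ones):
    """Krippendorff's alpha for two coders, binary nominal data, no missing values.

    n_disagree: number of cases on which the two coders differ
    n_ones:     pooled count of 1s across both coders (out of 2 * n_cases)
    """
    n = 2 * n_cases
    n_zeros = n - n_ones
    observed_disagreement = n_disagree / n_cases
    expected_disagreement = 2 * n_zeros * n_ones / (n * (n - 1))
    return 1 - observed_disagreement / expected_disagreement

# Coin-flip coding: 4 cases, 2 disagreements, 4 heads among 8 values.
random_coding = binary_alpha(n_cases=4, n_disagree=2, n_ones=4)
# Hypothetical marginals (assumption): 10 cases, 4 disagreements, 6 ones among 20 values.
honest_coding = binary_alpha(n_cases=10, n_disagree=4, n_ones=6)
print(round(random_coding, 3), round(honest_coding, 3))
```

With these inputs the skewed distribution in the ten-case data raises the expected agreement enough that an agreement rate of .6 yields a lower α than the coin flips do at .5.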

These additional paradoxes are further evidence that α cannot be a general indicator of reliability. Scott’s π and Krippendorff’s α might be useful only within certain boundaries, beyond which the paradoxes would arise. The following sections will define these boundaries and test their validity by applying them to resolve the paradoxes.

**V. Assumptions and Implications**

To explain chance agreement, methodologists (Krippendorff, 1980, pp. 133-134; Riffe et al., 1998, pp. 129-130) talked about two coders drawing from urns with black and white marbles. If both draw black,