[14] Fleiss's[15]:218 equally arbitrary guidelines characterize kappas over 0.75 as excellent, 0.40 to 0.75 as fair to good, and below 0.40 as poor.

Kalantri et al. investigated the accuracy and reliability of pallor as a tool for detecting anemia.[5] They concluded that "clinical assessment of pallor can rule out severe anemia, and modestly rule it in." However, inter-observer agreement for detecting pallor was very poor (kappa = 0.07 for conjunctival pallor and 0.20 for tongue pallor), which means that pallor is an unreliable sign for diagnosing anemia.

Krippendorff's alpha[16][17] is a versatile statistic that assesses the agreement achieved when multiple observers categorize, rate, or measure a given set of objects in terms of the values of a variable. It generalizes several specialized agreement coefficients: it accepts any number of observers, applies to nominal, ordinal, interval, and ratio levels of measurement, can handle missing data, and is corrected for small sample sizes.

A further consideration is Cohen's (1960) criticism of po: it can be high even for hypothetical raters who guess on every case, provided they guess according to probabilities that match the observed base rates. In this example, if both raters simply guessed "positive" the vast majority of the time, they would usually agree on the diagnosis. Cohen proposed to remedy this by comparing po to a corresponding quantity, pc, the proportion of agreement expected from raters who guess at random.
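The relationship between po, pc, and kappa can be sketched numerically. The 2x2 table below is a hypothetical example (the counts are illustrative assumptions, not data from the studies cited above); it also reproduces Cohen's base-rate criticism, showing that two raters who independently guess "positive" 90% of the time agree by chance alone in 82% of cases:

```python
# Hypothetical counts of two raters' "positive"/"negative" diagnoses.
#            Rater B: pos  neg
table = [[45, 5],    # Rater A: pos
         [15, 35]]   # Rater A: neg

n = sum(sum(row) for row in table)

# Observed agreement po: proportion of cases on which the raters agree.
po = (table[0][0] + table[1][1]) / n

# Chance agreement pc: agreement expected if each rater guessed
# independently according to their own observed base rates.
a_pos = (table[0][0] + table[0][1]) / n   # Rater A's "positive" rate
b_pos = (table[0][0] + table[1][0]) / n   # Rater B's "positive" rate
pc = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)

# Cohen's kappa: agreement beyond chance, scaled by its maximum possible value.
kappa = (po - pc) / (1 - pc)
print(round(po, 3), round(pc, 3), round(kappa, 3))  # 0.8 0.5 0.6

# Cohen's criticism of po: raters who each guess "positive" 90% of the
# time agree by chance in 0.9*0.9 + 0.1*0.1 of cases, yet kappa is 0.
guess_po = 0.9 * 0.9 + 0.1 * 0.1
print(round(guess_po, 2))  # 0.82
```

Here po = 0.80 looks impressive on its own, but half of that agreement (pc = 0.50) is expected by chance, which is exactly the distortion kappa corrects for.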