What’s Unique about Rater Agreement?
From one point of view, nothing. Rater agreement studies tend to
produce r × r tables, where r is the number of rating categories. As
such, the models that test marginal homogeneity and symmetry are quite
relevant: they ask whether one rater scores similarly to (or higher
than) another. However, rater agreement data differ in that there is
usually a surplus of responses on the diagonal, indicating that the
raters agree. Often it makes sense to model this excess explicitly, and
the models in this vignette do that.
Another feature of these models is that they are nearly all regular
log-linear models. As such they are less interesting, in that they can
be fit with standard log-linear estimation procedures, including glm()
with the Poisson family and “log” as the link function. The challenging
part is that one has to form the design matrix explicitly. For some
designs this is a bit of a puzzle, and there are some utility routines
to help with the specification.
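As a rough illustration of fitting such a model with glm() and a hand-built design matrix, the sketch below uses a made-up 3 × 3 table (not one of the package data sets) and base R only. The extra "agree" column, a single indicator for the diagonal cells, is one simple way to model excess agreement; it is chosen here for illustration and is not necessarily how the package's utility routines build their designs.

```r
# Hypothetical 3 x 3 table of counts: rows = rater 1, columns = rater 2.
counts <- matrix(c(30,  5,  2,
                    4, 25,  6,
                    1,  7, 20), nrow = 3, byrow = TRUE)

# One record per cell. expand.grid() varies the first factor fastest,
# which matches the column-major order of as.vector(counts).
cells <- expand.grid(row = factor(1:3), col = factor(1:3))
cells$n <- as.vector(counts)

# Explicit design matrix for the main effects model:
# intercept, row effects, column effects.
X_main <- model.matrix(~ row + col, data = cells)

# Illustrative extension: one indicator that is 1 on the diagonal.
on_diagonal <- as.integer(cells$row) == as.integer(cells$col)
X_agree <- cbind(X_main, agree = as.numeric(on_diagonal))

# "- 1" because each design matrix already carries its intercept column.
fit_main  <- glm(cells$n ~ X_main - 1,  family = poisson(link = "log"))
fit_agree <- glm(cells$n ~ X_agree - 1, family = poisson(link = "log"))

# For a Poisson log-linear model the residual deviance is G^2.
fit_summary <- data.frame(
  model = c("main effects", "main effects + diagonal"),
  G2    = c(deviance(fit_main), deviance(fit_agree)),
  df    = c(df.residual(fit_main), df.residual(fit_agree))
)
print(fit_summary)
```

Building the matrix by hand is more work than writing a formula, but it makes each parameter visible as a column, which is the part that becomes a puzzle for the less regular designs.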
Much of the modeling in this section comes from chapter 2 of von Eye,
A. & Mun, E. Y. (2005), Analyzing rater agreement: Manifest variable
methods. Mahwah, NJ: Lawrence Erlbaum.
 
The basic main effects model
The baseline model for rater modeling is the main effects model, which
specifies an intercept, a row (rater 1) effect and a column (rater 2)
effect, but no interaction terms. This is the independence model that
underlies Cohen’s kappa for nominal items:
$$\kappa = \frac{p_a - p_c}{1 - p_c}$$
where p_a is the observed proportion of agreement and p_c is the
proportion of agreement expected by chance.
dogs_kappa <- kappa(dogs)
print(dogs_kappa)
#> $kappa
#> [1] 0.6351767
#> 
#> $se
#> [1] 0.05126776
p_c is taken as the sum over the diagonal cells of p_i+ times p_+i. The
product p_i+ times p_+j is the model prediction for cell (i, j) in the
main effects model. There is also a weighted version of kappa available,
weighted_kappa(n, w), where w is the matrix of weights. The unweighted
kappa for the dogs data is 0.6351767 (standard error 0.0512678),
indicating good agreement between the two clinicians.
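The link between kappa and the main effects model can be checked by hand. The sketch below uses a made-up 3 × 3 table rather than the dogs data, and plain base R rather than the package's kappa() function; the variable names are illustrative only.

```r
# Hypothetical 3 x 3 table of counts: rows = rater 1, columns = rater 2.
counts <- matrix(c(30,  5,  2,
                    4, 25,  6,
                    1,  7, 20), nrow = 3, byrow = TRUE)
n_total <- sum(counts)
p <- counts / n_total

# Observed agreement: proportion of cases on the diagonal.
p_a <- sum(diag(p))

# Chance agreement: sum over i of p_i+ times p_+i.
p_c <- sum(rowSums(p) * colSums(p))

kappa_hat <- (p_a - p_c) / (1 - p_c)

# Expected counts under the main effects (independence) model.
expected <- outer(rowSums(counts), colSums(counts)) / n_total

# p_c is the diagonal of the independence-model prediction,
# expressed as a proportion of the total count.
p_c_from_model <- sum(diag(expected)) / n_total

print(c(p_a = p_a, p_c = p_c, kappa = kappa_hat))
print(all.equal(p_c, p_c_from_model))
```

The result is stored as kappa_hat rather than kappa so that the sketch does not mask a function of that name.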
There are two models that specifically include kappa as part of the
model. The first is by Agresti (1989) and the second is by Schuster
(2001). The Agresti model is described as “symmetry plus
quasi-independence.” It assumes marginal homogeneity as well as
quasi-symmetry.
The Agresti kappa model is available as:
result <- Agresti_kappa_agreement(dogs)
The result is a list with several elements. The first is kappa,
0.6235352, which is quite similar to the observed value. Estimates of
the marginal parameters (the model assumes that these apply to both
raters) are also returned: 0.6316093, 0.2278695, 0.1316074, 0.0146445.
The Pearson chi-squared statistic of 92.3636704 on
result$df degrees of freedom indicates that the model does
not fit these data well.
The Schuster model also assumes marginal homogeneity. It incorporates
an underlying symmetry matrix as well. The model is much harder to fit
acceptably, as reflected in the time and the number of iterations that
fitting requires. It does not yield a plausible result for the dogs
data (not shown: the log(likelihood) decreased, the estimated kappa was
negative and the residuals were remarkably large for the highest
category), so we apply it to the vision_data.
result <- Schuster_symmetric_rater_agreement_model(vision_data)
#> 100      -10365.131345958      5669.64113345708      114810.134161738     0.000219059130695239
#> 200      -10365.0027942057      5669.38402995252      114808.82366675     0.000231464328770828
#> 300      -10364.8743346158      5669.12711077268      114807.513696302     0.000243860940594987
#> 400      -10364.7459671253      5668.87037579168      114806.204250146     0.000256248971591903
#> 500      -10364.6176916713      5668.61382488376      114804.895328033     0.000268628427181174
#> 600      -10364.4895081911      5668.35745792331      114803.586929717     0.000280999312775702
#> 700      -10364.3614166219      5668.10127478488      114802.27905495     0.000293361633782037
#> 800      -10364.233416901      5667.84527534316      114800.971703486     0.00030571539560073
#> 900      -10364.1055089659      5667.58945947302      114799.664875078     0.000318060603625277
#> 1000      -10363.9776927541      5667.33382704943      114798.35856948     0.000330397263244574
#> 1100      -10363.8499682032      5667.07837794755      114797.052786446     1.23196687967666e-07
#> 1200      -10363.7223352507      5666.82311204265      114795.747525731     1.23109851597136e-07
#> 1300      -10363.5947938345      5666.56802921021      114794.442787091     1.23023071325962e-07
#> 1400      -10363.4673438923      5666.31312932578      114793.138570279     1.22936348212369e-07
#> 1500      -10363.339985362      5666.05841226513      114791.834875051     1.22849681910455e-07
#> 1600      -10363.2127181815      5665.80387790414      114790.531701165     1.22763072074297e-07
#> 1700      -10363.0855422888      5665.54952611882      114789.229048374     1.22676518357946e-07
#> 1800      -10362.9584576221      5665.29535678538      114787.926916437     1.22590021468601e-07
#> 1900      -10362.8314641195      5665.04136978014      114786.625305109     1.22503581411367e-07
#> 2000      -10362.7045617192      5664.78756497956      114785.324214149     1.22417197489217e-07
#> 2100      -10362.5777503596      5664.53394226028      114784.023643312     1.22330869356149e-07
#> 2200      -10362.451029979      5664.28050149906      114782.723592358     1.22244598070435e-07
#> 2300      -10362.3244005158      5664.02724257282      114781.424061045     1.22158383286079e-07
#> 2400      -10362.1978619087      5663.7741653586      114780.125049129     1.22072223954901e-07
#> 2500      -10362.0714140962      5663.52126973364      114778.826556372     1.21986120433009e-07
#> 2600      -10361.945057017      5663.26855557526      114777.52858253     1.2190007342761e-07
#> 2700      -10361.8187906099      5663.01602276097      114776.231127364     1.21814082241564e-07
#> 2800      -10361.6926148136      5662.7636711684      114774.934190634     1.21728146879887e-07
#> 2900      -10361.5665295671      5662.51150067536      114773.637772098     1.21642267347591e-07
#> 3000      -10361.4405348093      5662.25951115977      114772.341871519     1.21556443298574e-07
#> 3100      -10361.3146304793      5662.00770249969      114771.046488655     1.21470674737823e-07
#> 3200      -10361.1888165161      5661.75607457337      114769.751623269     1.21384962372551e-07
#> 3300      -10361.063092859      5661.50462725915      114768.45727512     1.21299305505521e-07
#> 3400      -10360.9374594472      5661.25336043552      114767.163443972     1.21213703790579e-07
#> 3500      -10360.81191622      5661.00227398116      114765.870129585     1.21128157583805e-07
#> 3600      -10360.6864631168      5660.75136777484      114764.577331722     1.21042666539027e-07
#> 3700      -10360.5611000772      5660.50064169551      114763.285050146     1.20957231012324e-07
#> 3800      -10360.4358270405      5660.25009562225      114761.993284618     1.2087185030636e-07
#> 3900      -10360.3106439466      5659.99972943429      114760.702034902     1.20786525128346e-07
#> 4000      -10360.1855507349      5659.74954301096      114759.411300762     1.2070125513207e-07
#> 4100      -10360.0605473453      5659.49953623179      114758.121081961     1.20616039620142e-07
#> 4200      -10359.9356337176      5659.24970897641      114756.831378264     1.20530879299768e-07
#> 4300      -10359.8108097917      5659.00006112462      114755.542189433     1.20445774175857e-07
#> 4400      -10359.6860755076      5658.75059255634      114754.253515235     1.2036072355098e-07
#> 4500      -10359.5614308052      5658.50130315166      114752.965355434     1.20275727781178e-07
#> 4600      -10359.4368756248      5658.25219279077      114751.677709794     1.20190786520158e-07
#> 4700      -10359.3124099064      5658.00326135403      114750.390578083     1.20105900123961e-07
#> 4800      -10359.1880335904      5657.75450872194      114749.103960065     1.20021068597455e-07
#> 4900      -10359.063746617      5657.50593477511      114747.817855507     1.1993629089194e-07
#> 5000      -10358.9395489266      5657.25753939434      114746.532264175     1.19851568065815e-07
#> 5100      -10358.8154404597      5657.00932246053      114745.247185836     1.19766899421538e-07
#> 5200      -10358.6914211568      5656.76128385471      114743.962620257     1.19682285315132e-07
#> 5300      -10358.5674909585      5656.5134234581      114742.678567205     1.19597725751429e-07
#> 5400      -10358.4436498054      5656.26574115201      114741.395026448     1.19513219681625e-07
#> 5500      -10358.3198976384      5656.01823681793      114740.111997754     1.19428767812928e-07
#> 5600      -10358.1962343981      5655.77091033746      114738.829480891     1.19344370501356e-07
#> 5700      -10358.0726600256      5655.52376159232      114737.547475628     1.19260026698053e-07
#> 5800      -10357.9491744616      5655.27679046442      114736.265981733     1.19175737461463e-07
#> 5900      -10357.8257776473      5655.02999683577      114734.984998975     1.19091501742693e-07
#> 6000      -10357.7024695237      5654.78338058853      114733.704527125     1.19007319546498e-07
#> 6100      -10357.5792500319      5654.53694160501      114732.424565952     1.18923191931344e-07
#> 6200      -10357.4561191132      5654.29067976762      114731.145115225     1.18839117497042e-07
#> 6300      -10357.3330767089      5654.04459495893      114729.866174716     1.18755096599565e-07
#> 6400      -10357.2101227603      5653.79868706168      114728.587744195     1.18671129243649e-07
#> 6500      -10357.0872572088      5653.5529559587      114727.309823432     1.18587215785277e-07
#> 6600      -10356.9644799959      5653.30740153294      114726.032412199     1.18503355526666e-07
#> 6700      -10356.8417910632      5653.06202366755      114724.755510268     1.18419548823787e-07
#> 6800      -10356.7191903523      5652.81682224577      114723.47911741     1.18335795681353e-07
#> 6900      -10356.5966778049      5652.57179715098      114722.203233397     1.18252095401528e-07
#> 7000      -10356.4742533628      5652.32694826673      114720.927858003     1.18168448340266e-07
#> 7100      -10356.3519169677      5652.08227547665      114719.652990999     1.18084854150974e-07
#> 7200      -10356.2296685617      5651.83777866455      114718.378632158     1.18001313892171e-07
#> 7300      -10356.1075080866      5651.59345771435      114717.104781254     1.17917825812115e-07
#> 7400      -10355.9854354845      5651.34931251012      114715.831438061     1.1783439132061e-07
#> 7500      -10355.8634506974      5651.10534293605      114714.558602351     1.1775100971974e-07
#> 7600      -10355.7415536677      5650.86154887648      114713.2862739     1.17667680662851e-07
#> 7700      -10355.6197443373      5650.61793021587      114712.014452482     1.17584404505878e-07
#> 7800      -10355.4980226488      5650.37448683883      114710.743137871     1.17501180902149e-07
#> 7900      -10355.3763885445      5650.13121863007      114709.472329843     1.17418010207597e-07
#> 8000      -10355.2548419667      5649.88812547448      114708.202028173     1.17334892075532e-07
#> 8100      -10355.1333828579      5649.64520725705      114706.932232636     1.17251826510565e-07
#> 8200      -10355.0120111609      5649.40246386291      114705.662943009     1.17168813517298e-07
#> 8300      -10354.8907268181      5649.15989517733      114704.394159067     1.17085852749002e-07
#> 8400      -10354.7695297723      5648.91750108571      114703.125880587     1.17002944912928e-07
#> 8500      -10354.6484199662      5648.67528147357      114701.858107346     1.16920088608325e-07
#> 8600      -10354.5273973427      5648.43323622659      114700.590839121     1.16837285245109e-07
#> 8700      -10354.4064618447      5648.19136523054      114699.324075689     1.16754533422487e-07
#> 8800      -10354.2856134151      5647.94966837136      114698.057816827     1.16671834550397e-07
#> 8900      -10354.164851997      5647.70814553511      114696.792062314     1.16589186876651e-07
#> 9000      -10354.0441775334      5647.46679660797      114695.526811928     1.16506591811197e-07
#> 9100      -10353.9235899675      5647.22562147626      114694.262065448     1.16424048304507e-07
#> 9200      -10353.8030892426      5646.98462002644      114692.997822651     1.16341557063831e-07
#> 9300      -10353.682675302      5646.74379214508      114691.734083317     1.16259117390967e-07
#> 9400      -10353.5623480889      5646.50313771889      114690.470847225     1.16176729993171e-07
#> 9500      -10353.4421075468      5646.26265663472      114689.208114156     1.16094393820835e-07
#> 9600      -10353.3219536192      5646.02234877954      114687.945883888     1.16012109581209e-07
#> 9700      -10353.2018862496      5645.78221404045      114686.684156202     1.15929876576027e-07
#> 9800      -10353.0819053818      5645.54225230466      114685.422930878     1.15847695512544e-07
#> 9900      -10352.9620109592      5645.30246345957      114684.162207698     1.15765566043863e-07
#> 10000      -10352.8422029257      5645.06284739264      114682.901986442     1.15683487471663e-07
The iterations are slow (about 1 second per 100 iterations). The
function runs for the full 10,000 iterations specified by default, in
spite of using the Newton-Raphson method (it is likely that the
performance of this function will be improved in future versions of the
package). Every 100 iterations the function prints the iteration number,
the log(likelihood), G^2, X^2 and the convergence value, which is the
relative improvement in the log(likelihood). For the vision data the
observed kappa is 0.5953888 (standard error 0.0070393). The model’s
estimate after 10,000 iterations is close to this, 0.5953852. It is a
bit hard to evaluate the model fit, given that the iterations did not
converge, but the printed G^2 values indicate very poor fit to the data
(the X^2 values are even more extreme). Running to a convergence value
of 1.0e-9 requires 244,503 iterations. The final kappa estimate of
0.5953344 is again very close to the observed kappa. The final model
yields a G^2 of 5409.781 on 9 degrees of freedom.
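For readers who want to see what the printed columns measure, here is a minimal sketch of the two fit statistics and of a relative-improvement convergence value. It uses a made-up table with independence-model fitted values. The helper names g_squared, x_squared and relative_improvement are hypothetical, and the exact convergence formula used inside the package is an assumption here, not something taken from its source.

```r
# Hypothetical 3 x 3 table and its fitted values under independence.
counts <- matrix(c(30,  5,  2,
                    4, 25,  6,
                    1,  7, 20), nrow = 3, byrow = TRUE)
fitted_counts <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Likelihood-ratio statistic G^2; cells with zero counts contribute 0.
g_squared <- function(observed, fitted) {
  keep <- observed > 0
  2 * sum(observed[keep] * log(observed[keep] / fitted[keep]))
}

# Pearson statistic X^2.
x_squared <- function(observed, fitted) {
  sum((observed - fitted)^2 / fitted)
}

# Assumed form of the convergence value: the change in log(likelihood)
# between successive iterations, relative to the previous value.
relative_improvement <- function(ll_old, ll_new) {
  abs((ll_new - ll_old) / ll_old)
}

print(c(G2 = g_squared(counts, fitted_counts),
        X2 = x_squared(counts, fitted_counts)))

# Example with two made-up successive log(likelihood) values.
print(relative_improvement(-150.0, -149.9))
```

Both statistics are zero when the fitted values reproduce the table exactly, and both are referred to a chi-squared distribution on the model's residual degrees of freedom.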
We note that the Agresti model:
result <- Agresti_kappa_agreement(vision_data)
runs substantially more quickly and fits this data set substantially
better, with G^2 of 409.7778011 on 11 degrees of freedom, although it
does not fit in an absolute sense.
To show that the kappa-based models can fit, we look at the
budget_actual data. Fitting the Schuster model:
s_result <- Schuster_symmetric_rater_agreement_model(budget_actual)
#> 100      -149.012309151138      2.96905386545657      3.07236917550835     0.00790303078865798
#> 200      -149.010817704918      2.96607097301722      3.06975109518432     0.00791311886943114
#> 300      -149.009343620158      2.96312280349772      3.06716522785273     0.00792308971643707
#> 400      -149.007886692519      2.96020894821833      3.06461118553905     0.00793294470492557
#> 500      -149.00644672009      2.95732900335991      3.0620885849234     0.00794268519389665
#> 600      -149.005023503362      2.95448256990533      3.05959704728408     0.00795231252629077
#> 700      -149.0036168452      2.95166925358144      3.05713619844183     0.00796182802918456
#> 800      -149.002226550811      2.94888866480201      3.05470566870467     0.00797123301397594
#> 900      -149.000852427715      2.94614041861118      3.0523050928136     0.00798052877657195
#> 1000      -148.999494285723      2.94342413462768      3.04993410988885     0.00798971659757486
yields acceptable fit (G^2 = 2.9433971 on 5 degrees of freedom), as
does fitting the Agresti kappa model:
a_result <- Agresti_kappa_agreement(budget_actual)
which gives a G^2 of 6.6831842 on 5 degrees of freedom. Again, the
two kappa estimates, 0.2851063 and 0.2788086, are not very different
from the observed kappa of 0.2845944 (standard error 0.0619068).
It is worth noting that marginal homogeneity (which is assumed by both
kappa models) fits the budget_actual data:
stuart_result <- Stuart_marginal_homogeneity(budget_actual)
yields an X^2 of 1.0889064 on 2 degrees of freedom, indicating that
marginal homogeneity is plausible for these data.
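One way to make "acceptable fit" concrete is to convert each reported statistic to an upper-tail chi-squared probability. The sketch below does this with base R's pchisq(), using the statistics and degrees of freedom quoted in the text; the data frame and its labels are just a convenience for this illustration.

```r
# Statistics and degrees of freedom as reported in the text above.
fit_stats <- data.frame(
  model = c("Schuster, budget_actual",
            "Agresti, budget_actual",
            "Stuart, budget_actual",
            "Schuster, vision_data",
            "Agresti, vision_data"),
  statistic = c(2.9433971, 6.6831842, 1.0889064, 5409.781, 409.7778011),
  df        = c(5, 5, 2, 9, 11)
)

# Upper-tail probability: small values indicate poor fit.
fit_stats$p_value <- pchisq(fit_stats$statistic, df = fit_stats$df,
                            lower.tail = FALSE)
print(fit_stats)
```

The three budget_actual statistics are all well below their degrees-of-freedom-adjusted critical values, while both vision_data statistics are far above them, which matches the verbal conclusions drawn in the text.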
In the final analysis, neither of these specialized kappa models does
a good job of describing the vision data. Both models assume marginal
homogeneity, and we know from the vignette “Checking Whether Margins are
(Stochastically) Ordered” that marginal homogeneity can be rejected for
the vision data. When marginal homogeneity was satisfied in the
budget_actual data, both models yielded acceptable fit. It appears that
for these models to fit, marginal homogeneity must be satisfied.
Conversely, to fit a wide variety of rater data sets acceptably, it
appears that a (kappa-based) model must allow for heterogeneous
margins.