Enter the counts for each scenario in the 2×2 Confusion Matrix below.
(Assumes 2 Raters and 2 Categories, e.g., Yes/No)
Rater B
Yes / Positive
No / Negative
Rater A
Yes / Positive
No / Negative
Total Observations (N):–
Observed Agreement (Po):–
Expected Agreement (Pe):–
Cohen's Kappa (κ):–
How to Calculate Inter-Rater Reliability
Inter-Rater Reliability (IRR) measures how consistently different people (raters) classify the same qualitative data. Percent agreement is simple to compute, but it does not account for the agreement that raters would reach purely by chance. Cohen's Kappa corrects for chance agreement, which is why it is a standard choice when two raters assign categorical labels.
Why Use Cohen's Kappa?
Cohen's Kappa (κ) corrects for chance by comparing the Observed Agreement ($P_o$) against the Expected Agreement ($P_e$) that would occur by random chance. A value of 1 indicates perfect agreement, a value of 0 indicates agreement no better than chance, and a negative value indicates agreement worse than chance.
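To see why chance correction matters, the sketch below compares raw percent agreement with Kappa on a deliberately skewed table. The function name `agreementStats` and the counts are hypothetical, chosen only for illustration; returning `NaN` when $P_e$ equals 1 is a choice made for this sketch.

```javascript
// Illustrative sketch: percent agreement vs. Cohen's Kappa for a 2x2 table.
// The function name and the example counts are hypothetical.
function agreementStats(aa, ab, ba, bb) {
  const n = aa + ab + ba + bb;
  if (n === 0) {
    throw new Error("At least one observation is required.");
  }
  // Observed agreement: share of items where both raters chose the same label
  const po = (aa + bb) / n;
  // Expected chance agreement, built from each rater's marginal totals
  const pe = ((aa + ab) * (aa + ba) + (ba + bb) * (ab + bb)) / (n * n);
  // Kappa is undefined (0/0) when pe is exactly 1; this sketch returns NaN
  const kappa = pe === 1 ? NaN : (po - pe) / (1 - pe);
  return { po, pe, kappa };
}

// Skewed example: both raters say "Yes" for almost every item
const skewed = agreementStats(90, 5, 5, 0);
console.log("Percent agreement:", (skewed.po * 100).toFixed(1) + "%");
console.log("Expected by chance:", (skewed.pe * 100).toFixed(1) + "%");
console.log("Kappa:", skewed.kappa.toFixed(3));
```

With these counts the raters agree on about 90% of items, yet Kappa comes out slightly below zero, because two raters who both say "Yes" about 95% of the time would be expected to agree that often by chance alone.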
The Logic Behind the Calculation
To calculate Kappa manually or in Excel, you first organize your data into a confusion matrix (as seen in the calculator above). The formula is:
$\kappa = (P_o - P_e) / (1 - P_e)$
Observed Agreement ($P_o$): The proportion of times raters actually agreed. Formula: (A_Yes_B_Yes + A_No_B_No) / Total
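Expected Agreement ($P_e$): The probability that the raters would agree by chance, based on how often each rater says "Yes" or "No". Formula: (A_Yes / Total × B_Yes / Total) + (A_No / Total × B_No / Total), where A_Yes is the number of items Rater A marked "Yes", and so on.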
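As a worked illustration, assume a made-up table (not data from any study) in which both raters said "Yes" 20 times, Rater A said "Yes" and Rater B "No" 5 times, Rater A said "No" and Rater B "Yes" 10 times, and both said "No" 15 times, for a total of 50 items. Rater A's marginal totals are then 25 "Yes" and 25 "No", and Rater B's are 30 "Yes" and 20 "No":

$$
\begin{aligned}
P_o &= \frac{20 + 15}{50} = 0.70 \\
P_e &= \frac{25}{50} \cdot \frac{30}{50} + \frac{25}{50} \cdot \frac{20}{50} = 0.30 + 0.20 = 0.50 \\
\kappa &= \frac{P_o - P_e}{1 - P_e} = \frac{0.70 - 0.50}{1 - 0.50} = 0.40
\end{aligned}
$$

So although the raters agreed on 70% of the items, only 40% of the agreement that was possible beyond chance was actually achieved.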
How to Calculate in Excel
While the calculator above is instant, you may need to process raw data in Excel. Here is a step-by-step method assuming you have Rater A in Column A and Rater B in Column B:
Create a Pivot Table or Confusion Matrix: Use the COUNTIFS function to fill a 2×2 grid.
Cell AA (Both Yes): =COUNTIFS(A:A, "Yes", B:B, "Yes")
Cell AB (A Yes, B No): =COUNTIFS(A:A, "Yes", B:B, "No")
Cell BA (A No, B Yes): =COUNTIFS(A:A, "No", B:B, "Yes")
Cell BB (Both No): =COUNTIFS(A:A, "No", B:B, "No")
Sum the Rows and Columns: Calculate the marginal totals for each rater.
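Calculate Expected Probabilities: Multiply the marginal probability of "Yes" for Rater A by the marginal probability of "Yes" for Rater B, do the same for "No", and add the two products to get $P_e$.
Apply the Kappa Formula: Compute $P_o$ as the both-Yes count plus the both-No count, divided by the total, then use the standard formula $(P_o - P_e) / (1 - P_e)$.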
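The same steps can be sketched outside Excel. The example below is a minimal illustration, not the calculator's own code: the function names `buildConfusionMatrix` and `kappaFromMatrix` and the sample ratings are hypothetical, and labels are assumed to be the exact strings "Yes" and "No" (unlike COUNTIFS, the comparison here is case-sensitive).

```javascript
// Sketch of the spreadsheet steps in JavaScript.
// Function names and sample data are hypothetical.

// Step 1: tally the 2x2 grid from two parallel arrays of labels
function buildConfusionMatrix(raterA, raterB) {
  if (raterA.length !== raterB.length) {
    throw new Error("Both raters must rate the same number of items.");
  }
  const counts = { aa: 0, ab: 0, ba: 0, bb: 0 };
  for (let i = 0; i < raterA.length; i++) {
    const a = raterA[i];
    const b = raterB[i];
    if (a === "Yes" && b === "Yes") counts.aa++;
    else if (a === "Yes" && b === "No") counts.ab++;
    else if (a === "No" && b === "Yes") counts.ba++;
    else if (a === "No" && b === "No") counts.bb++;
    // Any other label pair is skipped and does not enter the totals
  }
  return counts;
}

// Steps 2-4: marginal totals, expected agreement, then Kappa
function kappaFromMatrix({ aa, ab, ba, bb }) {
  const n = aa + ab + ba + bb;
  if (n === 0) {
    throw new Error("No Yes/No pairs were found.");
  }
  const po = (aa + bb) / n;
  const pe = ((aa + ab) * (aa + ba) + (ba + bb) * (ab + bb)) / (n * n);
  // Kappa is undefined (0/0) when pe is exactly 1; this sketch returns NaN
  return pe === 1 ? NaN : (po - pe) / (1 - pe);
}

// Made-up ratings for ten items
const raterA = ["Yes", "Yes", "Yes", "No", "No", "No", "Yes", "No", "Yes", "No"];
const raterB = ["Yes", "Yes", "No", "No", "No", "Yes", "Yes", "No", "Yes", "No"];

const matrix = buildConfusionMatrix(raterA, raterB);
console.log(matrix);
console.log("Kappa:", kappaFromMatrix(matrix).toFixed(2));
```

Keeping the tally separate from the Kappa calculation mirrors the Excel workflow: the grid can be inspected, or typed into the calculator above, before any formula is applied.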
Interpreting Your Kappa Score
Once you have your result, use the table below (Landis and Koch scale) to understand the strength of the agreement:
Kappa Value | Interpretation
< 0.00 | Poor (Less than chance)
0.00 – 0.20 | Slight Agreement
0.21 – 0.40 | Fair Agreement
0.41 – 0.60 | Moderate Agreement
0.61 – 0.80 | Substantial Agreement
0.81 – 1.00 | Almost Perfect Agreement
function calculateKappa() {
    // Read the four cell counts; blank or non-numeric input is treated as 0
    var aa = parseInt(document.getElementById('cellAA').value, 10) || 0; // Both Yes
    var ab = parseInt(document.getElementById('cellAB').value, 10) || 0; // A Yes, B No
    var ba = parseInt(document.getElementById('cellBA').value, 10) || 0; // A No, B Yes
    var bb = parseInt(document.getElementById('cellBB').value, 10) || 0; // Both No

    // Total number of rated items
    var totalN = aa + ab + ba + bb;
    if (totalN === 0) {
        alert("Please enter at least one value in the matrix.");
        return;
    }

    // Observed Agreement (Po): (Agreed Yes + Agreed No) / Total
    var observedAgreement = (aa + bb) / totalN;

    // Marginal totals for each rater
    var totalAYes = aa + ab;
    var totalANo = ba + bb;
    var totalBYes = aa + ba;
    var totalBNo = ab + bb;

    // Marginal probabilities
    var probAYes = totalAYes / totalN;
    var probANo = totalANo / totalN;
    var probBYes = totalBYes / totalN;
    var probBNo = totalBNo / totalN;

    // Expected Agreement (Pe): chance agreement on Yes plus chance agreement on No
    var chanceYes = probAYes * probBYes;
    var chanceNo = probANo * probBNo;
    var expectedAgreement = chanceYes + chanceNo;

    // Calculate Kappa
    var kappa = 0;
    if (expectedAgreement === 1) {
        // Pe is 1 only when both raters used the same single category for every item.
        // Kappa is then 0/0 (undefined); report 1 to avoid dividing by zero.
        kappa = 1;
    } else {
        kappa = (observedAgreement - expectedAgreement) / (1 - expectedAgreement);
    }

    // Display Results
    document.getElementById('resultArea').style.display = 'block';
    document.getElementById('totalN').innerText = totalN;
    document.getElementById('obsAgreement').innerText = (observedAgreement * 100).toFixed(2) + "%";
    document.getElementById('expAgreement').innerText = (expectedAgreement * 100).toFixed(2) + "%";
    document.getElementById('kappaScore').innerText = kappa.toFixed(3);

    // Interpretation (Landis and Koch scale)
    var interpBox = document.getElementById('interpretation');
    var interpText = "";
    var bgColor = "";
    var textColor = "";
    if (kappa < 0) {
        interpText = "Poor Agreement (Less than chance)";
        bgColor = "#ffebee"; textColor = "#c62828";
    } else if (kappa <= 0.20) {
        interpText = "Slight Agreement";
        bgColor = "#fff3e0"; textColor = "#ef6c00";
    } else if (kappa <= 0.40) {
        interpText = "Fair Agreement";
        bgColor = "#fff8e1"; textColor = "#f9a825";
    } else if (kappa <= 0.60) {
        interpText = "Moderate Agreement";
        bgColor = "#e3f2fd"; textColor = "#1565c0";
    } else if (kappa <= 0.80) {
        interpText = "Substantial Agreement";
        bgColor = "#e8f5e9"; textColor = "#2e7d32";
    } else {
        interpText = "Almost Perfect Agreement";
        bgColor = "#00695c"; textColor = "#ffffff";
    }
    interpBox.innerText = interpText;
    interpBox.style.backgroundColor = bgColor;
    interpBox.style.color = textColor;
}