Inter-Rater Reliability Calculator
Calculate Cohen's Kappa for categorical agreement between two raters.
Understanding Inter-Rater Reliability
Inter-rater reliability (IRR) is a statistical measure that quantifies the degree of agreement between different observers or raters who are assessing the same phenomenon. While "percent agreement" is a simple way to look at consistency, it can be misleading because it doesn't account for the agreement that might happen purely by chance.
What is Cohen's Kappa?
Cohen's Kappa (κ) is the standard measure of inter-rater reliability for categorical data rated by two observers. Rather than reporting raw agreement, it corrects the observed agreement for the agreement expected by chance alone, giving a more robust indicator of how consistently the raters apply the criteria being measured.
The Cohen's Kappa Formula
The formula for calculating Kappa is:

κ = (po − pe) / (1 − pe)

where:
- po: Relative observed agreement among raters.
- pe: Hypothetical probability of chance agreement.
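As a concrete illustration of the formula, here is a minimal Python sketch that computes κ from two raters' label lists. The function name `cohens_kappa` and the list-based input format are assumptions made for this example, not part of any particular library.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Compute Cohen's Kappa for two raters' categorical labels."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both raters must rate the same number of items.")
    n = len(ratings_a)

    # po: proportion of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # pe: chance agreement, summing the product of each rater's
    # marginal proportions over all categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n)
              for c in counts_a.keys() | counts_b.keys())

    return (p_o - p_e) / (1 - p_e)

# Example usage with illustrative labels:
# cohens_kappa(["Normal", "Abnormal", "Normal"], ["Normal", "Normal", "Normal"])
```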
Interpreting Your Results
According to the widely accepted Landis and Koch (1977) scale, Kappa values can be interpreted as follows:
| Kappa Value | Strength of Agreement |
|---|---|
| < 0.00 | Poor (Less than chance) |
| 0.00 – 0.20 | Slight |
| 0.21 – 0.40 | Fair |
| 0.41 – 0.60 | Moderate |
| 0.61 – 0.80 | Substantial |
| 0.81 – 1.00 | Almost Perfect |
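If you want the interpretation step in code, the following helper (the name `interpret_kappa` is hypothetical) maps a Kappa value onto the Landis and Koch bands from the table above.

```python
def interpret_kappa(kappa):
    """Return the Landis and Koch (1977) descriptive label for a Kappa value."""
    if kappa < 0.00:
        return "Poor (Less than chance)"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost Perfect"
```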
Calculation Example
Imagine two doctors diagnosing 100 X-rays as "Normal" or "Abnormal".
- Both agree "Normal": 70 cases
- Both agree "Abnormal": 15 cases
- Dr. A says Normal, Dr. B says Abnormal: 10 cases
- Dr. A says Abnormal, Dr. B says Normal: 5 cases
In this scenario, the observed agreement (po) is 0.85. Using each doctor's marginal totals (Dr. A: 80 Normal / 20 Abnormal; Dr. B: 75 Normal / 25 Abnormal), the expected chance agreement is pe = 0.80 × 0.75 + 0.20 × 0.25 = 0.65, so κ = (0.85 − 0.65) / (1 − 0.65) ≈ 0.57. Despite the seemingly high 85% raw agreement, this corresponds to only moderate diagnostic consistency between the two physicians.
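To check this worked example, here is a short Python sketch that reproduces the calculation directly from the 2×2 counts; the variable names are illustrative.

```python
# Counts from the X-ray scenario above.
both_normal, both_abnormal = 70, 15
a_normal_b_abnormal, a_abnormal_b_normal = 10, 5
n = both_normal + both_abnormal + a_normal_b_abnormal + a_abnormal_b_normal  # 100

# Observed agreement.
p_o = (both_normal + both_abnormal) / n  # 0.85

# Marginal proportions for each doctor.
a_normal = (both_normal + a_normal_b_abnormal) / n  # 0.80
b_normal = (both_normal + a_abnormal_b_normal) / n  # 0.75
p_e = a_normal * b_normal + (1 - a_normal) * (1 - b_normal)  # 0.65

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))  # 0.571 -> "Moderate" on the Landis and Koch scale
```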