False Discovery Rate (FDR) Calculator
How to Calculate False Discovery Rate
The False Discovery Rate (FDR) is the expected proportion of false positives among all results declared significant. It is used in multiple hypothesis testing to address the problem of multiple comparisons. When you perform hundreds or thousands of tests (as in genomics or large-scale A/B testing), a standard significance threshold of 0.05 will naturally produce many "false positives" just by chance.
The FDR Formula
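In its simplest estimate, the False Discovery Rate is the ratio of expected false positives to the total number of rejected null hypotheses (significant results). The estimate is conservative because it treats every tested null hypothesis as true when counting expected false positives: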
FDR = (m * α) / R
- m: Total number of hypotheses tested.
- α: The per-test significance threshold (the p-value cutoff) used.
- R: The total number of results that were actually declared significant.
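The formula above can be sketched as a small Python function. The function name `estimated_fdr` is made up for this illustration, and two choices here are assumptions rather than part of the formula: the result is capped at 1.0 (a proportion cannot exceed 100%), and 0.0 is returned when nothing was declared significant.

```python
def estimated_fdr(m: int, alpha: float, r: int) -> float:
    """Simple FDR estimate: expected false positives (m * alpha) divided by R."""
    if r <= 0:
        # Assumption: with no significant results there are no false discoveries.
        return 0.0
    expected_false_positives = m * alpha
    # Assumption: cap at 1.0, since a proportion cannot exceed 100%.
    return min(1.0, expected_false_positives / r)


# 1,000 tests at alpha = 0.05 with 60 significant results
print(round(estimated_fdr(1000, 0.05, 60), 3))
```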
Why Use FDR Instead of P-Values?
If you run 1,000 tests with a significance threshold of 0.05 and no real effects exist, you expect about 50 results to look significant purely by random chance (1,000 * 0.05). If your experiment actually produces 60 "significant" results, the FDR estimate says that about 50 of those 60 are likely noise, an FDR of roughly 83%. Findings with an FDR this high are not very reliable.
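The figure of about 50 chance results per 1,000 tests can be checked with a short simulation. This is only a sketch: it assumes every null hypothesis is true and models each p-value as a uniform random number between 0 and 1, which is the usual behavior of p-values from continuous tests when there is no real effect. The helper name `count_null_hits` is hypothetical.

```python
import random


def count_null_hits(m: int, alpha: float, seed: int) -> int:
    """Count how many of m true-null tests fall below alpha by chance."""
    rng = random.Random(seed)
    # Assumption: under a true null, each p-value is a Uniform(0, 1) draw.
    return sum(1 for _ in range(m) if rng.random() < alpha)


# Repeat the 1,000-test experiment 200 times and average the chance "hits".
runs = [count_null_hits(1000, 0.05, seed) for seed in range(200)]
average_hits = sum(runs) / len(runs)
print(f"average chance 'discoveries' per 1,000 tests: {average_hits:.1f}")
```

The average should land close to 1,000 * 0.05 = 50, although any single run can be noticeably higher or lower.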
Practical Example:
Imagine a scientist testing 500 different genes to see if they relate to a disease:
- Total Tests (m): 500
- Alpha Level (α): 0.01
- Significant Results Found (R): 20
Calculation:
Expected False Positives = 500 * 0.01 = 5
FDR = 5 / 20 = 0.25 (or 25%)
This means that while 20 genes look promising, approximately 25% of them (5 genes) are likely false discoveries.
Controlling FDR with Benjamini-Hochberg
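Modern researchers often use the Benjamini-Hochberg (BH) procedure. Instead of estimating FDR after the fact, they set a target FDR (e.g., 0.05) up front, sort the p-values from smallest to largest, and compare each one to a threshold that grows with its rank: the k-th smallest p-value is compared to (k / m) * q, where q is the target FDR. Every hypothesis up to the largest rank that meets its threshold is declared significant. This balances discovering real effects against limiting false alarms.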
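A minimal sketch of the BH procedure in plain Python is shown here. It follows the standard step-up rule: sort the p-values, find the largest rank k whose p-value is at most (k / m) * q, and reject every hypothesis up to that rank. The function name `benjamini_hochberg`, the parameter name `q`, and the sample p-values are illustrative assumptions, not output from this calculator.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at target FDR q."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max, even if an
    # earlier rank missed its own threshold (the "step-up" behavior).
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values for illustration only.
example = [0.212, 0.001, 0.060, 0.008, 0.039, 0.041, 0.042, 0.074, 0.205, 0.216]
decisions = benjamini_hochberg(example, q=0.05)
print(sum(decisions), "of", len(example), "hypotheses rejected")
```

In this made-up example, six p-values fall below 0.05 on their own, yet BH at a target FDR of 0.05 keeps only the smallest two, because the remaining ones exceed their rank-based thresholds.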