Linear Regression Calculator
Enter your data points below to find the line of best fit (y = mx + b).
How to Calculate Linear Regression: A Comprehensive Guide
Linear regression is one of the most fundamental statistical methods used in data science and analytics to model the relationship between two variables. It assumes a linear relationship between the independent variable (X) and the dependent variable (Y).
The Simple Linear Regression Formula
The goal of simple linear regression is to find the "line of best fit" represented by the equation:
y = mx + b
- y: The predicted value (Dependent variable)
- x: The input value (Independent variable)
- m: The slope of the line (Regression coefficient)
- b: The y-intercept (Where the line crosses the y-axis)
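As a minimal sketch of how the equation is used once m and b are known, the snippet below evaluates the line at a single input. The function name `predict` and the sample values are illustrative assumptions, not part of the calculator.

```python
def predict(x: float, m: float, b: float) -> float:
    """Evaluate the line y = m*x + b at a single input x."""
    return m * x + b


# Illustrative values only: slope 2, intercept 1, input 4.
print(predict(4, m=2.0, b=1.0))  # 2*4 + 1 = 9.0
```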
Step-by-Step Manual Calculation
To calculate the slope (m) and the intercept (b) using the Ordinary Least Squares (OLS) method, follow these steps:
- Find the Mean: Calculate the average of all X values and all Y values.
- Calculate Deviations: For each point, find (x – meanX) and (y – meanY).
- Calculate the Slope (m):
m = Σ((x – meanX) * (y – meanY)) / Σ((x – meanX)²)
- Calculate the Intercept (b):
b = meanY – (m * meanX)
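The steps above can be sketched in plain Python roughly as follows. The function name `fit_line` and its error handling are assumptions made for illustration; the calculator's actual implementation is not shown on this page.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line through the points (xs[i], ys[i])."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Step 1: means of X and Y.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Steps 2 and 3: sums of deviation products and squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: the slope is undefined (vertical line).
        raise ValueError("x values must not all be equal")

    m = sxy / sxx
    # Step 4: the fitted line passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


print(fit_line([0, 1, 2], [1, 3, 5]))  # (2.0, 1.0)
```

The guard on `sxx` is a design choice for this sketch: without it, identical x values would cause a division by zero.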
Example of Linear Regression
Imagine we want to predict a student's score based on the hours they studied:
| Hours Studied (X) | Test Score (Y) |
|---|---|
| 1 | 50 |
| 2 | 60 |
| 3 | 70 |
In this case, the relationship is perfectly linear: each additional hour of study adds exactly 10 points. Using the calculator, we find that m = 10 and b = 40, so the equation is y = 10x + 40. If a student studies for 4 hours, the predicted score would be 10(4) + 40 = 80.
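To check these numbers by hand, the short script below walks the study-hours table through each manual step and prints the intermediate values. The variable names are illustrative assumptions; only the data and the formulas come from this guide.

```python
hours = [1, 2, 3]      # X: hours studied
scores = [50, 60, 70]  # Y: test scores

mean_x = sum(hours) / len(hours)    # 2.0
mean_y = sum(scores) / len(scores)  # 60.0

dev_x = [x - mean_x for x in hours]   # [-1.0, 0.0, 1.0]
dev_y = [y - mean_y for y in scores]  # [-10.0, 0.0, 10.0]

sxy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))  # 20.0
sxx = sum(dx * dx for dx in dev_x)                  # 2.0

m = sxy / sxx            # 20 / 2 = 10.0
b = mean_y - m * mean_x  # 60 - 10 * 2 = 40.0
predicted = m * 4 + b    # 10 * 4 + 40 = 80.0

print(m, b, predicted)  # 10.0 40.0 80.0
```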
What does the Correlation Coefficient (r) mean?
The correlation coefficient measures the strength and direction of the linear relationship between the variables. Its value ranges from -1 to 1:
- r = 1: Perfect positive correlation.
- r = 0: No linear relationship.
- r = -1: Perfect negative correlation.
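This guide does not spell out how r is computed. One common definition is the Pearson correlation coefficient, r = Σ((x – meanX) * (y – meanY)) / √(Σ((x – meanX)²) * Σ((y – meanY)²)). The sketch below assumes that definition; the function name `pearson_r` is hypothetical and may not match what the calculator does internally.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # r is undefined when either variable has no variation.
        raise ValueError("both variables must vary")

    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3], [50, 60, 70]))  # 1.0 (perfect positive)
print(pearson_r([1, 2, 3], [70, 60, 50]))  # -1.0 (perfect negative)
```

For the study-hours example, r = 1 because all three points lie exactly on the fitted line.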