
Linear Regression Calculator – Find Best Fit Line Online

Perform linear regression analysis on any dataset with our free online linear regression calculator. Get the regression equation, slope, intercept, and R² value with a scatter plot.

Understanding Linear Regression

Linear regression finds the straight line that best fits a set of data points. It's one of the most widely used statistical techniques – from predicting house prices based on square footage to estimating sales from advertising spend.

The "best fit" line minimizes the sum of squared vertical distances (residuals) between the actual data points and the predicted values on the line. This method, called least squares regression, was developed independently by Legendre, who published it in 1805, and Gauss, who published his version in 1809.
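Written out, the quantity being minimized and the conditions for its minimum look like the sketch below. The symbols S, n, and the index i are notation introduced here for illustration; m and b are the slope and intercept of the fitted line.

```latex
% Sum of squared residuals for a candidate line y = m x + b
S(m, b) = \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2

% Setting both partial derivatives to zero gives the minimum
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} \bigl( y_i - m x_i - b \bigr) = 0
  \quad\Longrightarrow\quad b = \bar{y} - m \bar{x}

\frac{\partial S}{\partial m} = -2 \sum_{i=1}^{n} x_i \bigl( y_i - m x_i - b \bigr) = 0
  \quad\Longrightarrow\quad
  m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
```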

The Regression Formula

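The best-fit line has the equation y = mx + b, where m is the slope and b is the intercept. (Some textbooks write it as y = a + bx, in which case b is the slope and a is the intercept.)

Slope (m): m = SSxy / SSxx
where SSxy = Σ(x - x̄)(y - ȳ) and SSxx = Σ(x - x̄)²
Intercept (b): b = ȳ - m·x̄
The line always passes through the point (x̄, ȳ)
R² (coefficient of determination): R² = SSR / SST = 1 - SSE / SST
where SSR = Σ(ŷ - ȳ)² is the regression sum of squares, SSE = Σ(y - ŷ)² is the residual sum of squares, SST = Σ(y - ȳ)² is the total sum of squares, and ŷ = mx + b is the predicted value
Proportion of variance in Y explained by X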
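The formulas above translate almost line for line into code. The following is a minimal sketch in plain Python, not the calculator's own implementation; the function name fit_line and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b. Returns (m, b, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # SSxx and SSxy as defined in the formula list
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if ss_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = ss_xy / ss_xx        # slope
    b = y_bar - m * x_bar    # intercept

    # R^2 = SSR / SST, with SSR measured from the predictions to the mean
    ss_t = sum((y - y_bar) ** 2 for y in ys)
    ss_r = sum((m * x + b - y_bar) ** 2 for x in xs)
    r_squared = ss_r / ss_t if ss_t else float("nan")
    return m, b, r_squared


# Illustrative data lying exactly on y = 2x + 1
m, b, r2 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b, r2)  # → 2.0 1.0 1.0
```

Because the sample points sit exactly on a line, the fit recovers that line and R² is 1. With real data the same function returns an R² below 1.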

Worked Examples

Example 1: Study Hours vs Test Scores

Data: Hours studied (X) and test scores (Y)

X: 1, 2, 3, 4, 5 hours
Y: 60, 70, 75, 85, 90 points
Calculations:
x̄ = 3, ȳ = 76
Slope = 7.5 (each hour adds ~7.5 points)
Intercept = 53.5
R² = 0.987 (about 99% of score variation explained)
Equation: Score = 7.5 × Hours + 53.5

Example 2: Temperature vs Ice Cream Sales

Daily temperature and ice cream revenue

X: 70, 75, 80, 85, 90, 95°F
Y: $200, $250, $310, $380, $450, $520
Results:
Slope = 12.97 (each degree adds ~$12.97)
Intercept = -718.5 (not meaningful – 0°F is far outside the data range)
R² = 0.996 (excellent fit)
Equation: Sales = 12.97 × Temp - 718.5

Example 3: Age vs Height (Children)

Age in years and height in cm for ages 5-10

X: 5, 6, 7, 8, 9, 10 years
Y: 110, 117, 124, 130, 136, 142 cm
Results:
Slope = 6.37 cm/year (typical growth rate)
Intercept = 78.7 cm (an extrapolation to age 0, not a reliable estimate of height at birth)
R² = 0.998 (very strong relationship)
Equation: Height = 6.37 × Age + 78.7

Quick Fact

Sir Francis Galton coined the term "regression" in 1886 when studying heredity. He noticed that exceptionally tall parents tended to have children who were shorter than them (regressing toward the average). The statistical technique we use today grew from his biological observations.

Frequently Asked Questions

What does R² tell me?

R² (coefficient of determination) shows what proportion of the variation in Y is explained by X. R² = 0.80 means 80% of the variation is accounted for by the linear relationship. Values range from 0 (no linear relationship) to 1 (perfect fit). A low R² does not rule out a curved relationship.

What's the difference between r and R²?

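r (correlation coefficient) measures the strength and direction of the linear relationship (-1 to +1). In simple linear regression (one X variable), R² equals r squared and represents the proportion of variance explained. If r = 0.8, then R² = 0.64 (64% of variance explained). The sign of r is lost when squaring, so r = -0.8 gives the same R².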

Can I use regression for prediction?

Yes, but be careful. Predictions work best within the range of your data (interpolation). Predicting outside that range (extrapolation) is risky – the relationship might change. Also, correlation doesn't prove causation.

What if my data isn't linear?

Linear regression assumes a straight-line relationship. If your data curves, consider polynomial regression, logarithmic transformation, or other non-linear methods. Always plot your data first to check the pattern.

What are residuals and why do they matter?

Residuals are the vertical distances between actual data points and the regression line. Analyzing residuals helps you check if linear regression is appropriate – they should be randomly scattered, not showing a pattern.

How many data points do I need?

The minimum is 2 points with different X values, but two points always give a perfect fit (R² = 1), which tells you nothing. For meaningful analysis, aim for at least 10-20 points. More data provides more reliable estimates and better detection of outliers.
