Regression Models

Regression is a supervised learning technique that predicts a continuous numerical output (the dependent variable, y) from one or more input features (the independent variables, x). It does so by fitting a mathematical function that maps the inputs to the output.

1. Simple Linear Regression (SLR)

Models the relationship between exactly one independent variable and one dependent variable using a straight line.

y = β₀ + β₁x + ε
  • y: Dependent variable (target to predict)
  • x: Independent variable (feature)
  • β₀: Y-intercept (value of y when x = 0)
  • β₁: Slope (how much y changes for a 1-unit change in x)
  • ε: Error term (the part of y the line does not explain; the residuals of a fitted model are estimates of it)
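
The slope and intercept defined above can be estimated by least squares in closed form. The following is a minimal sketch, assuming NumPy is available; the function name `fit_simple_linear_regression` and the sample data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_simple_linear_regression(x, y):
    """Return (beta0, beta1) minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x.
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

# Hypothetical noise-free data generated from y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

beta0, beta1 = fit_simple_linear_regression(x, y)
residuals = y - (beta0 + beta1 * x)  # observed minus fitted values
```

Because the sample data contain no noise, the fit should recover the generating intercept and slope up to floating-point error, and the residuals should be essentially zero.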

2. Multiple Linear Regression (MLR)

Extends simple linear regression to model the relationship between two or more independent variables and a single dependent variable.

y = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ + ε

Matrix Representation & OLS Estimation

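For m data points with n features, we write Y = Xβ + ε, where X is the m × (n+1) design matrix whose first column is all ones (for the intercept β₀), Y is the m × 1 vector of targets, and β is the (n+1) × 1 coefficient vector. The Ordinary Least Squares (OLS) method finds the coefficients that minimize the Sum of Squared Residuals; when XᵀX is invertible, the solution has a closed form: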

Normal Equation: β = (XᵀX)⁻¹ Xᵀ Y
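
The Normal Equation can be evaluated directly with NumPy. This is a sketch under stated assumptions: the data and `true_beta` are made up for illustration, and the linear system (XᵀX)β = XᵀY is solved with `np.linalg.solve` instead of forming the inverse explicitly, which is generally the more numerically stable choice.

```python
import numpy as np

# Hypothetical data: m = 6 observations, n = 2 features.
X_raw = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 6.0],
    [6.0, 5.0],
])
true_beta = np.array([1.0, 2.0, -0.5])  # [β₀, β₁, β₂], chosen for illustration

# Prepend a column of ones so that β₀ is estimated as the intercept.
X = np.column_stack([np.ones(len(X_raw)), X_raw])
Y = X @ true_beta  # noise-free targets

# Normal equation: solve (XᵀX) β = XᵀY rather than inverting XᵀX.
beta_normal = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check with NumPy's general least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Both routes should agree here. When XᵀX is singular or nearly so (for example, with strongly correlated features), `np.linalg.lstsq` is usually the safer option.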

5 Key Assumptions (LINED)
  • Linearity: Relationship between x and y is linear.
  • Independence: Observations are independent of each other.
  • Normality: Residuals (errors) are normally distributed.
  • Equal Variance (Homoscedasticity): Residuals have constant variance.
  • No Dependency (No Multicollinearity): Independent variables are not highly correlated.
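
One common way to check the last assumption is the variance inflation factor (VIF), which regresses each feature on the others and reports 1 / (1 − R²). The sketch below is an illustration only: the helper `variance_inflation_factors` and the synthetic features are hypothetical, and any cutoff for "too high" (values such as 5 or 10 are often quoted) is a rule of thumb, not a strict test.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (features only, no intercept column)."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    vifs = []
    for j in range(n):
        target = X[:, j]
        # Regress feature j on an intercept plus all remaining features.
        others = np.column_stack([np.ones(m), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)

# Synthetic features (assumed for illustration).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                # generated independently of x1
x3 = x1 + 0.05 * rng.normal(size=200)    # nearly a copy of x1

vif_independent = variance_inflation_factors(np.column_stack([x1, x2]))
vif_collinear = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

In this setup the VIFs for the independent pair should stay close to 1, while adding the near-duplicate `x3` should inflate the VIFs of both `x1` and `x3` sharply.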

3. Polynomial Regression

A straight-line model fits poorly when the true relationship is non-linear (curved). Polynomial regression extends MLR by adding powers of x (x², x³, ...) as extra features so that the fitted curve can follow these non-linear patterns.

y = β₀ + β₁x + β₂x² + β₃x³ + ... + βₙxⁿ + ε

Key Insight: Although the curve is non-linear with respect to x, the model is still linear in its parameters (β). Therefore, we can substitute z₁=x, z₂=x², ..., zₙ=xⁿ and solve it exactly like Multiple Linear Regression!
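
The substitution can be made concrete by building the powers of x as columns of a design matrix and reusing an ordinary least-squares solve. This is a minimal sketch with made-up cubic data; `np.vander` with `increasing=True` produces the columns 1, x, x², x³.

```python
import numpy as np

# Hypothetical noise-free data from a cubic: y = 1 - 2x + 0.5x³.
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 - 2.0 * x + 0.5 * x**3

# Columns z₀ = 1, z₁ = x, z₂ = x², z₃ = x³ form the design matrix.
degree = 3
Z = np.vander(x, N=degree + 1, increasing=True)

# Same least-squares solve as for multiple linear regression.
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
```

Since the data are exactly cubic, `beta` should match the generating coefficients [1, -2, 0, 0.5] up to floating-point error.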

The Overfitting Danger

Choosing the correct polynomial degree is a classic Bias-Variance Tradeoff:

  • Low degree (e.g., n=1): High Bias (Underfitting) - model is too simple.
  • Optimal degree: Captures the true underlying pattern.
  • High degree (e.g., n=15): High Variance (Overfitting) - model fits the noise in the training data but generalizes poorly to new data.
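
The tradeoff in the list above can be explored by fitting several degrees to the same training set and comparing training error with error on held-out data. The sketch below uses assumptions throughout: the sine-plus-noise data, the sample sizes, and the helper names `fit_poly`, `mse`, and `sample` are all hypothetical.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns coefficients β₀..β_degree."""
    Z = np.vander(x, N=degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

def mse(x, y, beta):
    """Mean squared error of the polynomial with coefficients beta."""
    Z = np.vander(x, N=len(beta), increasing=True)
    return float(np.mean((y - Z @ beta) ** 2))

rng = np.random.default_rng(42)

def sample(m):
    """Draw m noisy points from an assumed true curve y = sin(πx)."""
    x = rng.uniform(-1.0, 1.0, size=m)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=m)
    return x, y

x_train, y_train = sample(20)
x_val, y_val = sample(200)

errors = {}  # degree -> (training MSE, validation MSE)
for degree in (1, 3, 15):
    beta = fit_poly(x_train, y_train, degree)
    errors[degree] = (mse(x_train, y_train, beta), mse(x_val, y_val, beta))
    print(f"degree={degree:2d}  train MSE={errors[degree][0]:.4f}  val MSE={errors[degree][1]:.4f}")
```

Training error cannot increase as the degree grows, because each lower-degree model is a special case of the higher-degree one. The validation error is the number that reveals whether the extra flexibility is fitting signal or noise.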

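Solutions to Overfitting: Use Regularization (Ridge/LASSO) to penalize large coefficients, and use Cross-Validation to choose the polynomial degree or the regularization strength.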
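
As one illustration of regularization, Ridge adds a penalty λ‖β‖² to the least-squares objective, which gives the closed form β = (ZᵀZ + λI)⁻¹ZᵀY. The sketch below is written under assumptions: the data, the degree, and λ = 0.1 are arbitrary, the helper `ridge_fit` is hypothetical, and the intercept is left unpenalized by convention. In practice features are usually standardized first, and λ would be chosen by cross-validation.

```python
import numpy as np

def ridge_fit(Z, y, lam):
    """Closed-form ridge solution; the intercept column (index 0) is not penalized."""
    penalty = lam * np.eye(Z.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Z.T @ Z + penalty, Z.T @ y)

# Synthetic data (assumed for illustration): noisy samples of sin(πx).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)
y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=x.size)

degree = 9
Z = np.vander(x, N=degree + 1, increasing=True)

beta_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)   # no regularization
beta_ridge = ridge_fit(Z, y, lam=0.1)              # with regularization

# Compare the size of the non-intercept coefficients.
ols_norm = np.linalg.norm(beta_ols[1:])
ridge_norm = np.linalg.norm(beta_ridge[1:])
```

The penalty shrinks the non-intercept coefficients, so `ridge_norm` should come out smaller than `ols_norm`; smaller coefficients generally mean a smoother curve that is less sensitive to noise.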
