Machine Learning
2. Logistic Regression
Logistic regression is a statistical method for binary classification, despite the word "regression" in its name. It models the probability that an input belongs to a particular class by passing a linear combination of the features through the logistic (sigmoid) function.
The Sigmoid Function
The sigmoid function maps any real number to a value strictly between 0 and 1:
    σ(z) = 1 / (1 + e^(-z))
where z = w₀ + w₁x₁ + w₂x₂ + ... + wₙxₙ is the linear combination of the features, with w₀ as the bias (intercept) term.
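As a rough illustration of the sigmoid and the linear combination z described above, here is a minimal sketch in plain Python. The function names `sigmoid` and `linear_score` are illustrative choices, not part of any library, and the branch on the sign of z is one common way to avoid overflow, assumed here rather than taken from the text.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z),
    # so that exp() is never called on a large positive number.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """z = w0 + w1*x1 + ... + wn*xn, with `bias` playing the role of w0."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


z = linear_score([2.0, -1.0], 0.5, [1.0, 3.0])  # 0.5 + 2.0 - 3.0 = -0.5
print(z, sigmoid(z))
print(sigmoid(0.0))  # 0.5: a score of zero means both classes are equally likely
```

Note that in floating point the output can round to exactly 0.0 or 1.0 for very large |z|, even though the mathematical function never reaches those values.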
Decision Boundary
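Logistic regression outputs a probability P(y=1|x), which is turned into a class label by applying a threshold (usually 0.5). Because the sigmoid is at least 0.5 exactly when z ≥ 0, this threshold places the decision boundary on the hyperplane z = 0, which is linear in the features: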
- If P(y=1|x) ≥ 0.5 → Predict class 1
- If P(y=1|x) < 0.5 → Predict class 0
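The thresholding rule above can be sketched in a few lines. The weights and bias below are made-up values chosen so that the boundary is the line x₁ + x₂ = 3; `predict_proba` and `predict` are hypothetical helper names for this example only.

```python
import math


def predict_proba(weights, bias, x):
    """P(y=1|x): sigmoid of the linear score (fine for moderate scores)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x, threshold=0.5):
    """Class 1 if the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0


# Assumed example parameters: z = x1 + x2 - 3, so the boundary is x1 + x2 = 3.
weights, bias = [1.0, 1.0], -3.0

print(predict(weights, bias, [2.0, 2.0]))  # z = 1  -> class 1
print(predict(weights, bias, [1.0, 1.0]))  # z = -1 -> class 0
print(predict(weights, bias, [1.5, 1.5]))  # z = 0  -> probability 0.5 -> class 1
```

Raising the threshold above 0.5 makes the model more reluctant to predict class 1, which can be useful when false positives are costly.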
Cost Function & Training
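The cost function is Log Loss (Binary Cross-Entropy), averaged over the m training examples:
    J(w) = -(1/m) Σ [y·log(ŷ) + (1 - y)·log(1 - ŷ)]
where y is the true label (0 or 1) and ŷ = P(y=1|x) is the predicted probability.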
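To make Log Loss concrete, the sketch below averages -[y·log(ŷ) + (1 - y)·log(1 - ŷ)] over a list of examples. The clipping constant `eps` is an assumption added here to keep log() away from zero; it is not part of the definition.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy between labels (0/1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip the probability so log(0) can never occur (eps is an assumed safeguard).
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


print(log_loss([1, 0], [0.9, 0.1]))  # confident and correct: about 0.105
print(log_loss([1, 0], [0.1, 0.9]))  # confident and wrong:   about 2.303
```

The second call shows why this cost suits classification: a confident wrong prediction is penalized far more heavily than a confident correct one is rewarded.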
Weights are updated using gradient descent to minimize the cost function:
    wⱼ = wⱼ - α(∂J/∂wⱼ)
where α is the learning rate.
Gradient Descent Algorithm Steps:
1. Initialize weights: Start with random values or zeros for the weights w and the bias b (b plays the role of w₀ above).
2. Forward Pass: Compute the linear combination z = w·x + b and pass it through the sigmoid function to get predictions ŷ.
3. Compute Cost: Calculate the Log Loss J(w) to measure how far the predictions are from the actual labels.
4. Backward Pass (Gradients): Compute the partial derivatives of the cost function with respect to each weight, ∂J/∂wⱼ = (1/m) Σ (ŷ - y)·xⱼ, and with respect to the bias, ∂J/∂b = (1/m) Σ (ŷ - y).
5. Update Weights: Adjust the parameters in the opposite direction of the gradient: wⱼ = wⱼ - α(∂J/∂wⱼ) and b = b - α(∂J/∂b), where α is the learning rate.
6. Repeat: Iterate steps 2-5 until convergence (the cost stops decreasing).
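The steps above can be sketched as a small batch gradient descent loop in plain Python. This is an illustrative version under stated assumptions, not a reference implementation: the toy dataset, the learning rate of 0.5, and the fixed count of 1000 iterations (used in place of a convergence check) are all choices made for this example, and the names `train` and `predict` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, learning_rate=0.5, n_iters=1000):
    """Batch gradient descent for logistic regression; returns (w, b, cost history)."""
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0  # initialize weights and bias to zeros
    costs = []
    for _ in range(n_iters):  # fixed iteration count stands in for a convergence test
        # Forward pass: z = w.x + b, then sigmoid.
        y_hat = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        # Cost: Log Loss (no clipping; adequate for this small toy dataset).
        costs.append(-sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
                          for yi, p in zip(y, y_hat)) / m)
        # Gradients: (1/m) * sum((y_hat - y) * x_j), and (1/m) * sum(y_hat - y) for b.
        errors = [p - yi for p, yi in zip(y_hat, y)]
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) / m for j in range(n)]
        grad_b = sum(errors) / m
        # Update: step against the gradient.
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b, costs


def predict(w, b, row, threshold=0.5):
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) >= threshold else 0


# Assumed toy data: one feature, label is 1 for the larger values.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

w, b, costs = train(X, y)
print("first cost:", costs[0], "last cost:", costs[-1])
print("predictions:", [predict(w, b, row) for row in X])
```

With all parameters at zero every prediction is 0.5, so the first recorded cost is log(2); the cost then falls as the weights move toward values that separate the two classes.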
Multinomial Logistic Regression
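For multi-class classification (more than 2 classes), the softmax function is used instead of sigmoid. Each class k gets its own score zₖ = wₖ·x + bₖ, and the scores are normalized into probabilities that sum to 1:
    P(y=k|x) = e^(zₖ) / Σⱼ e^(zⱼ)
The predicted class is the one with the highest probability.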
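A minimal sketch of softmax follows: each class score is exponentiated and divided by the sum of all exponentiated scores, so the outputs are positive and sum to 1. Subtracting the largest score before exponentiating is a standard stability trick assumed here; it does not change the result. The example scores are made up.

```python
import math


def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # assumed scores for three classes
print(probs)
print("predicted class:", probs.index(max(probs)))  # class 0 has the highest score

# With two classes and scores [z, 0], softmax reduces to the sigmoid of z.
print(softmax([2.0, 0.0])[0], 1.0 / (1.0 + math.exp(-2.0)))
```

The last line illustrates why softmax is described as the multi-class generalization of the sigmoid: the binary model is the special case with one score fixed at zero.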