Advanced Dimensionality Reduction Techniques

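Beyond Principal Component Analysis (PCA), several other dimensionality reduction techniques are available. The right choice depends on whether class labels are available (supervised), whether the structure in the data is non-linear, and whether a neural network is a suitable tool for the task.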

1. Linear Discriminant Analysis (LDA)

LDA is a supervised dimensionality reduction technique. Unlike PCA, which finds the directions of maximum variance regardless of labels, LDA seeks a projection that maximizes the separation between classes relative to the spread within each class. With C classes, it yields at most C - 1 components.

  • Supervised: Uses class labels to find optimal axes.
  • Between-Class Variance: Maximizes the distance between the means of different classes.
  • Within-Class Variance: Minimizes the spread of data points within each individual class.
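
The two criteria above can be made concrete with a small sketch of two-class Fisher LDA in NumPy. It is an illustration under stated assumptions, not a reference implementation: the toy data, the helper names (fisher_lda_direction, separation), and the closed-form two-class solution are choices made for this example. The data is built so that the direction of largest variance carries no class information, which is the case where PCA and LDA disagree.

```python
import numpy as np

def fisher_lda_direction(X, y):
    """Unit vector maximizing between-class over within-class scatter (two classes)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: each class's scatter around its own mean, summed.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Closed-form two-class solution: w is proportional to Sw^{-1} (m1 - m0).
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
# Hypothetical toy data: both classes vary widely along x, but differ only along y.
cov = np.array([[9.0, 0.0], [0.0, 0.25]])
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], cov, size=200),
    rng.multivariate_normal([0.0, 2.0], cov, size=200),
])
y = np.repeat([0, 1], 200)

w_lda = fisher_lda_direction(X, y)

# PCA's first axis for comparison: top eigenvector of the overall covariance.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
w_pca = eigvecs[:, -1]

def separation(w):
    """Gap between projected class means, in units of pooled projected std."""
    z = X @ w
    z0, z1 = z[y == 0], z[y == 1]
    pooled_std = np.sqrt(0.5 * (z0.var() + z1.var()))
    return abs(z1.mean() - z0.mean()) / pooled_std

print("LDA separation:", separation(w_lda))
print("PCA separation:", separation(w_pca))
```

In this setup the PCA axis follows the high-variance x direction, where the classes overlap, while the LDA axis follows y, where they separate. For more than two classes, a library implementation such as scikit-learn's LinearDiscriminantAnalysis is the more usual route.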

2. t-SNE (t-Distributed Stochastic Neighbor Embedding)

t-SNE is a non-linear dimensionality reduction technique particularly well suited for the visualization of high-dimensional datasets in 2D or 3D.

  • Pairwise Similarities: It converts Euclidean distances between high-dimensional data points into joint probabilities that represent similarities.
  • KL Divergence: It optimizes the low-dimensional embedding by minimizing the Kullback-Leibler (KL) divergence between the high-dimensional and low-dimensional probability distributions.
  • Crowding Problem: It uses a heavy-tailed Student t-distribution in the low-dimensional space to alleviate the "crowding problem", so that moderately distant points can sit farther apart and clusters separate more clearly in the visualization.

Note: t-SNE is primarily for visualization and exploration. It is not typically used for feature engineering or general preprocessing because it does not preserve global structure well: distances between clusters and the sizes of clusters in the embedding are not reliable.
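
The objective described above can be sketched in NumPy as follows. This is a simplified illustration, not the full algorithm: it assumes a single fixed Gaussian bandwidth sigma instead of the per-point bandwidths that t-SNE calibrates from a perplexity setting, and it only evaluates the KL divergence for two candidate embeddings rather than optimizing it by gradient descent. All function names and the toy data are hypothetical.

```python
import numpy as np

def squared_distances(A):
    """Pairwise squared Euclidean distances between the rows of A."""
    sq = np.sum(A ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * A @ A.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by rounding

def high_dim_similarities(X, sigma=1.0):
    """Joint probabilities P from a Gaussian kernel (fixed sigma is an assumption)."""
    logits = -squared_distances(X) / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)           # a point is not its own neighbor
    cond = np.exp(logits - logits.max(axis=1, keepdims=True))
    cond /= cond.sum(axis=1, keepdims=True)     # conditional p(j|i), one row per i
    return (cond + cond.T) / (2.0 * len(X))     # symmetrize into joint p(i, j)

def low_dim_similarities(Y):
    """Joint probabilities Q from a Student t kernel with one degree of freedom."""
    W = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) over all pairs with nonzero P."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

rng = np.random.default_rng(0)
# Two well-separated clusters of 30 points each in 5-D.
X = np.vstack([rng.normal(0.0, 1.0, size=(30, 5)),
               rng.normal(6.0, 1.0, size=(30, 5))])
P = high_dim_similarities(X)

Y_good = X[:, :2]                      # 2-D layout that keeps the clusters apart
Y_bad = Y_good[rng.permutation(60)]    # same layout, points assigned to wrong rows

kl_good = kl_divergence(P, low_dim_similarities(Y_good))
kl_bad = kl_divergence(P, low_dim_similarities(Y_bad))
print(f"KL, clusters preserved: {kl_good:.3f}")
print(f"KL, points shuffled:    {kl_bad:.3f}")
```

The embedding that keeps neighbors together should score a lower KL divergence than the shuffled one, which is the quantity t-SNE drives down during optimization. In practice a library implementation such as scikit-learn's sklearn.manifold.TSNE handles the perplexity calibration and the optimization.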

3. Autoencoders (Neural Networks)

Autoencoders are unsupervised neural networks that learn to compress data into a lower-dimensional encoding and then to reconstruct the original data from that reduced representation.

  • Encoder: The part of the network that compresses the input into a latent-space representation (dimensionality reduction).
  • Bottleneck Layer: The layer with the fewest neurons, representing the reduced features.
  • Decoder: The part of the network that attempts to reconstruct the original input from the bottleneck representation.
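
The encoder, bottleneck, and decoder roles can be seen in a deliberately minimal sketch. It assumes a linear autoencoder with one weight matrix on each side, no activation functions, and plain gradient descent written in NumPy rather than a deep learning framework. The toy data, variable names, learning rate, and step count are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 5-D that lie close to a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = np.linalg.qr(rng.normal(size=(5, 2)))[0].T   # 2x5, orthonormal rows
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))
X -= X.mean(axis=0)                                   # center the features

n_features, n_bottleneck = X.shape[1], 2
W_enc = 0.1 * rng.normal(size=(n_features, n_bottleneck))  # encoder weights
W_dec = 0.1 * rng.normal(size=(n_bottleneck, n_features))  # decoder weights

def reconstruction_loss(X, W_enc, W_dec):
    """Mean squared error of the encode-then-decode round trip."""
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

initial_loss = reconstruction_loss(X, W_enc, W_dec)

learning_rate = 0.05
for step in range(2000):
    Z = X @ W_enc                      # encoder: 5-D input -> 2-D bottleneck code
    X_hat = Z @ W_dec                  # decoder: 2-D code -> 5-D reconstruction
    err = (X_hat - X) / len(X)         # residual, averaged over samples
    # Gradients of the squared reconstruction error (up to a constant factor).
    grad_dec = Z.T @ err
    grad_enc = X.T @ (err @ W_dec.T)   # chain rule back through the decoder
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)
codes = X @ W_enc                      # the reduced 2-D features

print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
print("reduced shape:", codes.shape)
```

A linear autoencoder trained with squared error recovers the same subspace that PCA finds, so this sketch only shows the mechanics. The practical appeal of autoencoders comes from adding non-linear activations and more layers, which lets the bottleneck capture structure that a linear projection cannot.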
