Dimensionality Reduction (Summary)
1. Feature extraction / Dimensionality reduction
- Given data points in $d$ dimensions
- Convert them to data points in $k$ dimensions, where $k < d$
- With minimal loss of information
2. Principal Component Analysis (PCA)
- Find k-dim projection that best preserves variance
- Process (a NumPy sketch of these steps follows this section)
- Compute the mean vector $\mu$ and covariance matrix $\Sigma$ of the original data $$\Sigma = \frac{1}{N}\sum_{i=1}^N (x_i - \mu)(x_i-\mu)^T $$
- Compute eigenvectors and eigenvalues of $\Sigma$ $$\Sigma v = \lambda v $$
- Select the $k$ largest eigenvalues and their corresponding eigenvectors
- Project the points onto the subspace spanned by them $$y = A(x - \mu)$$ where the rows of $A$ are the selected eigenvectors
- The eigenvector with the largest eigenvalue captures the most variation in the data $X$
- We can compress the data by using the top few eigenvectors (principal components)
- The projected features $y_k$ are uncorrelated (the principal components are mutually orthogonal)
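
A minimal NumPy sketch of the PCA procedure above, assuming the data is an $N \times d$ array `X` with one sample per row. The function name `pca` and the random test data are illustrative, not part of the notes.

```python
import numpy as np

def pca(X, k):
    """Project an N x d data matrix onto its top-k principal components."""
    mu = X.mean(axis=0)                       # mean vector
    Xc = X - mu                               # centered data
    Sigma = (Xc.T @ Xc) / X.shape[0]          # covariance: (1/N) sum (x - mu)(x - mu)^T
    eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh, since Sigma is symmetric
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest eigenvalues
    A = eigvecs[:, order].T                   # rows of A are the top-k eigenvectors
    Y = Xc @ A.T                              # y = A(x - mu) applied to every sample
    return Y, A, mu

# Usage: reduce 5-dimensional points to 2 dimensions
X = np.random.randn(100, 5)
Y, A, mu = pca(X, k=2)
print(Y.shape)  # (100, 2)
```

Because `np.linalg.eigh` returns eigenvalues in ascending order, the sort is reversed before taking the top $k$; the columns of the projected data `Y` are uncorrelated, matching the note above.
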
3. Linear Discriminant Analysis (LDA)
- PCA vs. LDA
- PCA does not consider "class" information
- LDA considers "class" information
- PCA maximizes projected total scatter
- LDA maximizes ratio of projected between-class to projected within-class scatter
- Within-class scatter (want to minimize) $$\Sigma_w = \sum_{j=1}^c \frac{1}{N_j}\sum_{x_i \in C_j} (x_i -\mu_j)(x_i -\mu_j)^T $$ where $C_j$ is the set of $N_j$ samples in class $j$ and $\mu_j$ is their mean
- Between-class scatter (want to maximize) $$\Sigma_b = \frac{1}{c}\sum_{j=1}^c (\mu_j -\mu)(\mu_j -\mu)^T $$ where $\mu$ is the overall mean
- Compute eigenvectors and eigenvalues of $\Sigma_w^{-1}\Sigma_b$ $$\Sigma_w^{-1}\Sigma_b\, v = \lambda v $$ and project onto the eigenvectors with the largest eigenvalues (see the sketch below)
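
A minimal NumPy/SciPy sketch of the LDA recipe above, assuming integer class labels and using `scipy.linalg.eigh` to solve the generalized eigenproblem $\Sigma_b v = \lambda \Sigma_w v$ (equivalent to $\Sigma_w^{-1}\Sigma_b v = \lambda v$ when $\Sigma_w$ is invertible). The function name `lda` and the toy two-class data are illustrative only.

```python
import numpy as np
from scipy.linalg import eigh

def lda(X, labels, k):
    """Project an N x d data matrix onto the top-k LDA directions."""
    classes = np.unique(labels)
    d = X.shape[1]
    mu = X.mean(axis=0)                           # overall mean
    Sigma_w = np.zeros((d, d))                    # within-class scatter
    Sigma_b = np.zeros((d, d))                    # between-class scatter
    for cls in classes:
        Xc = X[labels == cls]
        mu_c = Xc.mean(axis=0)
        diff = Xc - mu_c
        Sigma_w += (diff.T @ diff) / Xc.shape[0]  # (1/N_j) sum (x - mu_j)(x - mu_j)^T
        m = (mu_c - mu)[:, None]
        Sigma_b += (m @ m.T) / len(classes)       # (1/c) sum (mu_j - mu)(mu_j - mu)^T
    # Solve Sigma_b v = lambda Sigma_w v and keep the k largest eigenvalues
    eigvals, eigvecs = eigh(Sigma_b, Sigma_w)
    order = np.argsort(eigvals)[::-1][:k]
    W = eigvecs[:, order]                         # columns are the projection directions
    return X @ W

# Usage: two well-separated classes in 4-D, projected to 1-D
X = np.vstack([np.random.randn(50, 4) + 2, np.random.randn(50, 4) - 2])
labels = np.array([0] * 50 + [1] * 50)
Y = lda(X, labels, k=1)
print(Y.shape)  # (100, 1)
```

Note that $\Sigma_b$ has rank at most $c - 1$, so LDA yields at most $c - 1$ useful directions (here $k = 1$ for two classes).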