Dimensionality Reduction (Summary)

1. Feature extraction / Dimensionality reduction
  • Given data points in $d$ dimensions
  • Convert them to data points in $k\,(<d)$ dimensions
  • With minimal loss of information

2. Principal Component Analysis (PCA)
  • Find the $k$-dimensional projection that best preserves variance
  • Process
    1. Compute the mean vector $\mu$ and covariance matrix $\Sigma$ of the original data $$\Sigma = \frac{1}{N}\sum_{i=1}^N (x_i - \mu)(x_i-\mu)^T $$
    2. Compute the eigenvectors and eigenvalues of $\Sigma$ $$\Sigma v = \lambda v $$
    3. Select the $k$ largest eigenvalues and their corresponding eigenvectors
    4. Project each point onto the subspace they span (see the sketch below): $$y = A(x - \mu)$$ where the rows of $A$ are the top-$k$ eigenvectors
  • The eigenvector with the largest eigenvalue captures the most variation in the data $X$
  • We can compress the data by using the top few eigenvectors (principal components)
  • The projected feature components $y_k$ are uncorrelated, since the principal components are orthogonal
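
A minimal NumPy sketch of the four steps above; the function name `pca` and the toy data are illustrative, not part of the original notes:

```python
import numpy as np

def pca(X, k):
    """Project N x d data X onto its top-k principal components."""
    mu = X.mean(axis=0)                       # step 1: mean vector
    Xc = X - mu
    Sigma = Xc.T @ Xc / X.shape[0]            # step 1: covariance matrix (d x d)
    eigvals, eigvecs = np.linalg.eigh(Sigma)  # step 2: eigh, since Sigma is symmetric
    order = np.argsort(eigvals)[::-1][:k]     # step 3: indices of the k largest eigenvalues
    A = eigvecs[:, order].T                   # rows of A = top-k eigenvectors (k x d)
    return Xc @ A.T, A, mu                    # step 4: y = A(x - mu) for every point

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated 5-D toy data
Y, A, mu = pca(X, k=2)        # compressed 2-D codes
X_hat = Y @ A + mu            # approximate reconstruction from the codes
```

Because the eigenvectors returned by `eigh` are orthonormal, the columns of `Y` come out uncorrelated, which is the decorrelation property noted above.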

3. Linear Discriminant Analysis (LDA)
  • PCA vs. LDA
    • PCA does not consider "class" information
    • LDA considers "class" information
    • PCA maximizes the total scatter of the projected data
    • LDA maximizes the ratio of projected between-class scatter to projected within-class scatter
  • Within-class scatter (want to minimize) $$\Sigma_w = \sum_{j=1}^{c} \frac{1}{N_j}\sum_{x_i \in C_j} (x_i -\mu_j)(x_i -\mu_j)^T $$ where $C_j$ is the set of $N_j$ points in class $j$
  • Between-class scatter (want to maximize) $$\Sigma_b = \frac{1}{c}\sum_{j=1}^{c} (\mu_j -\mu)(\mu_j -\mu)^T $$
  • Compute the eigenvectors and eigenvalues of $\Sigma_w^{-1}\Sigma_b$ and keep the top $k$ (at most $c-1$ are meaningful) $$\Sigma_w^{-1}\Sigma_b\, v = \lambda v $$
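
A matching NumPy sketch for LDA, under the assumption that $\Sigma_w$ is invertible; the function name `lda` and the toy data are again illustrative:

```python
import numpy as np

def lda(X, y, k):
    """Project N x d data X with labels y onto k discriminant directions."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))                      # within-class scatter
    Sb = np.zeros((d, d))                      # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        Sw += (Xc - mu_c).T @ (Xc - mu_c) / len(Xc)
        diff = (mu_c - mu).reshape(-1, 1)
        Sb += diff @ diff.T
    Sb /= len(classes)
    # eigenproblem Sigma_w^{-1} Sigma_b v = lambda v
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:k]
    return X @ eigvecs[:, order].real          # projected features (N x k)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 3)),     # class 0
               rng.normal(2, 1, (100, 3))])    # class 1, shifted mean
y = np.array([0] * 100 + [1] * 100)
Z = lda(X, y, k=1)   # k <= c - 1, so one direction for two classes
```

Inverting $\Sigma_w$ directly is the simplest route; numerically sturdier variants solve the generalized eigenproblem $\Sigma_b v = \lambda \Sigma_w v$ instead.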