Ensemble (Summary)

1. Ensemble Learning
  • A set of models is integrated in some way to obtain the final prediction
  • Homogeneous : It uses only one induction algorithm
    • SVM1 + SVM2 + SVM3 ...
  • Heterogeneous : It uses different induction algorithms
    • SVM + DT + DNN + Bayesian ...
  • Adds complexity : appears to violate Occam's Razor
    • In practice, however, the combined decision boundary may become simpler than that of any single model
  • Data manipulation : Changes the training set in order to obtain different models
    • Manipulating the input features
    • Sub-sampling from the training set
  • Modeling process manipulation : Changes the induction algorithm
    • Manipulating the parameter sets
    • Manipulating the induction algorithm

  • Combine Models
    • Algebraic method : Average, Weighted Average, etc.
    • Voting method : Majority Voting, Weighted Majority Voting, etc.

  • Base Models : The base classifiers should be as accurate as possible and have diverse errors, with each classifier providing some positive evidence
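The two combiner families above can be sketched in a few lines. This is a minimal illustration, not a library API; the function names `majority_vote` and `weighted_average` are my own.

```python
from collections import Counter

def majority_vote(predictions):
    """Voting combiner: return the most common class label
    among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_average(scores, weights):
    """Algebraic combiner: weighted average of base-model scores."""
    total = sum(w * s for w, s in zip(weights, scores))
    return total / sum(weights)

# Three base classifiers vote on a class label
print(majority_vote(["cat", "dog", "cat"]))              # -> cat
# Weighted average of three score outputs: (2*0.9 + 0.6 + 0.3) / 4
print(weighted_average([0.9, 0.6, 0.3], [2, 1, 1]))      # -> 0.675
```

Weighted variants (weighted average, weighted majority voting) simply replace the uniform weights with per-model weights, e.g. proportional to validation accuracy.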


2. Bagging (Bootstrap AGGregatING)
  • Averaging the prediction over a collection of predictors generated from bootstrap samples
  • On noisy data: not considerably worse than a single model, and more robust
  • Needs unstable (high-variance) classifier types, so that bootstrap samples produce diverse models
  • Decision trees are a typical unstable classifier $\rightarrow$ Random Forest
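The bagging loop can be sketched as follows. This is a toy illustration with assumed helper names (`bootstrap_sample`, `bagging_predict`); the "learner" here just predicts the mean of its training targets, standing in for any real unstable model.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) points with replacement (a bootstrap sample)."""
    return [rng.choice(data) for _ in range(len(data))]

def bagging_predict(train, x, fit, n_models=25, seed=0):
    """Fit one model per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    preds = [fit(bootstrap_sample(train, rng))(x) for _ in range(n_models)]
    return sum(preds) / len(preds)

# Toy 'learner': predicts the mean of the sampled targets, ignoring x
fit_mean = lambda sample: (lambda x: sum(t for _, t in sample) / len(sample))
data = [(0, 1.0), (1, 2.0), (2, 3.0)]
print(bagging_predict(data, x=None, fit=fit_mean))
```

For classification, the final `sum/len` average would be replaced by a majority vote over the models' predicted labels.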


3. Boosting (AdaBoost : Adaptive Boosting)
  • Weighted vote over a collection of classifiers trained sequentially, each on a training distribution that gives priority to instances wrongly classified by the previous ones
  • Focus on difficult examples which are not correctly classified in the previous steps
  • Using Different Data Distribution
    • Start with uniform weighting
    • During each step of learning
      • Not correctly learned by the weak learner $\rightarrow$ Increase weights
      • Correctly learned by the weak learner $\rightarrow$ Decrease weights
      • (Since the weights are relative, applying only one of the two adjustments is sufficient)
  • Risks overfitting the model to misclassified data $\rightarrow$ Use weighted sum/vote
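One AdaBoost reweighting round can be written out concretely. This is a minimal sketch of the standard update (classifier weight $\alpha = \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$ from the weighted error $\epsilon$, then multiplicative up/down-weighting and renormalization); the function name is illustrative.

```python
import math

def adaboost_reweight(weights, correct, error):
    """One AdaBoost round: compute the classifier weight alpha from the
    weighted error, then up-weight misclassified examples, down-weight
    correct ones, and re-normalize so the weights sum to 1."""
    alpha = 0.5 * math.log((1 - error) / error)
    new = [w * math.exp(-alpha if ok else alpha)
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new], alpha

# Start with uniform weighting; example 2 is misclassified (weighted error 0.25)
w, alpha = adaboost_reweight([0.25] * 4, [True, True, False, True], error=0.25)
print([round(x, 3) for x in w])  # misclassified example now carries half the mass
```

After normalization the misclassified example holds weight 1/2 and each correct one 1/6, which is why each subsequent weak learner must "focus on the difficult examples". The final prediction is the $\alpha$-weighted vote of all rounds.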


4. Random Forest
  • A variation of the bagging algorithm
  • Classification : each tree votes and the most popular class is returned
  • Regression : the result is the averaged prediction of all generated trees
  • Construct Random Forest
    • Forest-RI (random input selection) : randomly select a subset of attributes as candidates for the split at each node
    • Forest-RC (random linear combinations): new attributes that are a linear combination of the existing attributes (reduces the correlation between individual classifiers)
  • Faster than bagging or boosting
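The Forest-RI idea at a single node can be sketched as below. This is an assumption-laden toy (the helper name `ri_candidates` is mine); the subset size $\sqrt{F}$ is a common default for classification, not the only choice.

```python
import math
import random

def ri_candidates(n_features, rng):
    """Forest-RI: at each tree node, draw a random subset of feature
    indices (here sqrt(n_features)) as the only split candidates."""
    k = max(1, int(math.sqrt(n_features)))
    return rng.sample(range(n_features), k)

rng = random.Random(42)
print(ri_candidates(16, rng))  # 4 distinct indices drawn from 0..15
```

Restricting each split to a small random candidate set is what decorrelates the trees, and it is also why Random Forest is faster than plain bagging or boosting: each node evaluates only $k \ll F$ attributes.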


5. Other Combining Methods
  • Mixture of Experts : Combine votes or scores
  • Stacking : Combiner $f()$ is another learner
  • Cascading : Use the next classifier in the sequence only when the previous decision is not confident enough
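Stacking and cascading differ only in how the base models are consulted, which a short sketch makes concrete. All names here are illustrative, the "models" are toy lambdas, and the confidence threshold is an assumed parameter.

```python
def stack_predict(base_models, combiner, x):
    """Stacking: the base models' outputs become the input features
    of the combiner f(), which is itself a learned model."""
    meta_features = [m(x) for m in base_models]
    return combiner(meta_features)

def cascade_predict(models, x, threshold=0.8):
    """Cascading: fall through to the next classifier only while the
    current one's confidence is below the threshold."""
    for model in models:
        label, confidence = model(x)
        if confidence >= threshold:
            return label
    return label  # fall back to the last model's decision

# Toy stack: two base scorers, combiner averages their outputs
base = [lambda x: x * 2, lambda x: x + 1]
avg = lambda feats: sum(feats) / len(feats)
print(stack_predict(base, avg, 3))        # (6 + 4) / 2 = 5.0

# Toy cascade: the first model is unsure (0.6 < 0.8), so the second decides
models = [lambda x: ("A", 0.6), lambda x: ("B", 0.9)]
print(cascade_predict(models, None))      # -> B
```

In practice the stacking combiner is trained on out-of-fold base-model predictions to avoid leaking the training labels, and cascades order the classifiers from cheapest to most expensive.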









