Ensemble (Summary)
1. Ensemble Learning
- A set of models is combined in some way to obtain the final prediction
- Homogeneous : It uses only one induction algorithm
- SVM1 + SVM2 + SVM3 ...
- Heterogeneous : It uses different induction algorithms
- SVM + DT + DNN + Bayesian ...
- Adds complexity
- Violation of Occam's Razor
- Yet the combined decision boundary may become simpler (averaging several complex boundaries can smooth them out)
- Data manipulation : Changes the training set in order to obtain different models
- Manipulating the input features
- Sub-sampling from the training set
- Modeling process manipulation : Changes the induction algorithm
- Manipulating the parameter sets
- Manipulating the induction algorithm
- Combine Models
- Algebraic method : Average, Weighted Average, etc.
- Voting method : Majority Voting, Weighted Majority Voting, etc. (a minimal voting sketch follows this section)
- Base Models : The base classifiers should be as accurate as possible and have diverse errors, so that each classifier contributes some positive evidence
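As a concrete illustration of the combining methods above, a minimal sketch of majority voting over a heterogeneous ensemble (SVM + DT + Bayesian), assuming scikit-learn; the dataset and model settings are illustrative, not from the notes:

```python
# Minimal sketch: majority voting over a heterogeneous ensemble.
# Assumes scikit-learn; dataset and model choices are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Heterogeneous base models: SVM + DT + Bayesian, as in the notes.
ensemble = VotingClassifier(
    estimators=[("svm", SVC()), ("dt", DecisionTreeClassifier()), ("nb", GaussianNB())],
    voting="hard",  # "hard" = majority voting; "soft" averages predicted probabilities
)
ensemble.fit(X_tr, y_tr)
print("voting accuracy:", ensemble.score(X_te, y_te))
```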
2. Bagging (Bootstrap AGGregatING)
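Bagging trains each base model on a bootstrap sample (sub-sampling with replacement) of the training set and aggregates their predictions. A minimal sketch assuming a recent scikit-learn (the `estimator` parameter was named `base_estimator` before v1.2); parameter values are illustrative:

```python
# Minimal sketch of bagging: each base model sees a bootstrap resample.
# Assumes scikit-learn; parameter values are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # homogeneous base learner
    n_estimators=50,
    bootstrap=True,  # sub-sampling with replacement from the training set
    random_state=0,
)
bag.fit(X_tr, y_tr)
print("bagging accuracy:", bag.score(X_te, y_te))
```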
3. Boosting (AdaBoost : Adaptive Boosting)
- Weighted vote over a collection of classifiers trained sequentially, where each training set gives higher priority to the instances wrongly classified by earlier classifiers
- Focus on difficult examples which are not correctly classified in the previous steps
- Uses a different data distribution at each round
- Start with uniform weighting
- During each step of learning
- Not correctly learned by the weak learner $\rightarrow$ Increase weights
- Correctly learned by the weak learner $\rightarrow$ Decrease weights
- (Since the weights are relative, it is enough to apply only one of the two updates)
- Risks overfitting the model to misclassified (possibly noisy) data $\rightarrow$ mitigate by using a weighted sum/vote over all rounds
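A minimal sketch of the reweighting loop described above (discrete AdaBoost with decision stumps, labels in {-1, +1}); assumes scikit-learn and NumPy, and the dataset and round count are illustrative:

```python
# Minimal sketch of AdaBoost's weight updates (discrete AdaBoost).
# Assumes scikit-learn and NumPy; details are illustrative, not from the notes.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
y = np.where(y == 1, 1, -1)      # relabel to {-1, +1}

n, T = len(y), 20
w = np.full(n, 1.0 / n)          # start with uniform weighting
stumps, alphas = [], []

for t in range(T):
    stump = DecisionTreeClassifier(max_depth=1)   # weak learner
    stump.fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = np.sum(w[pred != y])                    # weighted training error
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
    # Misclassified -> weight goes up; correctly classified -> weight goes down.
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()                                  # renormalize (weights are relative)
    stumps.append(stump)
    alphas.append(alpha)

# Final prediction: weighted vote over all weak learners.
scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", np.mean(np.sign(scores) == y))
```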
4. Random Forest
- A variation of the bagging algorithm
- Classification : each tree votes and the most popular class is returned
- Regression : the result is the averaged prediction of all generated trees
- Construct Random Forest
- Forest-RI (random input selection) : randomly select a subset of attributes as candidates for the split at each node
- Forest-RC (random linear combinations): new attributes that are a linear combination of the existing attributes (reduces the correlation between individual classifiers)
- Faster than bagging or boosting
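A minimal sketch corresponding to Forest-RI, where each split considers only a random subset of attributes (`max_features` below); assumes scikit-learn, with illustrative parameter values:

```python
# Minimal sketch of Forest-RI style random input selection.
# Assumes scikit-learn; parameter values are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",  # split candidates: a random sqrt(d) subset of attributes
    random_state=0,
)
rf.fit(X_tr, y_tr)
print("random forest accuracy:", rf.score(X_te, y_te))  # majority vote over trees
```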
5. Statistical Validation
- Mixture of Experts : Combine the experts' votes or scores, with input-dependent (gating) weights
- Stacking : Combiner $f()$ is another learner
- Cascading : Use the next-level classifier only when the previous one's decision is not confident enough
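For stacking, where the combiner $f()$ is itself a learner, a minimal sketch assuming scikit-learn's StackingClassifier; the base and combiner model choices are illustrative:

```python
# Minimal sketch of stacking: a logistic-regression combiner f() is trained
# on the base models' out-of-fold predictions. Assumes scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", SVC()), ("dt", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(),  # the combiner f() is another learner
    cv=5,  # inputs to f() come from cross-validated base predictions
)
stack.fit(X_tr, y_tr)
print("stacking accuracy:", stack.score(X_te, y_te))
```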