When I first looked at RF and GBDT, I assumed they were two very similar algorithms, since both are ensemble methods. After studying them more carefully, though, I found they are fundamentally different. Below is a summary of the main differences.
Random Forest:
bagging (short for Bootstrap aggregating)
Recall that the key to bagging is that trees are repeatedly fit to bootstrapped subsets of the observations. One can show that on average, each bagged tree makes use of around two-thirds of the observations.
In other words: the key to bagging is repeatedly fitting trees to bootstrapped subsets of the observations, then averaging their predictions. On average, each bagged tree uses only about 2/3 of the observations; the remaining ~1/3 that a given tree never sees is what makes out-of-bag (OOB) estimation possible.
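The "two-thirds" figure comes from sampling with replacement: the chance an observation is missed by one bootstrap draw of size n is (1 - 1/n)^n → 1/e ≈ 0.368, so each tree sees about 1 - 1/e ≈ 0.632 of the data. A quick simulation (a sketch, using only the standard library) confirms this:

```python
import random

random.seed(0)
n = 10_000       # number of observations
trials = 20      # number of bootstrap samples to average over

fractions = []
for _ in range(trials):
    # one bootstrap sample: n draws with replacement; keep the unique indices
    seen = {random.randrange(n) for _ in range(n)}
    fractions.append(len(seen) / n)

avg = sum(fractions) / trials
print(f"average fraction of unique observations: {avg:.3f}")  # ≈ 0.632
```

The observations a tree never sees are its "out-of-bag" set, which acts as a built-in validation set for that tree.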
training: bootstrap the samples
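To make the training-plus-OOB idea concrete, here is a minimal bagging sketch on a toy 1-D dataset. It is an illustration only: the base learner is a decision stump rather than a full tree, and all names and the noise rate are made up for the example. Each observation's OOB error is computed from votes of only those stumps whose bootstrap sample excluded it.

```python
import random

random.seed(1)

def fit_stump(xs, ys):
    """Fit a decision stump: pick the threshold/direction on x
    that minimizes training misclassifications."""
    best = None
    for t in [i / 20 for i in range(1, 20)]:
        for gt in (True, False):  # predict 1 when x > t, or when x <= t
            err = sum(int(x > t if gt else x <= t) != yy
                      for x, yy in zip(xs, ys))
            if best is None or err < best[0]:
                best = (err, t, gt)
    _, t, gt = best
    return lambda x: int(x > t if gt else x <= t)

# toy data: label = [x > 0.5], flipped with 10% noise
n = 300
X = [random.random() for _ in range(n)]
y = [int((x > 0.5) ^ (random.random() < 0.1)) for x in X]

B = 50                               # number of bagged stumps
votes = [[0, 0] for _ in range(n)]   # OOB votes per observation

for _ in range(B):
    idx = [random.randrange(n) for _ in range(n)]  # bootstrap sample
    stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
    oob = set(range(n)) - set(idx)                 # the ~1/3 left out
    for i in oob:
        votes[i][stump(X[i])] += 1

# OOB error: majority vote using only stumps that never saw observation i
scored = [i for i in range(n) if sum(votes[i]) > 0]
oob_err = sum(int(votes[i][1] > votes[i][0]) != y[i] for i in scored) / len(scored)
print(f"OOB error estimate: {oob_err:.3f}")
```

Because the OOB prediction for each point comes only from trees that never trained on it, the OOB error behaves like a cross-validation estimate and comes essentially for free during training.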