查看“︁集成学习”︁的源代码

{{noteTA
|1=zh-cn:贝叶斯;zh-tw:貝式
|2=zh-cn:阿尔茨海默氏症;zh-tw:阿茲海默症
|3=zh-cn:最优;zh-tw:最佳
|4=zh-cn:遥感;zh-tw:遙測
}}
{{refimprove|time=2019-01-02T20:56:01+00:00}}
{{roughtranslation|time=2019-01-02T20:56:01+00:00}}
在[[统计学]]和[[机器学习]]中，'''集成学习'''（{{lang-en|Ensemble learning}}）方法通过组合多种学习算法来获得比单独使用任何一种算法更好的[[预测]]性能。<ref>{{Cite journal|title=Popular ensemble methods: An empirical study|last=Opitz|first=D.|last2=Maclin|first2=R.|journal=[[Journal of Artificial Intelligence Research]]|doi=10.1613/jair.614|year=1999|volume=11|pages=169&ndash;198}}</ref><ref>{{Cite journal|title=Ensemble based systems in decision making|last=Polikar|first=R.|journal=IEEE Circuits and Systems Magazine|issue=3|doi=10.1109/MCAS.2006.1688199|year=2006|volume=6|pages=21&ndash;45}}</ref><ref name="Rokach2010">{{Cite journal|title=Ensemble-based classifiers|last=Rokach|first=L.|journal=Artificial Intelligence Review|issue=1-2|doi=10.1007/s10462-009-9124-7|year=2010|volume=33|pages=1&ndash;39}}</ref>与统计力学中通常是无限的[[系综]]不同，机器学习中的集成学习由有限的一组模型组成，但这些模型之间通常允许存在更灵活的结构。

== 概述 ==
[[监督学习]]算法通常被描述为在假设空间中搜索，以找到一个能够对特定问题做出良好预测的假设。即使假设空间包含非常适合特定问题的假设，找到一个好的假设也可能很困难。集成学习结合多个假设，形成一个（希望）更好的假设。术语集成通常保留用于使用相同基础学习器生成多个假设的方法。多分类器系统的更广泛术语还包括由非相同基础学习器得到的假设的结合。这种方法和现象也被另一个术语“群智”所描述，该术语来自多个DREAM生物医学数据科学挑战。

评估集成学习的预测通常需要比评估单个模型的预测花费更多的计算，因此集成可以被认为是通过执行大量额外计算来补偿偏差的学习算法的方式。诸如[[决策树]]之类的快速算法通常用于集合方法（如[[随机森林]]），尽管较慢的算法也可以从集成方法中受益。

通过类比，集成技术也已用于[[無監督式學習網路|无监督学习]]场景中，如共识聚类或[[异常检测]]。

== 集成理论 ==
集成学习本身是一种监督学习算法，因为它可以被训练然后用于进行预测。因此，训练后的集成模型代表了一个假设，但这个假设不一定被包含在构建它的模型的假设空间内。因此，可以证明集成学习在它们可以表示的功能方面具有更大的灵活性。理论上，这种灵活性使他们能够比单一模型更多地[[过拟合]]训练数据，但在实践中，一些集成算法（如[[Bagging算法]]）倾向于减少对训练数据过拟合相关的问题。

根据经验，当模型之间存在显著差异时，集成往往会产生更好的结果。<ref><div>Kuncheva,L.和Whitaker,C.,措施的多样性中的分类器的合奏， ''学习机'',51,pp.181-207，2003年</div></ref><ref><div>Sollich,P.和克罗,A., ''学习合唱团：如何过拟合可能是有用的''，先进的神经信息处理系统，第8卷，pp.190-196之，1996年。</div></ref>因此，许多集成方法试图促进它们组合的模型之间的多样性。<ref><div>Brown,G.和Wyatt,J.和Harris,R.和Yao，X.、多样化创作方法：调查和分类., ''信息的融合''，6个(1)，第5-20，2005年。</div></ref><ref><div>''[http://www.clei.cl/cleiej/papers/v8i2p1.pdf 准确性和多样性乐团的文Categorisers] {{Wayback|url=http://www.clei.cl/cleiej/papers/v8i2p1.pdf |date=20110707010133 }}''的。 J.J.García Adeva,Ulises Cerviño，R.Calvo,CLEI Journal,Vol. 8,No.2,pp.1-12月，2005年。</div></ref>尽管可能不是直观的，更随机的算法（如随机决策树）可用于产生比非常有意识的算法（如熵减少决策树）更强大的集成模型。<ref><div>嗬,T.,随机决定的森林， ''诉讼程序的第三次国际会议文件的分析和认识''，pp.278-282，1995年。</div></ref>然而，使用各种强大的学习算法已被证明是比使用试图愚弄模型以促进多样性的技术更有效。<ref><div>Gashler,M.和吉罗的载体，C.和Martinez,T., ''[http://axon.cs.byu.edu/papers/gashler2008icmla.pdf 决定树团：小型异构是更好的比较大的均匀] {{Wayback|url=http://axon.cs.byu.edu/papers/gashler2008icmla.pdf |date=20111006212622 }}''，第七次国际会议上学习机和应用程序，2008年，pp.900-905., [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=4796917 DOI10.1109/ICMLA的。2008年。154]</div></ref>

== 集成模型大小 ==
虽然集成中的组成分类器的数量对预测的准确性具有很大影响，但是解决该问题的研究数量有限。先验地确定集成模型的大小以及大数据流的体积和速度使得这对于在线集成分类器来说更加重要，其中大多数统计测试被用于确定适当数量的组件。最近，理论框架表明对于集成模型存在理想数量的分类器，具有多于或少于该数量的分类器将使精度变差，这被称为“集成构建效果递减规律”。理论框架表明，使用与类标签数相同的独立分类器可以达到最高的准确度。<ref>{{cite conference |url=http://dl.acm.org/citation.cfm?id=2983907 |title=A Theoretical Framework on the Ideal Number of Classifiers for Online Ensembles in Data Streams |last1=R. Bonab |first1=Hamed |last2=Can |first2=Fazli |date=2016 |publisher=ACM |pages=2053 |location=USA |conference=CIKM |access-date=2019-01-01 |archive-date=2018-12-01 |archive-url=https://web.archive.org/web/20181201080722/https://dl.acm.org/citation.cfm?id=2983907 |dead-url=no }}</ref> <ref>{{cite conference |url=https://arxiv.org/pdf/1709.02925.pdf |title=Less Is More: A Comprehensive Framework for the Number of Components of Ensemble Classifiers |last1=R. Bonab |first1=Hamed |last2=Can |first2=Fazli |date=2017 |publisher=IEEE |location=USA |conference=TNNLS |access-date=2019-01-01 |archive-date=2019-06-09 |archive-url=https://web.archive.org/web/20190609081916/https://arxiv.org/pdf/1709.02925.pdf |dead-url=no }}</ref>

== 常见的集成类型 ==

=== 贝叶斯最优分类器 ===
贝叶斯最优分类器是一种分类技术，它是假设空间中所有假设的集合。平均而言，没有其他集成模型可以超越它。<ref><div>[[Tom M. Mitchell|汤姆M.Mitchell]], ''机学习''，1997年，第175</div></ref>朴素贝叶斯最优分类器假定数据在类上有条件地独立并使计算更可行，如果该假设为真，则对每个假设进行投票，该投票与从系统采样训练数据集的可能性成比例。为了促进有限大小的训练数据，每个假设的投票也乘以该假设的先验概率。贝叶斯最优分类器可以用以下等式表示：

: <math>y=\underset{c_j \in C}{\mathrm{argmax}} \sum_{h_i \in H}{P(c_j|h_i)P(T|h_i)P(h_i)}</math>

其中 <math>y</math>是预测标签， <math>C</math>是所有可能类的集合， <math>H</math>是假设空间， <math>P</math>是''概率''， <math>T</math>是训练数据。 作为集成，贝叶斯最优分类器表示不一定在<math>H</math>中的假设。然而，由贝叶斯最优分类器表示的假设是集合空间中的最优假设（所有可能的集合的空间仅由<math>H</math>中的假设组成）。

这个公式可以用[[贝叶斯定理]]重新表述，贝叶斯定理表明后验与先验的可能性成正比：

: <math>P(h_i|T) \propto P(T|h_i)P(h_i)</math>

因此， 

: <math>y=\underset{c_j \in C}{\mathrm{argmax}} \sum_{h_i \in H}{P(c_j|h_i)P(h_i|T)}</math>

=== Bootstrap聚合（Bagging） ===
Bootstrap聚合（Bootstrap Aggregating，[[Bagging算法|Bagging]]）使集成模型中的每个模型在投票时具有相同的权重。為了降低不穩定過程如樹的方差，Bagging對B個(比如說，B使用與類標籤數相同的數量)bootstrap datasets上的模型求平均，從而降低其方差並導致預測性能的改善。例如，[[随机森林|随机森林算法]]将随机决策树与Bagging相结合，以实现更高的分类准确度。<ref><div>Breiman，L.，装袋预测， ''学习机'',24(2)，pp.123-140，1996年。</div></ref>

=== Boosting ===
[[Boosting (meta-algorithm)|Boosting]]通过在训练新模型实例时更注重先前模型错误分类的实例来增量构建集成模型。在某些情况下，Boosting已被证明比Bagging可以得到更好的准确率，不过它也更倾向于对训练数据过拟合。目前比较常见的增强实现有[[AdaBoost]]等算法。

=== 贝叶斯参数平均 ===
贝叶斯参数平均（Bayesian Parameter Averaging，BPA）是一种集成方法，它试图通过对假设空间中的假设进行抽样来近似贝叶斯最优分类器，并使用贝叶斯定律将它们组合起来。<ref>{{Cite journal|title=Bayesian Model Averaging: A Tutorial|url=https://archive.org/details/sim_statistical-science_1999-11_14_4/page/382|last=Hoeting|first=J. A.|last2=Madigan|first2=D.|journal=Statistical Science|issue=4|doi=10.2307/2676803|year=1999|volume=14|pages=382–401|jstor=2676803|pmc=|pmid=|last3=Raftery|first3=A. E.|last4=Volinsky|first4=C. T.}}</ref>与贝叶斯最优分类器不同，贝叶斯模型平均（Bayesian Model Averaging，BMA）可以实际实现。通常使用诸如[[MCMC]]的[[蒙特卡罗方法]]对假设进行采样。例如，可以使用[[吉布斯采样]]来绘制代表分布<math>P(T|H)</math>的假设。已经证明，在某些情况下，当以这种方式绘制假设并根据贝叶斯定律求平均时，该算法具有预期误差，该误差被限制为贝叶斯最优分类器的预期误差的两倍。<ref><div>大卫Haussler，迈克尔*柯恩斯和罗伯特*E Schapire的。 ''边界在这样复杂的贝学习使用信息理论和VC尺寸''的。 学习机，14:83-113，1994年</div></ref>尽管这种技术理论正确，但早期工作中的实验结果表明，与简单的集成方法如Bagging相比，该方法促进了过拟合并且表现更差；<ref>{{cite conference
 |first=Pedro
 |last=Domingos
 |title=Bayesian averaging of classifiers and the overfitting problem
 |conference=Proceedings of the 17th [[International Conference on Machine Learning|International Conference on Machine Learning (ICML)]]
 |pages=223–-230
 |year=2000
 |url=http://www.cs.washington.edu/homes/pedrod/papers/mlc00b.pdf
 |access-date=2019-01-01
 |archive-date=2012-03-02
 |archive-url=https://web.archive.org/web/20120302223624/http://www.cs.washington.edu/homes/pedrod/papers/mlc00b.pdf
 |dead-url=no
 }}</ref> however, these conclusions appear to be based on a misunderstanding of the purpose of Bayesian model averaging vs. model combination.<ref>{{citation
 |first=Thomas
 |last=Minka
 |title=Bayesian model averaging is not model combination
 |year=2002
 |url=http://research.microsoft.com/en-us/um/people/minka/papers/minka-bma-isnt-mc.pdf
 |accessdate=2019-01-01
 |archive-date=2016-07-02
 |archive-url=https://web.archive.org/web/20160702075014/http://research.microsoft.com/en-us/um/people/minka/papers/minka-bma-isnt-mc.pdf
 |dead-url=no
 }}</ref>然而，这些结论似乎是基于对目的的误解贝叶斯模型平均与模型组合。<ref>{{Citation|last=Minka|first=Thomas|title=Bayesian model averaging is not model combination|url=http://research.microsoft.com/en-us/um/people/minka/papers/minka-bma-isnt-mc.pdf|year=2002|accessdate=2019-01-01|archive-date=2016-07-02|archive-url=https://web.archive.org/web/20160702075014/http://research.microsoft.com/en-us/um/people/minka/papers/minka-bma-isnt-mc.pdf|dead-url=no}}</ref>此外，BMA的理论和实践取得了相当大的进展，最近的严格证明证明了BMA在高维设置中变量选择和估计的准确性，<ref>{{Cite journal|title=Bayesian linear regression with sparse priors|last=Castillo|first=I.|last2=Schmidt-Hieber|first2=J.|journal=[[Annals of Statistics]]|issue=5|doi=10.1214/15-AOS1334|year=2015|volume=43|pages=1986–2018|arxiv=1403.0735|last3=van der Vaart|first3=A.}}</ref>并提供了实验证据，强调了BMA中的稀疏执行先验在缓解过拟合方面的作用。<ref>{{Cite journal|title=Generalized Spike-and-Slab Priors for Bayesian Group Feature Selection Using Expectation Propagation|url=http://www.jmlr.org/papers/volume14/hernandez-lobato13a/hernandez-lobato13a.pdf|last=Hernández-Lobato|first=D.|last2=Hernández-Lobato|first2=J. M.|journal=Journal of Machine Learning Research|issue=|doi=|year=2013|volume=14|pages=1891–1945|last3=Dupont|first3=P.|access-date=2019-01-01|archive-date=2019-07-28|archive-url=https://web.archive.org/web/20190728190509/http://jmlr.org/papers/volume14/hernandez-lobato13a/hernandez-lobato13a.pdf|dead-url=no}}</ref>

=== 贝叶斯模型组合 ===
贝叶斯模型组合（BMC）是对贝叶斯模型平均（BMA）的算法校正。 它不是单独对整体中的每个模型进行采样，而是从可能的集合空间中进行采样（模型权重从具有均匀参数的Dirichlet分布中随机抽取） 这种修改克服了BMA趋向于将所有权重赋予单个模型的趋势 尽管BMC在计算上比BMA更昂贵，但它往往会产生显着更好的结果 BMC的结果显示平均值优于（具有统计显着性）BMA和Bagging。<ref>{{cite conference
|author=Monteith, Kristine
|author2=Carroll, James
|author3=Seppi, Kevin
|author4=Martinez, Tony.
|url=http://axon.cs.byu.edu/papers/Kristine.ijcnn2011.pdf
|title=Turning Bayesian Model Averaging into Bayesian Model Combination
|conference=Proceedings of the International Joint Conference on Neural Networks IJCNN'11
|year=2011
|pages=2657&ndash;2663
|access-date=2019-01-01
|archive-date=2014-07-05
|archive-url=https://web.archive.org/web/20140705232035/http://axon.cs.byu.edu/papers/Kristine.ijcnn2011.pdf
|dead-url=no
}}</ref>

使用贝叶斯定律来计算模型权重需要计算给定每个模型的数据的概率，通常集成中的模型都不是生成训练数据的分布，因此对于该项，它们都正确地接收到接近于零的值。如果集成足够大以对整个模型空间进行采样，这将很有效，但这种情况很少发生。因此，训练数据中的每个模式将使集成权重朝向最接近训练数据分布的集合中的模型移动，这实质上减少了用于进行模型选择的不必要的复杂方法。

集成的可能权重可以看作是躺在单面上，在单形的每个顶点处，所有权重都被赋予集成中的单个模型。BMA会聚到最接近训练数据分布的顶点。相比之下，BMC汇聚到这种分布投射到单纯形态的点上。换句话说，它不是选择最接近生成分布的一个模型，而是寻找最接近生成分布的模型的组合。

BMA的结果通常可以通过使用交叉验证从一系列模型中选择最佳模型来近似。同样地，可以通过使用交叉验证来近似来自BMC的结果，以从可能的权重的随机采样中选择最佳的集成组合。

=== 桶模型 ===
“桶模型”（英语：bucket of models）是一种使用模型选择算法为每个问题选择最佳模型的集成方法。当仅使用一个问题进行测试时，一组模型不会产生比集成中的最佳模型更好的结果，但是当针对许多问题进行评估时，它通常会产生比集成中的任何模型更好的结果。

最常见的方法用于模型的选择是[[交叉驗證|交叉验证]]。它用以下伪代码描述：
 For each model m in the bucket:
   Do c times: (where 'c' is some constant)
     Randomly divide the training dataset into two datasets: A, and B.
     Train m with A
     Test m with B
 Select the model that obtains the highest average score
交叉验证选择可以概括为：“使用训练集尝试所有选择，并选择最有效的方法”。<ref><div>肯定Dzeroski，伯纳德善光，可 ''[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.6096 是相结合的分类更好于选择最好的一个] {{Wayback|url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.6096 |date=20090906081905 }}''机学,2004,pp.255--273</div></ref>

门控是交叉验证选择的一般化。它涉及训练另一种学习模型，以确定桶中哪些模型最适合解决问题。通常，[[感知器]]被应用于门控模型。它可用于选择“最佳”模型，或者可用于为桶中每个模型的预测提供线性权重。

当使用具有大量问题的桶模型时，可能希望避免需要花费很长时间训练的一些模型。地标学习是一种寻求解决这一问题的元学习方法，它涉及仅训练桶中的快速（但不精确）算法，然后使用这些算法的性能来帮助确定哪种慢（但准确）算法最有可能做得最好。<ref><div>Bensusan,Hilan和吉罗载Christophe G.,发现任务区通过里程碑式的学习表演，PKDD'00:诉讼程序的第4次欧洲会议关于原则的数据挖掘和知识发现，Springer-Verlag，2000年，第325页--330</div></ref>

=== Stacking ===
堆叠（英语：Stacking）（有时称为堆叠泛化）涉及训练学习算法以组合其他几种学习算法的预测。首先，使用可用数据训练所有其他算法，然后训练组合器算法以使用其他算法的所有预测作为附加输入进行最终预测。如果使用任意组合器算法，那么堆叠理论上可以表示本文中描述的任何集合技术，但实际上，通常用[[邏輯斯諦迴歸]]模型作为组合器。

Stacking通常比任何一个经过训练的模型都能产生更好的性能，<ref><div>沃伯特，D.， ''堆叠的概括。''的， 神经网络,5(2),pp.241-259., 1992年</div></ref>它已成功用于监督学习任务（如回归、<ref>Breiman, L., ''Stacked Regression'', Machine Learning, 24, 1996 {{Doi|10.1007/BF00117832}}</ref> 分类和距离学习 <ref>{{Cite journal|title=A New Fuzzy Stacked Generalization Technique and Analysis of its Performance|last=Ozay|first=M.|last2=Yarman Vural|first2=F. T.|year=2013|arxiv=1204.0171|bibcode=2012arXiv1204.0171O}}</ref>）和无监督学习（如密度估计）。<ref><div>史密斯，P.和沃伯特,D.H ''的直线相结合的密度估计通过叠''，机器
学习的期刊，36,59-83，1999年</div></ref> Stacking也被用于评估Bagging的错误率。<ref name="Rokach2010">{{Cite journal|title=Ensemble-based classifiers|last=Rokach|first=L.|journal=Artificial Intelligence Review|issue=1-2|doi=10.1007/s10462-009-9124-7|year=2010|volume=33|pages=1&ndash;39}}</ref><ref><div>沃伯特，D.，和麦克瑞德，W.G.， ''一个有效的方法来估计装袋的概括错误''、学习机杂志，35,41-55来，1999年</div></ref> 据报道，它的表现超过了贝叶斯模型的平均值。<ref><div>克拉克,B., ''Bayes模型平均和堆叠当的模式近似的错误不可忽视''，Journal of机学习的研究，pp683-712，2003年</div></ref>在Netflix竞赛中两个表现最好的人使用混合方法（英语：Blending），这可以被认为是一种Stacking形式。<ref>{{Cite journal|title=Feature-Weighted Linear Stacking|last=Sill|first=J.|last2=Takacs|first2=G.|year=2009|arxiv=0911.0460|bibcode=2009arXiv0911.0460S|last3=Mackey|first3=L.|last4=Lin|first4=D.}}</ref>

== 实现库 ==

* [[R语言|R]]：至少有三个软件包提供贝叶斯模型平均工具，<ref>{{Cite journal|title=Bayesian model averaging in R|url=https://core.ac.uk/download/pdf/6494889.pdf|last=Amini|first=Shahram M.|last2=Parmeter|first2=Christopher F.|journal=Journal of Economic and Social Measurement|issue=4|doi=|year=2011|volume=36|pages=253–287|access-date=2019-01-01|archive-date=2017-12-03|archive-url=https://web.archive.org/web/20171203173517/https://core.ac.uk/download/pdf/6494889.pdf|dead-url=no}}</ref>包括BMS（贝叶斯模型选择）包、<ref>{{Cite web|url=https://cran.r-project.org/web/packages/BMS/index.html|title=BMS: Bayesian Model Averaging Library|accessdate=September 9, 2016|work=The Comprehensive R Archive Network|archive-date=2020-11-28|archive-url=https://web.archive.org/web/20201128112749/https://cran.r-project.org/web/packages/BMS/index.html|dead-url=no}}</ref>BAS（贝叶斯自适应采样的首字母缩写）包、<ref>{{Cite web|url=https://cran.r-project.org/web/packages/BAS/index.html|title=BAS: Bayesian Model Averaging using Bayesian Adaptive Sampling|accessdate=September 9, 2016|work=The Comprehensive R Archive Network|archive-date=2020-10-07|archive-url=https://web.archive.org/web/20201007160436/https://cran.r-project.org/web/packages/BAS/index.html|dead-url=no}}</ref>和BMA包。<ref>{{Cite web|url=https://cran.r-project.org/web/packages/BMA/index.html|title=BMA: Bayesian Model Averaging|accessdate=September 9, 2016|work=The Comprehensive R Archive Network|archive-date=2021-05-07|archive-url=https://web.archive.org/web/20210507035034/https://cran.r-project.org/web/packages/BMA/index.html|dead-url=no}}</ref>H2O包提供了许多机器学习模型，包括一个集成模型，也可以使用[[Apache Spark|Spark]]进行训练。
* [[Python]]：[[Scikit-learn]]，一个用于Python机器学习的软件包，提供用于集成学习的软件包，包括用于Bagging和平均方法的软件包。
* [[MATLAB]]：分类集成在统计和机器学习工具箱中实现。<ref>{{Cite web|url=https://uk.mathworks.com/help/stats/classification-ensembles.html|title=Classification Ensembles|accessdate=June 8, 2017|work=MATLAB & Simulink|archive-date=2020-12-01|archive-url=https://web.archive.org/web/20201201124509/https://uk.mathworks.com/help/stats/classification-ensembles.html|dead-url=no}}</ref>

== 集成学习应用 ==
近年来，由于计算能力不断提高，允许在合理的时间范围内训练大型集成模型，其应用数量也越来越多。<ref name="s1">{{Cite journal|title=A survey of multiple classifier systems as hybrid systems|last=Woźniak|first=Michał|last2=Graña|first2=Manuel|date=March 2014|journal=Information Fusion|doi=10.1016/j.inffus.2013.04.006|volume=16|pages=3–17|last3=Corchado|first3=Emilio}}</ref>集成分类器的一些应用包括：

=== 遥感 ===

==== 土地覆盖测绘 ====
土地覆盖测绘是[[地球观测卫星]]传感器的主要应用之一，利用[[遥感]]和地理空间数据识别位于目标区域表面的材料和物体。一般来说，目标材料的类别包括道路、建筑物、河流、湖泊和植被。<ref name="rodriguez">{{Cite journal|title=An assessment of the effectiveness of a random forest classifier for land-cover classification|url=https://archive.org/details/sim_isprs-journal-of-photogrammetry-and-remote-sensing_2012-01_67/page/93|last=Rodriguez-Galiano|first=V.F.|last2=Ghimire|first2=B.|date=January 2012|journal=ISPRS Journal of Photogrammetry and Remote Sensing|doi=10.1016/j.isprsjprs.2011.11.002|volume=67|pages=93–104|bibcode=2012JPRS...67...93R|last3=Rogan|first3=J.|last4=Chica-Olmo|first4=M.|last5=Rigol-Sanchez|first5=J.P.}}</ref>基于[[人工神经网络]]<ref>{{Cite journal|title=Design of effective neural network ensembles for image classification purposes|last=Giacinto|first=Giorgio|last2=Roli|first2=Fabio|date=August 2001|journal=Image and Vision Computing|issue=9-10|doi=10.1016/S0262-8856(01)00045-2|volume=19|pages=699–707}}</ref>、[[核主成分分析]]（KPCA）<ref>{{Cite journal|title=A novel ensemble classifier of hyperspectral and LiDAR data using morphological features|last=Xia|first=Junshi|last2=Yokoya|first2=Naoto|date=March 2017|journal=2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)|doi=10.1109/ICASSP.2017.7953345|pages=6185-6189|last3=Iwasaki|first3=Yakira}}</ref>、[[Boosting (meta-algorithm)|Boosting]]<ref>{{Cite journal|title=Accuracy comparison of land cover mapping using the object-oriented image classification with machine learning algorithms|last=Mochizuki|first=S.|last2=Murakami|first2=T.|date=November 2012|journal=33rd Asian Conference on Remote Sensing 2012, ACRS 2012|volume=1|pages=126-133}}</ref>决策树、[[随机森林]]<ref name="rodriguez" />和自动设计多分类器系统<ref>{{Cite journal|title=Design of effective multiple classifier systems by clustering of classifiers|url=https://archive.org/details/multipleclassifi00kitt|last=Giacinto|first=G.|last2=Roli|first2=F.|date=September 2000|journal=Proceedings 15th International Conference on Pattern Recognition. ICPR-2000|doi=10.1109/ICPR.2000.906039|pages=[https://archive.org/details/multipleclassifi00kitt/page/n166 160]-163|last3=Fumera|first3=G.}}</ref>等不同的集成学习方法可以有效识别土地覆盖物。

==== 变化的检测 ====
变化检测是一种[[图像分析]]问题，识别土地覆盖随时间变化的地方。变化检测广泛应用于[[城市發展|城市发展]]、森林和植被动态、土地利用和灾害监测等领域。<ref name="s2">{{Cite journal|title=Information fusion techniques for change detection from multi-temporal remote sensing images|last=Du|first=Peijun|last2=Liu|first2=Sicong|date=January 2013|journal=Information Fusion|issue=1|doi=10.1016/j.inffus.2012.05.003|volume=14|pages=19–27|last3=Xia|first3=Junshi|last4=Zhao|first4=Yindi}}</ref>集成分类器在变化检测中的最早应用是通过多数投票、贝叶斯平均和最大后验概率设计的。<ref>{{Cite journal|title=Combining parametric and non-parametric algorithms for a partially unsupervised classification of multitemporal remote-sensing images|last=Bruzzone|first=Lorenzo|last2=Cossu|first2=Roberto|date=December 2002|journal=Information Fusion|issue=4|doi=10.1016/S1566-2535(02)00091-X|volume=3|pages=289–297|last3=Vernazza|first3=Gianni}}</ref>

=== 计算机安全 ===

==== 分布式拒绝服务 ====
[[分布式拒绝服务]]攻击是互联网服务提供商可能遭受的最具威胁性的网络攻击之一。<ref name="s1">{{Cite journal|title=A survey of multiple classifier systems as hybrid systems|last=Woźniak|first=Michał|last2=Graña|first2=Manuel|date=March 2014|journal=Information Fusion|doi=10.1016/j.inffus.2013.04.006|volume=16|pages=3–17|last3=Corchado|first3=Emilio}}</ref>通过组合单个分类器的输出，集成分类器减少了检测和区分此类攻击与[[Slashdot效应]]的总误差。<ref>{{Cite journal|title=Distributed denial of service attack detection using an ensemble of neural classifier|last=Raj Kumar|first=P. Arun|last2=Selvakumar|first2=S.|date=July 2011|journal=Computer Communications|issue=11|doi=10.1016/j.comcom.2011.01.012|volume=34|pages=1328–1341}}</ref>

==== 恶意软件检测 ====
使用机器学习技术对[[计算机病毒]]、[[计算机蠕虫]]、[[特洛伊木马 (电脑)|特洛伊木马]]、[[勒索軟體|勒索软件]]和[[间谍软件]]等[[恶意软件]]代码进行分类，其灵感来自[[文档分类|文本分类问题]]。<ref>{{Cite journal|title=Detection of malicious code by applying machine learning classifiers on static features: A state-of-the-art survey|last=Shabtai|first=Asaf|last2=Moskovitch|first2=Robert|date=February 2009|journal=Information Security Technical Report|issue=1|doi=10.1016/j.istr.2009.03.003|volume=14|pages=16–29|last3=Elovici|first3=Yuval|last4=Glezer|first4=Chanan}}</ref> 集成学习系统在这方面已经显示出适当的功效。<ref>{{Cite journal|title=Malicious Codes Detection Based on Ensemble Learning|last=Zhang|first=Boyun|last2=Yin|first2=Jianping|date=2007|journal=Autonomic and Trusted Computing|doi=10.1007/978-3-540-73547-2_48|pages=468-477|last3=Hao|first3=Jingbo|last4=Zhang|first4=Dingxing|last5=Wang|first5=Shulin}}</ref><ref>{{Cite journal|title=Improving malware detection by applying multi-inducer ensemble|last=Menahem|first=Eitan|last2=Shabtai|first2=Asaf|date=February 2009|journal=Computational Statistics & Data Analysis|issue=4|doi=10.1016/j.csda.2008.10.015|volume=53|pages=1483–1494|last3=Rokach|first3=Lior|last4=Elovici|first4=Yuval}}</ref>

==== 入侵检测 ====
[[入侵检测系统]]监控[[计算机网络]]或[[计算机系统]]，以识别入侵者代码，如异常检测过程。集成学习成功地帮助这种监控系统减少了它们的总误差。<ref>{{Cite journal|title=FLIPS: Hybrid Adaptive Intrusion Prevention|last=Locasto|first=Michael E.|last2=Wang|first2=Ke|date=2005|journal=Recent Advances in Intrusion Detection|doi=10.1007/11663812_5|pages=82-101|last3=Keromytis|first3=Angeles D.|last4=Salvatore|first4=J. Stolfo}}</ref><ref>{{Cite journal|title=Intrusion detection in computer networks by a modular ensemble of one-class classifiers|last=Giacinto|first=Giorgio|last2=Perdisci|first2=Roberto|date=January 2008|journal=Information Fusion|issue=1|doi=10.1016/j.inffus.2006.10.002|volume=9|pages=69–82|last3=Del Rio|first3=Mauro|last4=Roli|first4=Fabio}}</ref>

=== 人脸识别 ===
人脸识别最近已经成为最受欢迎的模式识别研究领域之一，它通过他/她的数字图像来处理人的识别或验证。<ref>{{Cite journal|title=Weighted voting-based ensemble classifiers with application to human face recognition and voice recognition|last=Mu|first=Xiaoyan|last2=Lu|first2=Jiangfeng|date=July 2009|journal=2009 International Joint Conference on Neural Networks|doi=10.1109/IJCNN.2009.5178708|last3=Watta|first3=Paul|last4=Hassoun|first4=Mohamad H.}}</ref>

基于Gabor Fisher分类器和独立分量分析预处理技术的分层集成是该领域中最早使用的一些集成方法。<ref>{{Cite journal|title=Hierarchical ensemble of Gabor Fisher classifier for face recognition|last=Yu|first=Su|last2=Shan|first2=Shiguang|date=April 2006|journal=Automatic Face and Gesture Recognition, 2006. FGR 2006. 7th International Conference on Automatic Face and Gesture Recognition (FGR06)|doi=10.1109/FGR.2006.64|pages=91-96|last3=Chen|first3=Xilin|last4=Gao|first4=Wen}}</ref><ref>{{Cite journal|title=Patch-based gabor fisher classifier for face recognition|last=Su|first=Y.|last2=Shan|first2=S.|date=September 2006|journal=Proceedings - International Conference on Pattern Recognition|doi=10.1109/ICPR.2006.917|volume=2|pages=528-531|last3=Chen|first3=X.|last4=Gao|first4=W.}}</ref><ref>{{Cite journal|title=Ensemble Classification Based on ICA for Face Recognition|url=https://ieeexplore.ieee.org/document/4566462/?part=1%7Csec1|last=Liu|first=Yang|last2=Lin|first2=Yongzheng|date=July 2008|journal=Proceedings - 1st International Congress on Image and Signal Processing, IEEE Conference, CISP 2008|doi=10.1109/CISP.2008.581|pages=144-148|last3=Chen|first3=Yuehui|access-date=2019-01-01|archive-date=2019-06-09|archive-url=https://web.archive.org/web/20190609022930/https://ieeexplore.ieee.org/document/4566462/?part=1%7Csec1|dead-url=no}}</ref>

=== 情感识别 ===
语音识别主要基于深度学习，因为谷歌、微软和IBM这一领域的大多数业内人士都表示，他们的语音识别的核心技术是基于这种方法。基于语音与集成学习的情感识别也可以有令人满意的表现。<ref>{{Cite journal|title=Speech based emotion recognition using spectral feature extraction and an ensemble of kNN classifiers|last=Rieger|first=Steven A.|last2=Muraleedharan|first2=Rajani|date=2014|journal=Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, ISCSLP 2014|doi=10.1109/ISCSLP.2014.6936711|pages=589-593|last3=Ramachandran|first3=Ravi P.}}</ref><ref>{{Cite journal|title=Comparing Multiple Classifiers for Speech-Based Detection of Self-Confidence - A Pilot Study|last=Krajewski|first=Jarek|last2=Batliner|first2=Anton|date=October 2010|journal=2010 20th International Conference on Pattern Recognition|doi=10.1109/ICPR.2010.905|pages=3716-3719|last3=Kessel|first3=Silke}}</ref>

它也被成功用于面部情绪识别。<ref>{{Cite journal|title=Recognize the facial emotion in video sequences using eye and mouth temporal Gabor features|last=Rani|first=P. Ithaya|last2=Muneeswaran|first2=K.|date=25 May 2016|journal=Multimedia Tools and Applications|issue=7|doi=10.1007/s11042-016-3592-y|volume=76|pages=10017–10040}}</ref><ref>{{Cite journal|title=Facial Emotion Recognition Based on Eye and Mouth Regions|last=Rani|first=P. Ithaya|last2=Muneeswaran|first2=K.|date=August 2016|journal=International Journal of Pattern Recognition and Artificial Intelligence|issue=07|doi=10.1142/S021800141655020X|volume=30|pages=1655020}}</ref><ref>{{Cite journal|title=Emotion recognition based on facial components|last=Rani|first=P. Ithaya|last2=Muneeswaran|first2=K|date=28 March 2018|journal=Sādhanā|issue=3|doi=10.1007/s12046-018-0801-6|volume=43}}</ref>

=== 欺诈检测 ===
欺诈检测涉及银行欺诈的识别，例如[[洗錢|洗钱]]、信用卡欺诈和[[电信诈骗|电信欺诈]]，它们具有广泛的机器学习研究和应用领域。由于集成学习提高了正常行为建模的稳健性，因此有人提出将其作为检测银行和信用卡系统中此类欺诈案件和活动的有效技术。<ref>{{Cite journal|title=Bagging k-dependence probabilistic networks: An alternative powerful fraud detection tool|last=Louzada|first=Francisco|last2=Ara|first2=Anderson|date=October 2012|journal=Expert Systems with Applications|issue=14|doi=10.1016/j.eswa.2012.04.024|volume=39|pages=11583–11592}}</ref><ref>{{Cite journal|title=A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance|last=Sundarkumar|first=G. Ganesh|last2=Ravi|first2=Vadlamani|date=January 2015|journal=Engineering Applications of Artificial Intelligence|doi=10.1016/j.engappai.2014.09.019|volume=37|pages=368–377}}</ref>

=== 金融决策 ===
预测业务失败的准确性是财务决策中非常关键的问题。因此，不同的集成分类器被提出用于预测[[金融危机]]和财务困境。<ref name="#1">{{Cite journal|title=Stock fraud detection using peer group analysis|last=Kim|first=Yoonseong|last2=Sohn|first2=So Young|date=August 2012|journal=Expert Systems with Applications|issue=10|doi=10.1016/j.eswa.2012.02.025|volume=39|pages=8986–8992}}</ref>此外，在基于交易的操纵问题中，交易者试图通过买卖活动来操纵[[股價指數|股票价格]]，集成分类器需要分析股票市场数据的变化并检测股票价格操纵的可疑症状。<ref name="#1"/>

=== 医学生
集成分类器已成功应用于[[脑机接口|脑-机接口]]、[[蛋白质组学]]和[[醫學診斷|医学诊断]]，例如基于MRI数据集的神经认知障碍（即[[阿尔茨海默氏症]]或肌强直性营养不良）检测。<ref>{{Cite journal|title=Neurocognitive disorder detection based on feature vectors extracted from VBM analysis of structural MRI|last=Savio|first=A.|last2=García-Sebastián|first2=M.T.|date=August 2011|journal=Computers in Biology and Medicine|issue=8|doi=10.1016/j.compbiomed.2011.05.010|volume=41|pages=600–610|last3=Chyzyk|first3=D.|last4=Hernandez|first4=C.|last5=Graña|first5=M.|last6=Sistiaga|first6=A.|last7=López de Munain|first7=A.|last8=Villanúa|first8=J.}}</ref><ref>{{Cite journal|title=Meta-ensembles of classifiers for Alzheimer's disease detection using independent ROI features|last=Ayerdi|first=B.|last2=Savio|first2=A.|date=June 2013|journal=Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|issue=Part 2|doi=10.1007/978-3-642-38622-0_13|pages=122-130|last3=Graña|first3=M.}}</ref><ref>{{Cite journal|title=An ensemble classifier based prediction of G-protein-coupled receptor classes in low homology|last=Gu|first=Quan|last2=Ding|first2=Yong-Sheng|date=April 2015|journal=Neurocomputing|doi=10.1016/j.neucom.2014.12.013|volume=154|pages=110–118|last3=Zhang|first3=Tong-Liang}}</ref>


== 参考文献 ==
<references></references>

[[Category:集成学习| ]]
[[Category:机器学习]]
[[Category:有未审阅翻译的页面]]