查看“︁流形正则化”︁的源代码
←
流形正则化
跳转到导航
跳转到搜索
因为以下原因,您没有权限编辑该页面:
您请求的操作仅限属于该用户组的用户执行:
用户
您可以查看和复制此页面的源代码。
[[File:Example of unlabeled data in semisupervised learning.png|thumb|250px|标记数据(黑、白圆圈)稀疏时,流形正则化可利用无标数据(灰色圆圈)将数据分类。无大量标记点时,[[监督学习]]算法智能学习非常简单的决策边界(上图)。基于邻点很可能属于同一类的假设,决策边界应避开含大量未标记点的区域。这也就是一种[[半监督学习]]。]] [[机器学习]]中,'''流形正则化'''(Manifold regularization)是一种利用数据集形状以约束应在数据集上被学习的函数的技术。在很多机器学习问题中,待学习数据不能涵盖整个输入空间。例如,[[人脸识别系统]]不需要分类所有图像,只需分类包含人脸的图像。流形学习技术假定相关数据子集来自[[流形]],是一种具有有用属性的数学结构;且待学习函数是光滑的,即不同标签的数据不应靠在一起,即在有大量数据的区域,标签函数不应快速变化。这样,流形正则化算法便可利用无标数据,通过推广的[[吉洪诺夫正则化]]推断哪些区域允许待学习函数快速变化,哪些区域不允许。流形正则化算法可将[[监督学习]]算法推广到[[半监督学习]]和[[转导 (机器学习)|转导]],因为当中有无标数据。流形正则化技术已被应用于医学成像、大地成像与物体识别等领域。 == 流形正则器 == === 动机 === 流形正则化是[[正则化 (数学)|正则化]]的一种。正则化是通过惩罚复杂解,以减少[[过拟合]]、确保问题[[适定性问题|良置]]的一系列技术。具体说,流形正则化扩展了应用于[[再生核希尔伯特空间]](RKHSs)的[[吉洪诺夫正则化]]。在RKHS的标准吉洪诺夫正则化下,学习算法试图从函数<math>\mathcal{H}</math>的假设空间中学习函数''f''。假设空间是RKHS,就是说与[[核方法|核]]''K''相关联,于是候选函数''f''都有[[范数]]<math>\left\| f \right\|_K</math>,代表候选函数在假设空间中的复杂度。算法会考虑候选函数的范数,以惩罚复杂函数。 形式化:给定一组有标训练数据<math>(x_1, y_1), \ldots, (x_{\ell}, y_{\ell})</math>,其中<math>x_i \in X, y_i \in Y</math>,以及[[损失函数]]''V''。基于吉洪诺夫正则化的学习算法将试图求解 : <math> \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} V(f(x_i), y_i) + \gamma \left\| f \right\|_K^2 </math> 其中<math>\gamma</math>是[[超参数优化|超参数]],用于控制算法对简单函数与更能拟合数据的函数的偏好。 [[File:Swissroll manifold unrolled.png|thumb|300x300px|嵌入3维空间的2维[[流形]](左)。流形正则化试图学习在展开流形上光滑的函数(右)。]] 流形正则化在标准吉洪诺夫正则化的环境正则项(ambient regularizer)上增加了第二个正则化项——内蕴正则项(intrinsic regularizer)。在[[流形假设]]下,数据不是来自整个输入空间''X'',而是来自非线性[[流形]]<math>M\subset X</math>。流形(即内蕴空间)的几何用于确定正则化范数。<ref name="Belkin et al. 2006">{{Cite journal| volume = 7| pages = 2399–2434| last1 = Belkin| first1 = Mikhail| last2 = Niyogi| first2 = Partha| last3 = Sindhwani| first3 = Vikas| title = Manifold regularization: A geometric framework for learning from labeled and unlabeled examples| journal = The Journal of Machine Learning Research| access-date = 2015-12-02| date = 2006| url = http://dl.acm.org/citation.cfm?id=1248632}}</ref> === 拉普拉斯范数 === 内蕴正则项<math>\left\| f \right\|_I</math>有很多选择。如[[微分几何|流形上的梯度]]<math> \nabla_{M} </math>,可以衡量目标函数的光滑程度。光滑函数应在输入数据密集处变化较慢,即梯度<math> \nabla_{M} f(x) </math>与边际概率密度(marginal probability density)<math>\mathcal{P}_X(x) </math>(随机选定的数据点落在''x''处的[[概率密度]])呈负相关。这就为内蕴正则项提供了合适的选择: : <math> \left\| f \right\|_I^2 = \int_{x \in M} \left\| \nabla_{M} f(x) \right\|^2 \, d \mathcal{P}_X(x) </math> 实践中,由于边际概率密度<math>\mathcal{P}_X</math>未知,无法直接计算范数,但可根据数据进行估计。 === 基于图的拉普拉斯范数 === 将输入点间距解释为图,图的[[拉普拉斯矩阵]]就可帮助估计边际分布。假设输入数据包括<math>\ell</math>个有标例子(输入''x''与标签''y''的点对)、''u''个无标例子(无对应标签的输入)。定义''W''为图的边权重矩阵,<math>W_{ij}</math>是数据点<math>x_i,\ x_j</math>间的距离。定义''D''为对角矩阵,其中<math>D_{ii} = \sum_{j=1}^{\ell + u} W_{ij}</math>。''L''是拉普拉斯矩阵<math>D-W</math>。则,随着数据点数<math>\ell + u</math>增加,''L''将收敛于[[拉普拉斯-贝尔特拉米算子]]<math>\Delta_{M}</math>,其是梯度<math>\nabla_M</math>的[[散度]]。<ref name="Hein et al. 2005">{{Cite book | publisher = Springer | pages = 470–485 | last1 = Hein | first1 = Matthias | last2 = Audibert | first2 = Jean-Yves | last3 = Von Luxburg | first3 = Ulrike | author3-link = Ulrike von Luxburg | title = Learning theory | volume = 3559 | chapter = From graphs to manifolds–weak and strong pointwise consistency of graph laplacians | date = 2005 | doi = 10.1007/11503415_32 | series = Lecture Notes in Computer Science | isbn = 978-3-540-26556-6 | citeseerx = 10.1.1.103.82 }}</ref><ref>{{Cite book| publisher = Springer | pages = 486–500 | last1 = Belkin | first1 = Mikhail | last2 = Niyogi | first2 = Partha | title = Learning theory | volume = 3559 | chapter = Towards a theoretical foundation for Laplacian-based manifold methods | date = 2005 | doi = 10.1007/11503415_33 | series = Lecture Notes in Computer Science | isbn = 978-3-540-26556-6 | citeseerx = 10.1.1.127.795 }}</ref>则若<math>\mathbf{f}</math>是''f''在数据处的值向量,<math>\mathbf{f} = [f(x_1), \ldots, f(x_{l+u})]^{\mathrm{T}}</math>,则就可估计内蕴范数: : <math> \left\| f \right\|_I^2 = \frac{1}{(\ell+u)^2} \mathbf{f}^{\mathrm{T}} L \mathbf{f} </math> 随着数据点数<math>\ell + u</math>增加,<math> \left\| f \right\|_I^2</math>的经验定义会收敛到已知<math>\mathcal{P}_X</math>时的定义。<ref name="Belkin et al. 2006" /> === 基于图的方法解正则化问题 === 用权重<math>\gamma_A,\ \gamma_I</math>作为环境正则项和内蕴正则项,最终的待解表达式变为 : <math> \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} V(f(x_i), y_i) + \gamma_A \left\| f \right\|_K^2 + \frac{\gamma_I}{(\ell+u)^2} \mathbf{f}^{\mathrm{T}} L \mathbf{f} </math> 与其他[[核方法]]类似,<math>\mathcal{H}</math>可能是无限维空间。因此,若正则化表达式无法明确求解,就不可能在整个空间中搜索解;相反,[[表示定理]]表明,在选择范数<math>\left\| f \right\|_I</math>的特定条件下,最优解<math>f^*</math>必须是以每个输入点为中心的核的线性组合:对某些权重<math>\alpha_i</math> : <math> f^*(x) = \sum_{i=1}^{\ell + u} \alpha_i K(x_i, x) </math> 利用这结果,可在<math>\alpha_i</math>的可能选择定义的有限维空间中搜索最优解<math>f^*</math>。<ref name="Belkin et al. 2006" /> === 拉普拉斯范数的泛函方法 === 图拉普拉斯之外的想法是利用邻域估计拉普拉斯量。这种方法类似于[[K-近邻算法|局部平均法]],但众所周知处理高维问题时扩展性很差。事实上,图拉普拉斯函数会受到[[维数灾难]]影响。<ref name="Hein et al. 2005" /> 幸运的是,通过更先进的泛函分析,可利用函数的预期光滑性进行估算:由核导数估计拉普拉斯算子的值<math>\partial_{1, j} K(x_i, x)</math>,其中<math>\partial_{1,j}</math>表示对第一个变量第''j''个坐标的偏导数。<ref>{{Cite arXiv|last1=Cabannes|first1=Vivien|last2=Pillaud-Vivien|first2=Loucas|last3=Bach|first3=Francis|last4=Rudi|first4=Alessandro|date=2021|title=Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning|class=stat.ML|eprint=2009.04324}}</ref> 这第二种方法与[[无网格法]]有关,同PDE中的[[有限差分法]]形成对比。 == 应用 == 选择适当的损失函数''V''、假设空间<math>\mathcal{H}</math>,流形正则化可推广到各种可用吉洪诺夫正则化表达的算法。两个常用例子是[[支持向量机]]和正则化最小二乘法。(正则化最小二乘包括岭回归;相关的LASSO、[[弹性网正则化]]等算法可被表为支持向量机。<ref>{{cite book |title=An Equivalence between the Lasso and Support Vector Machines |last=Jaggi|first=Martin |editor-last1=Suykens|editor-first1=Johan |editor-last2=Signoretto|editor-first2=Marco |editor-last3=Argyriou|editor-first3=Andreas |year=2014 |publisher=Chapman and Hall/CRC}}</ref><ref>{{cite conference |last1=Zhou |first1=Quan |last2=Chen |first2=Wenlin |last3=Song |first3=Shiji |last4=Gardner |first4=Jacob |last5=Weinberger |first5=Kilian |last6=Chen |first6=Yixin |title=A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing |url=https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9856 |conference=[[Association for the Advancement of Artificial Intelligence]] |access-date=2024-06-08 |archive-date=2022-06-25 |archive-url=https://web.archive.org/web/20220625233940/https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9856 |dead-url=no }}</ref>)这些算法的推广分别称作拉普拉斯正则化最小二乘(LapRLS)和拉普拉斯支持向量机(LapSVM)。<ref name="Belkin et al. 2006" /> === 拉普拉斯正则化最小二乘(LapRLS) === 正则化最小二乘(RLS)是一类[[回归分析]]算法:预测输入''x''的值<math>y = f(x)</math>,目标是使预测值接近数据的真实标签。RLS的设计目标是在正则化的前提下,最大限度减小预测值与真实标签之间的[[均方误差]]。岭回归是RLS的一种形式,一般来说RLS与结合了[[核方法]]的岭回归是一样的。{{Citation needed|reason=Kernel ridge regression can be seen to have the same form as RLS in a general RKHS, but it is difficult to find a source that discusses the connection in detail.|date=December 2015}}在吉洪诺夫正则化中,损失函数''V''的均方误差是RLS问题陈述的结果: : <math> f^* = \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} (f(x_i) - y_i)^2 + \gamma \left\| f \right\|_K^2 </math> 根据[[表示定理]],解可写作在数据点求值的核的加权和: : <math> f^*(x) = \sum_{i=1}^{\ell} \alpha_i^* K(x_i, x) </math> 解<math>\alpha^*</math>可得 : <math> \alpha^* = (K + \gamma \ell I)^{-1} Y </math> 其中''K''定义为核矩阵,<math>K_{ij} = K(x_i, x_j)</math>,''Y''是标签向量。 为流形正则化添加拉普拉斯项,得到拉普拉斯RLS的表达: : <math> f^* = \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} (f(x_i) - y_i)^2 + \gamma_A \left\| f \right\|_K^2 + \frac{\gamma_I}{(\ell+u)^2} \mathbf{f}^{\mathrm{T}} L \mathbf{f} </math> 再根据流形正则化的表示定理,可知 : <math> f^*(x) = \sum_{i=1}^{\ell + u} \alpha_i^* K(x_i, x) </math> 这就得到了向量<math>\alpha^*</math>的表达式。令''K''是上述核矩阵,''Y''是数据标签向量,''J''是<math> (\ell + u) \times (\ell + u) </math>分块矩阵<math>\begin{bmatrix} I_{\ell} & 0 \\ 0 & 0_u \end{bmatrix} </math>: : <math> \alpha^* = \underset{\alpha \in \mathbf{R}^{\ell + u}}{\arg\!\min} \frac{1}{\ell} (Y - J K \alpha)^{\mathrm{T}} (Y - J K \alpha) + \gamma_A \alpha^{\mathrm{T}} K \alpha + \frac{\gamma_I}{(\ell + u)^2} \alpha^{\mathrm{T}} K L K \alpha </math> 解是 : <math> \alpha^* = \left( JK + \gamma_A \ell I + \frac{\gamma_I \ell}{(\ell + u)^2} L K \right)^{-1} Y </math><ref name="Belkin et al. 2006" /> LapRLS已被用于传感器网络、<ref>{{Cite conference | publisher = Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999 | volume = 21 | pages = 988 | last1 = Pan | first1 = Jeffrey Junfeng | last2 = Yang | first2 = Qiang | last3 = Chang | first3 = Hong | last4 = Yeung | first4 = Dit-Yan | title = A manifold regularization approach to calibration reduction for sensor-network based tracking | book-title = Proceedings of the national conference on artificial intelligence | access-date = 2015-12-02 | date = 2006 | url = http://www.aaai.org/Papers/AAAI/2006/AAAI06-155.pdf | archive-date = 2022-07-29 | archive-url = https://web.archive.org/web/20220729014303/https://www.aaai.org/Papers/AAAI/2006/AAAI06-155.pdf | dead-url = no }}</ref> [[医学成像]]、<ref>{{Cite conference| publisher = IEEE| pages = 1628–1631| last1 = Zhang| first1 = Daoqiang| last2 = Shen| first2 = Dinggang| title = Semi-supervised multimodal classification of Alzheimer's disease| book-title = Biomedical Imaging: From Nano to Macro, 2011 IEEE International Symposium on| date = 2011| doi = 10.1109/ISBI.2011.5872715}}</ref><ref>{{Cite book| publisher = Springer| pages = 264–271| last1 = Park| first1 = Sang Hyun| last2 = Gao| first2 = Yaozong| last3 = Shi| first3 = Yinghuan| last4 = Shen| first4 = Dinggang| title = Machine Learning in Medical Imaging| volume = 8679| chapter = Interactive Prostate Segmentation Based on Adaptive Feature Selection and Manifold Regularization| date = 2014| doi = 10.1007/978-3-319-10581-9_33| series = Lecture Notes in Computer Science| isbn = 978-3-319-10580-2}}</ref> 物体检测、<ref>{{Cite journal| last = Pillai| first = Sudeep| title = Semi-supervised Object Detector Learning from Minimal Labels| access-date = 2015-12-15| url = http://people.csail.mit.edu/spillai/data/papers/ssl-cv-project-paper.pdf| archive-date = 2017-08-30| archive-url = https://web.archive.org/web/20170830013330/http://people.csail.mit.edu/spillai/data/papers/ssl-cv-project-paper.pdf| dead-url = no}}</ref> [[光谱学]]、<ref>{{Cite journal| volume = 11| issue = 1| pages = 416–419| last1 = Wan| first1 = Songjing| last2 = Wu| first2 = Di| last3 = Liu| first3 = Kangsheng| title = Semi-Supervised Machine Learning Algorithm in Near Infrared Spectral Calibration: A Case Study on Diesel Fuels| journal = Advanced Science Letters| date = 2012| doi=10.1166/asl.2012.3044}}</ref> [[文档分类]]、<ref>{{Cite journal| volume = 8| issue = 4| pages = 1011–1018| last1 = Wang| first1 = Ziqiang| last2 = Sun| first2 = Xia| last3 = Zhang| first3 = Lijie| last4 = Qian| first4 = Xu| title = Document Classification based on Optimal Laprls| journal = Journal of Software| date = 2013| doi=10.4304/jsw.8.4.1011-1018}}</ref> 药物-蛋白质相互作用、<ref>{{Cite journal| volume = 4| issue = Suppl 2| pages = –6| last1 = Xia| first1 = Zheng| last2 = Wu| first2 = Ling-Yun| last3 = Zhou| first3 = Xiaobo| last4 = Wong| first4 = Stephen TC| title = Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces| journal = BMC Systems Biology| date = 2010| citeseerx = 10.1.1.349.7173| doi = 10.1186/1752-0509-4-S2-S6| pmid = 20840733| pmc = 2982693| doi-access = free}}</ref> 压缩图像与视频等问题。<ref>{{Cite conference| publisher = ACM| pages = 161–168| last1 = Cheng| first1 = Li| last2 = Vishwanathan| first2 = S. V. N.| title = Learning to compress images and videos| book-title = Proceedings of the 24th international conference on Machine learning| access-date = 2015-12-16| date = 2007| url = http://dl.acm.org/citation.cfm?id=1273517}}</ref> === 拉普拉斯支持向量机(LapSVM) === [[支持向量机]](SVMs)是一系列算法,常用于数据[[统计分类|分类]]。直观说,SVM在类间画出边界,使最接近边界的数据尽量远离边界。这可直接表为[[线性规划]]问题,但也等同于带[[铰链损失]]的吉洪诺夫正则化,即<math>V(f(x), y) = \max(0, 1 - yf(x))</math>: : <math> f^* = \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} \max(0, 1 - y_if(x_i)) + \gamma \left\| f \right\|_K^2 </math><ref>{{Cite journal| volume = 48| issue = 1–3| pages = 115–136| last1 = Lin| first1 = Yi| last2 = Wahba| first2 = Grace| last3 = Zhang| first3 = Hao| last4 = Lee| first4 = Yoonkyung|author4-link= Yoonkyung Lee | title = Statistical properties and adaptive tuning of support vector machines| journal = Machine Learning| date = 2002| doi=10.1023/A:1013951620650| doi-access = free}}</ref><ref>{{Cite journal| volume = 6| pages = 69–87| last1 = Wahba| first1 = Grace| last2 = others| title = Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV| journal = Advances in Kernel Methods-Support Vector Learning| date = 1999| citeseerx = 10.1.1.53.2114}}</ref> 将内蕴正则化项加进去,就得到了LapSVM问题的陈述: : <math> f^* = \underset{f \in \mathcal{H}}{\arg\!\min} \frac{1}{\ell} \sum_{i=1}^{\ell} \max(0, 1 - y_if(x_i)) + \gamma_A \left\| f \right\|_K^2 + \frac{\gamma_I}{(\ell+u)^2} \mathbf{f}^{\mathrm{T}} L \mathbf{f} </math> 同样,表示定理允许用在数据点得值的核表示解: : <math> f^*(x) = \sum_{i=1}^{\ell + u} \alpha_i^* K(x_i, x) </math> 将问题重写为线性规划问题、求解[[对偶性 (最佳化)|对偶问题]]就可得到<math>\alpha</math>。令''K''是核矩阵、''J''是分块矩阵<math>\begin{bmatrix} I_{\ell} & 0 \\ 0 & 0_u \end{bmatrix} </math>,则解可写作 : <math>\alpha = \left( 2 \gamma_A I + 2 \frac{\gamma_I}{(\ell + u)^2} L K \right)^{-1} J^{\mathrm{T}} Y \beta^* </math> 其中<math>\beta^*</math>是对偶问题的解 :<math> \begin{align} & & \beta^* = \max_{\beta \in \mathbf{R}^{\ell}} & \sum_{i=1}^{\ell} \beta_i - \frac{1}{2} \beta^{\mathrm{T}} Q \beta \\ & \text{subject to} && \sum_{i=1}^{\ell} \beta_i y_i = 0 \\ & && 0 \le \beta_i \le \frac{1}{\ell}\; i = 1, \ldots, \ell \end{align} </math> ''Q''的定义是 : <math> Q = YJK \left( 2 \gamma_A I + 2 \frac{\gamma_I}{(\ell + u)^2} L K \right)^{-1} J^{\mathrm{T}} Y </math><ref name="Belkin et al. 2006" /> LapSVM已被应用于大地成像、<ref>{{Cite journal| volume = 48| issue = 11| pages = 4110–4121| last1 = Kim| first1 = Wonkook| last2 = Crawford| first2 = Melba M.|author2-link=Melba Crawford| title = Adaptive classification for hyperspectral image data using manifold regularization kernel machines| journal = IEEE Transactions on Geoscience and Remote Sensing| date = 2010| doi = 10.1109/TGRS.2010.2076287| s2cid = 29580629}}</ref><ref>{{Cite journal| volume = 31| issue = 1| pages = 45–54| last1 = Camps-Valls| first1 = Gustavo| last2 = Tuia| first2 = Devis| last3 = Bruzzone| first3 = Lorenzo| last4 = Atli Benediktsson| first4 = Jon| title = Advances in hyperspectral image classification: Earth monitoring with statistical learning methods| journal = IEEE Signal Processing Magazine| date = 2014| doi=10.1109/msp.2013.2279179| arxiv = 1310.5107| bibcode = 2014ISPM...31...45C| s2cid = 11945705}}</ref><ref>{{Cite conference| publisher = IEEE| pages = 1521–1524| last1 = Gómez-Chova| first1 = Luis| last2 = Camps-Valls| first2 = Gustavo| last3 = Muñoz-Marí| first3 = Jordi| last4 = Calpe| first4 = Javier| title = Semi-supervised cloud screening with Laplacian SVM| book-title = Geoscience and Remote Sensing Symposium, 2007. IGARSS 2007. IEEE International| date = 2007| doi = 10.1109/IGARSS.2007.4423098}}</ref> 医学成像、<ref>{{Cite book| publisher = Springer| pages = 82–90| last1 = Cheng| first1 = Bo| last2 = Zhang| first2 = Daoqiang| last3 = Shen| first3 = Dinggang| title = Medical Image Computing and Computer-Assisted Intervention–MICCAI 2012| volume = 7510| chapter = Domain transfer learning for MCI conversion prediction| issue = Pt 1| date = 2012| doi = 10.1007/978-3-642-33415-3_11| pmid = 23285538| pmc = 3761352| series = Lecture Notes in Computer Science| isbn = 978-3-642-33414-6}}</ref><ref>{{Cite journal| volume = 37| issue = 8| pages = 4155–4172| last1 = Jamieson| first1 = Andrew R.| last2 = Giger| first2 = Maryellen L.| last3 = Drukker| first3 = Karen| last4 = Pesce| first4 = Lorenzo L.| title = Enhancement of breast CADx with unlabeled dataa)| journal = Medical Physics| date = 2010| doi=10.1118/1.3455704| pmid = 20879576| pmc = 2921421| bibcode = 2010MedPh..37.4155J}}</ref><ref>{{Cite journal| volume = 1| issue = 2| pages = 151–155| last1 = Wu| first1 = Jiang| last2 = Diao| first2 = Yuan-Bo| last3 = Li| first3 = Meng-Long| last4 = Fang| first4 = Ya-Ping| last5 = Ma| first5 = Dai-Chuan| title = A semi-supervised learning based method: Laplacian support vector machine used in diabetes disease diagnosis| journal = Interdisciplinary Sciences: Computational Life Sciences| date = 2009| doi=10.1007/s12539-009-0016-2| pmid = 20640829| s2cid = 21860700}}</ref> 人脸识别、<ref>{{Cite journal| volume = 4| issue = 17| last1 = Wang| first1 = Ziqiang| last2 = Zhou| first2 = Zhiqiang| last3 = Sun| first3 = Xia| last4 = Qian| first4 = Xu| last5 = Sun| first5 = Lijun| title = Enhanced LapSVM Algorithm for Face Recognition.| journal = International Journal of Advancements in Computing Technology| access-date = 2015-12-16| date = 2012| url = http://search.ebscohost.com/login.aspx?direct=true&profile=ehost&scope=site&authtype=crawler&jrnl=20058039&AN=98908455&h=8QzzRizi2IKxCZ4EHJjzxbGY%2FQazcifd58fcAGEG17GiFk0wZE59DrEge0xfEGhXRqsBaMwuBNyenVSP6sjwsA%3D%3D&crl=c}}</ref> 机器维护、<ref>{{Cite journal| volume = 38| issue = 8| pages = 10199–10204| last1 = Zhao| first1 = Xiukuan| last2 = Li| first2 = Min| last3 = Xu| first3 = Jinwu| last4 = Song| first4 = Gangbing| title = An effective procedure exploiting unlabeled data to build monitoring system| journal = Expert Systems with Applications| date = 2011| doi=10.1016/j.eswa.2011.02.078}}</ref> [[脑机接口]]等问题。<ref>{{Cite journal| volume = 7| issue = 1| pages = 22–26| last1 = Zhong| first1 = Ji-Ying| last2 = Lei| first2 = Xu| last3 = Yao| first3 = D.| title = Semi-supervised learning based on manifold in BCI| journal = Journal of Electronics Science and Technology of China| access-date = 2015-12-16| date = 2009| url = http://www.journal.uestc.edu.cn/archives/2009/1/7/22-2677907.pdf| archive-date = 2016-03-04| archive-url = https://web.archive.org/web/20160304095901/http://www.journal.uestc.edu.cn/archives/2009/1/7/22-2677907.pdf| dead-url = no}}</ref> == 局限 == * 流形正则化假定不同标签的数据不在一起,这样就能从无标数据中提取信息。但这只适用于一部分问题。根据数据结构不同,可能要用不同的半监督或转导学习算法。<ref>{{Cite journal| last = Zhu| first = Xiaojin| title = Semi-supervised learning literature survey| date = 2005| citeseerx = 10.1.1.99.9681}}</ref> * 某些数据集中,函数的内蕴范数<math>\left\| f \right\|_I</math>可能非常接近环境范数<math>\left\| f \right\|_K</math>:例如,若数据由位于垂直线上的两类组成,则内蕴范数将等于环境范数。这时,即便数据符合光滑分离器假设,无标数据也无法对流形正则化学习到的解产生影响。与[[联合训练]]相关的方法已用于解决这一限制。<ref>{{Cite conference| publisher = ACM| pages = 976–983| last1 = Sindhwani| first1 = Vikas| last2 = Rosenberg| first2 = David S.| title = An RKHS for multi-view learning and manifold co-regularization| book-title = Proceedings of the 25th international conference on Machine learning| access-date = 2015-12-02| date = 2008| url = http://dl.acm.org/citation.cfm?id=1390279}}</ref> * 若有大量无标数据,则核矩阵''K''将变得极大,计算时间可能非常久。这时在线算法与流形的稀疏近似可能有所帮助。<ref>{{Cite book| pages = 393–407| last1 = Goldberg| first1 = Andrew| last2 = Li| first2 = Ming| last3 = Zhu| first3 = Xiaojin| title = Machine Learning and Knowledge Discovery in Databases| chapter = Online Manifold Regularization: A New Learning Setting and Empirical Study| volume = 5211| date = 2008| doi = 10.1007/978-3-540-87479-9_44| series = Lecture Notes in Computer Science| isbn = 978-3-540-87478-2}}</ref> == 另见 == * [[流形学习]] * [[流形假设]] * [[半监督学习]] * [[转导 (机器学习)]] * [[谱图论]] * [[再生核希尔伯特空间]] * [[吉洪诺夫正则化]] * [[微分几何]] == 参考文献 == {{Reflist}} == 外部链接 == === 软件 === * [http://manifold.cs.uchicago.edu/manifold_regularization/software.html ManifoldLearn库] {{Wayback|url=http://manifold.cs.uchicago.edu/manifold_regularization/software.html |date=20160521111400 }}与[http://www.dii.unisi.it/~melacci/lapsvmp/ Primal LapSVM库] {{Wayback|url=http://www.dii.unisi.it/~melacci/lapsvmp/ |date=20190822170139 }}在[[MATLAB]]中实现了LapRLS和LapSVM。 * [[C++]]的[http://dlib.net/ml.html Dlib库] {{Wayback|url=http://dlib.net/ml.html |date=20161028061508 }}包含线性流形正则化函数。 [[Category:机器学习]]
该页面使用的模板:
Template:Citation needed
(
查看源代码
)
Template:Cite arXiv
(
查看源代码
)
Template:Cite book
(
查看源代码
)
Template:Cite conference
(
查看源代码
)
Template:Cite journal
(
查看源代码
)
Template:Reflist
(
查看源代码
)
Template:Wayback
(
查看源代码
)
返回
流形正则化
。
导航菜单
个人工具
登录
命名空间
页面
讨论
不转换
查看
阅读
查看源代码
查看历史
更多
搜索
导航
首页
最近更改
随机页面
MediaWiki帮助
特殊页面
工具
链入页面
相关更改
页面信息