查看“︁偏最小二乘回归”︁的源代码
←
偏最小二乘回归
跳转到导航
跳转到搜索
因为以下原因,您没有权限编辑该页面:
您请求的操作仅限属于该用户组的用户执行:
用户
您可以查看和复制此页面的源代码。
{{noteTA |T=zh:偏最小二乘回归;zh-tw:偏最小二乘迴歸;zh-cn:偏最小二乘回归; |G1=Math |1=zh:回归;zh-tw:迴歸;zh-cn:回归; }} {{回归侧栏}} '''偏最小二乘回归({{Lang-en|Partial least squares regression}}, PLS回归)'''是一种[[统计学]]方法,与[[主成分分析|主成分回归]]有关系,但不是寻找响应和独立变量之间最小[[方差]]的[[超平面]],而是通过投影[[预测变量]]和[[观测变量]]到一个新空间来寻找一个[[线性回归]]模型。因为数据''X''和''Y''都会投影到新空间,PLS系列的方法都被称为双线性因子模型。當Y是分类數據時有「偏最小二乘判别分析({{Lang-en|Partial least squares Discriminant Analysis}}, PLS-DA)」,是PLS的一个变形。 偏最小二乘用于查找两个[[矩阵]](''X''和''Y'')的基本关系,即一个在这两个空间对[[协方差]]结构建模的[[隐变量]]方法。偏最小二乘模型将试图找到''X''空间的多维方向来解释''Y''空间方差最大的多维方向。偏最小二乘回归特别适合当预测矩阵比观测的有更多变量,以及''X''的值中有[[多重共线性]]的时候。相比之下,标准的回归在这些情况下不见效(除非它是[[吉洪诺夫正则化]])。 偏最小二乘算法被用在[[偏最小二乘路径建模]]中,<ref>{{cite journal |last1=Tenenhaus |first1=M. |last2=Esposito Vinzi |first2=V. |last3=Chatelinc |first3=Y-M. |last4=Lauro |first4=C. |title=PLS path modeling |journal=Computational Statistics & Data Analysis |volume=48 |issue=1 |pages=159–205 |date=January 2005 |doi=10.1016/j.csda.2004.03.005 |url=http://www.stat.uni-muenchen.de/institut/ag/leisch/teaching/msl0910/PLS_path_modeling.pdf |format=PDF }}{{dead link|date=2017年11月 |bot=InternetArchiveBot |fix-attempted=yes }}</ref><ref>{{cite book |editor1-last=Vinzi |editor1-first=V. |editor2-last=Chin |editor2-first=W.W. |editor3-last=Henseler |editor3-first=J. |editor4-last=Wang |editor4-first=H. |title=Handbook of Partial Least Squares |year=2010 |isbn=978-3-540-32825-4 }}</ref> 一个建立[[隐变量]](原因不能没有实验和拟实验来确定,但一个典型的模型会基于之前理论假设(隐变量影响衡量指标的表现)的隐变量模型)这种技术是[[结构方程模型]]的一种形式,与经典方法不同的是基于组件而不是基于协方差。<ref>{{cite web |last=Tenenhaus |first=M. |title=Component-based structural equation modelling |year=2008 |url=http://www.hec.edu/var/fre/storage/original/application/888adcb77e8c8378551e689551ecc0a1.pdf |format=PDF |access-date=2015-09-28 |archive-url=https://web.archive.org/web/20131103074710/http://www.hec.edu/var/fre/storage/original/application/888adcb77e8c8378551e689551ecc0a1.pdf |archive-date=2013-11-03 |dead-url=yes }}</ref> 偏最小二乘来源于瑞典统计学家[[Herman Wold]],然后由他的儿子Svante Wold发展。偏最小二乘的另一个词(根据Svante Wold<ref name="wold_2001">{{cite journal |last1=Wold |first1=S |last2=Sjöström |first2=M. |last3=Eriksson |first3=L. |title=PLS-regression: a basic tool of chemometrics |journal=Chemometrics and Intelligent Laboratory Systems |volume=58 |issue=2 |pages=109–130 |year=2001 |doi=10.1016/S0169-7439(01)00155-1 |url=http://www.sciencedirect.com/science/article/pii/S0169743901001551 |access-date=2015-09-28 |archive-date=2017-09-09 |archive-url=https://web.archive.org/web/20170909154206/http://www.sciencedirect.com/science/article/pii/S0169743901001551 |dead-url=no }}</ref>)是''投影到潜在结构'',但偏最小二乘法依然在许多领域占据着主导地位。尽管最初的应用是在社会科学中,偏最小二乘回归今天被广泛用于[[化学计量学]]和相关领域。它也被用于生物信息学,sensometrics,神经科学和人类学。而相比之下,偏最小二乘回歸最常用于社会科学、计量经济学、市场营销和战略管理。 ==底层模型== 偏最小二乘的一般多元底层模型是 :<math>X = T P^{\top} + E</math> :<math>Y = U Q^{\top} + F</math> 其中<math>X</math>是一个<math>n \times m</math>的预测矩阵,<math>Y</math>是一个<math>n \times p</math>的响应矩阵;<math>T</math>和<math>U</math>是<math>n \times l</math>的矩阵,分别为<math>X</math>的投影(“X分数”、“组件”或“因子”矩阵)和<math>Y</math>的投影(“Y分数”);<math>P</math>和<math>Q</math>分别是<math>m \times l</math>和<math>p \times l</math>的正交''载荷''矩阵,以及矩阵<math>E</math>和<math>F</math>是错误项,假设是独立同分布的随机正态变量。对<math>X</math>和<math>Y</math>分解来最大化<math>T</math>和<math>U</math>之间的[[协方差]]。 ==算法== 偏最小二乘的许多变量是为了估计因子和载荷矩阵<math>T, U, P</math>和<math>Q</math>。它们中大多数构造了<math>X</math>和<math>Y</math>之间线性回归的估计<math>Y = X \tilde{B} + \tilde{B}_0</math>。一些偏最小二乘算法只适合<math>Y</math>是一个列向量的情况,而其它的算法则处理了<math>Y</math>是一个矩阵的一般情况。算法也根据他们是否估计因子矩阵<math>T</math>为一个[[正交矩阵]]而不同。<ref>{{cite journal |last1=Lindgren |first1=F |last2=Geladi |first2=P |last3=Wold |first3=S |title=The kernel algorithm for PLS |journal=J. Chemometrics |volume=7 |pages=45–59 |year=1993 |doi=10.1002/cem.1180070104 |url=http://onlinelibrary.wiley.com/doi/10.1002/cem.1180070104/abstract |access-date=2015-09-28 |archive-date=2017-06-07 |archive-url=https://web.archive.org/web/20170607055019/http://onlinelibrary.wiley.com/doi/10.1002/cem.1180070104/abstract |dead-url=no }}</ref><ref>{{cite journal |last1=de Jong |first1=S. |last2=ter Braak |first2=C.J.F. |title=Comments on the PLS kernel algorithm |journal=J. Chemometrics |volume=8 |issue=2 |pages=169–174 |year=1994 |doi=10.1002/cem.1180080208 |url=http://onlinelibrary.wiley.com/doi/10.1002/cem.1180080208/abstract |access-date=2015-09-28 |archive-date=2016-07-13 |archive-url=https://web.archive.org/web/20160713105410/http://onlinelibrary.wiley.com/doi/10.1002/cem.1180080208/abstract |dead-url=no }}</ref><ref>{{cite journal |last1=Dayal |first1=B.S. |last2=MacGregor |first2=J.F. |title=Improved PLS algorithms |journal=J. Chemometrics |volume=11 |issue=1 |pages=73–85 |year=1997 |doi=10.1002/(SICI)1099-128X(199701)11:1<73::AID-CEM435>3.0.CO;2-# |url=http://onlinelibrary.wiley.com/doi/10.1002/%28SICI%291099-128X%28199701%2911:1%3C73::AID-CEM435%3E3.0.CO;2-%23/abstract |access-date=2015-09-28 |archive-date=2016-07-04 |archive-url=https://web.archive.org/web/20160704210632/http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23/abstract |dead-url=no }}</ref><ref>{{cite journal |last=de Jong |first=S. |title=SIMPLS: an alternative approach to partial least squares regression |journal=Chemometrics and Intelligent Laboratory Systems |volume=18 |pages=251–263 |year=1993 |doi=10.1016/0169-7439(93)85002-X |issue=3 }}</ref><ref>{{cite journal |last1=Rannar |first1=S. |last2=Lindgren |first2=F. |last3=Geladi |first3=P. |last4=Wold |first4=S. |title=A PLS Kernel Algorithm for Data Sets with Many Variables and Fewer Objects. Part 1: Theory and Algorithm |journal=J. Chemometrics |volume=8 |issue=2 |pages=111–125 |year=1994 |doi=10.1002/cem.1180080204 |url=http://onlinelibrary.wiley.com/doi/10.1002/cem.1180080204/abstract |access-date=2015-09-28 |archive-date=2016-10-17 |archive-url=https://web.archive.org/web/20161017104534/http://onlinelibrary.wiley.com/doi/10.1002/cem.1180080204/abstract |dead-url=no }}</ref><ref>{{cite journal |last=Abdi |first=H. |title=Partial least squares regression and projection on latent structure regression (PLS-Regression) |journal=Wiley Interdisciplinary Reviews: Computational Statistics |volume=2 |pages=97–106 |year=2010 |doi=10.1002/wics.51 }}</ref> 最后的预测在所有不同最小二乘算法中都是一样的,但组件是不同的。 ===PLS1=== PLS1是一个<math>Y</math>是向量时广泛使用的算法。它估计<math>T</math>是一个正交矩阵。以下是伪代码(大写字母是矩阵,带上标的小写字母是向量,带下标的小写字母和单独的小写字母都是标量): 1 '''function''' PLS1(<math>X, y, l</math>) 2 <math>X^{(0)} \gets X</math> 3 <math>w^{(0)} \gets X^T y/||X^Ty||</math>, an initial estimate of <math>w</math>. 4 <math>t^{(0)} \gets X w^{(0)}</math> 5 '''for''' <math>k</math> = 0 '''to''' <math>l</math> 6 <math>t_k \gets {t^{(k)}}^T t^{(k)}</math> (note this is a scalar) 7 <math>t^{(k)} \gets t^{(k)} / t_k</math> 8 <math>p^{(k)} \gets {X^{(k)}}^T t^{(k)}</math> 9 <math>q_k \gets {y}^T t^{(k)}</math> (note this is a scalar) 10 '''if''' <math>q_k</math> = 0 11 <math>l \gets k</math>, '''break''' the '''for loop''' 12 '''if''' <math>k < l</math> 13 <math>X^{(k+1)} \gets X^{(k)} - t_k t^{(k)} {p^{(k)}}^T</math> 14 <math>w^{(k+1)} \gets {X^{(k+1)}}^T y </math> 15 <math>t^{(k+1)} \gets X^{(k+1)}w^{(k+1)}</math> 16 end '''for''' 17 '''define''' <math>W</math> to be the matrix with columns <math>w^{(0)},w^{(1)},...,w^{(l-1)}</math>. Do the same to form the <math>P</math> matrix and <math>q</math> vector. 18 <math>B \gets W {(P^T W)}^{-1} q</math> 19 <math>B_0 \gets q_0 - {P^{(0)}}^T B</math> 20 '''return''' <math>B, B_0</math> 这种形式的算法不需要输入<math>X</math>和<math>Y</math>定中心,因为算法隐式处理了。这个算法的特点是收缩于<math>X</math> (减去<math>t_k t^{(k)} {p^{(k)}}^T</math>),但向量<math>y</math>不收缩,因为没有必要(可以证明收缩<math>y</math>和不收缩的结果是一样的)。用户提供的变量<math>l</math>是回归中隐藏因子数量的限制;如果它等于矩阵<math>X</math>的秩,算法将产生<math>B</math>和<math>B_0</math>的最小二乘回归估计。 ==扩展== 2002年,一个叫做正交投影({{Lang-en|Orthogonal Projections to Latent Structures}}, OPLS)的方法提出。在OPLS中,连续变量数据被分为预测的和不相关的信息。这有利于改进诊断,以及更容易解释可视化。然而,这些变化只是改善模型的可解释性,不是预测能力。<ref>{{Cite journal | last = Trygg | first = J | last2 = Wold | first2 = S | title = Orthogonal Projections to Latent Structures | journal = Journal of Chemometrics | volume = 16 | issue = 3 | pages = 119–128 | year = 2002 | url = | doi = 10.1002/cem.695}} </ref> L-PLS通过3个连接数据块扩展了偏最小二乘回归。<ref>{{cite journal |last1=Sæbøa |first1=S. |last2=Almøya |first2=T. |last3=Flatbergb |first3=A. |last4=Aastveita |first4=A.H. |last5=Martens |first5=H. |title=LPLS-regression: a method for prediction and classification under the influence of background information on predictor variables |journal=Chemometrics and Intelligent Laboratory Systems |volume=91 |issue=2 |pages=121–132 |year=2008 |doi=10.1016/j.chemolab.2007.10.006 }}</ref> 同样,OPLS-DA({{Lang-en|Discriminant Analysis}}, 判别分析)可能被应用在处理离散变量,如分类和生物标志物的研究。 ==软件实现== 大多数统计软件包都提供偏最小二乘回归。{{citation needed}} R中的‘pls’包提供了一系列算法。<ref>{{cite web|url=http://cran.r-project.org/web/packages/pls/pls.pdf|title=Package ‘pls’|accessdate=2015-09-28|archive-date=2020-12-09|archive-url=https://web.archive.org/web/20201209111629/https://cran.r-project.org/web/packages/pls/pls.pdf|dead-url=no}}</ref> ==参见== *[[特征提取]] *[[数据挖掘]] *[[机器学习]] *[[回归分析]] *[[典型相关]] *[[Deming regression]] *[[多线性子空间学习]] *[[主成分分析]] *[[总平方和]] ==扩展阅读== *{{cite book |first=R. |last=Kramer |title=Chemometric Techniques for Quantitative Analysis |url=https://archive.org/details/chemometrictechn0000kram |publisher=Marcel-Dekker |year=1998 |isbn=0-8247-0198-4 }} *{{cite journal |last1=Frank |first1=Ildiko E. |first2=Jerome H. |last2=Friedman |title=A Statistical View of Some Chemometrics Regression Tools |journal=Technometrics |volume=35 |issue=2 |pages=109–148 |year=1993 |doi=10.1080/00401706.1993.10485033 |url=http://amstat.tandfonline.com/doi/full/10.1080/00401706.1993.10485033 |deadurl=yes |archiveurl=https://archive.today/20130203091137/http://amstat.tandfonline.com/doi/full/10.1080/00401706.1993.10485033 |archivedate=2013-02-03 |access-date=2015-09-28 }} *{{cite journal |last1=Haenlein |first1=Michael |first2=Andreas M. |last2=Kaplan | title=A Beginner's Guide to Partial Least Squares Analysis |journal=Understanding Statistics |volume=3 |issue=4 |pages=283–297| year=2004 |doi=10.1207/s15328031us0304_4 }} *{{cite journal |last1=Henseler |first1=Joerg |first2=Georg |last2=Fassott | title=Testing Moderating Effects in PLS Path Models. An Illustration of Available Procedures| year=2005 }} *{{cite journal |last1=Lingjærde |first1=Ole-Christian |first2=Nils |last2=Christophersen | title=Shrinkage Structure of Partial Least Squares |journal=Scandinavian Journal of Statistics |volume=27 |issue=3 |pages=459–473 | year=2000 |doi=10.1111/1467-9469.00201 }} *{{cite book | last=Tenenhaus |first=Michel | title= La Régression PLS: Théorie et Pratique. Paris: Technip.| year=1998}} *{{cite journal | last1=Rosipal |first1=Roman |first2=Nicole |last2=Kramer | title=Overview and Recent Advances in Partial Least Squares, in Subspace, Latent Structure and Feature Selection Techniques |pages=34–51 | year=2006}} *{{cite journal |last=Helland |first=Inge S. |title=PLS regression and statistical models |journal=Scandinavian Journal of Statistics |volume=17 |issue=2 |pages=97–114 |year=1990 |jstor=4616159}} *{{cite book |authorlink=Herman Wold |last=Wold |first=Herman |chapter=Estimation of principal components and related models by iterative least squares |editor-first=P.R. |editor-last=Krishnaiaah |title=Multivariate Analysis |publisher=Academic Press |location=New York |year=1966 |pages=391–420 }} *{{cite book |last=Wold |first=Herman |title=The fix-point approach to interdependent systems |publisher=North Holland |location=Amsterdam |year=1981 }} *{{cite book |last=Wold |first=Herman |chapter=Partial least squares |editor1-first=Samuel |editor1-last=Kotz |editor2-first=Norman L. |editor2-last=Johnson |title=Encyclopedia of statistical sciences |publisher=Wiley |location=New York |year=1985 |pages=581–591 |volume=6}} *{{cite journal |first1=Svante |last1=Wold |first2=Axel |last2=Ruhe |first3=Herman |last3=Wold |first4=W.J. |last4=Dunn |title=The collinearity problem in linear regression. the partial least squares (PLS) approach to generalized inverses |journal=SIAM Journal on Scientific and Statistical Computing |volume=5 |pages=735–743 |year=1984 |doi=10.1137/0905052 |issue=3 }} *{{cite journal |last=Garthwaite |first=Paul H. |title=An Interpretation of Partial Least Squares |url=https://archive.org/details/sim_journal-of-the-american-statistical-association_1994-03_89_425/page/122 |journal=[[Journal of the American Statistical Association]] |volume=89 |pages=122–7 |year=1994 |jstor=2291207 |doi=10.1080/01621459.1994.10476452 |issue=425}} *{{cite book |editor1-last=Wang |editor1-first=H. |title=Handbook of Partial Least Squares |year=2010 |isbn=978-3-540-32825-4 }} *{{cite journal |last1=Stone |first1=M. |last2=Brooks |first2=R.J. |title=Continuum Regression: Cross-Validated Sequentially Constructed Prediction embracing Ordinary Least Squares, Partial Least Squares and Principal Components Regression |url=https://archive.org/details/sim_journal-of-the-royal-statistical-society-series-b_1990_52_2/page/237 |journal=[[Journal of the Royal Statistical Society]], Series B |volume=52 |issue=2 |pages=237–269 |year=1990 |jstor=2345437}} *Wan Mohamad Asyraf Bin Wan Afthanorhan. (2013). A Comparison Of Partial Least Square Structural Equation Modeling (PLS-SEM) and Covariance Based Structural EquationModeling (CB-SEM) for Confirmatory Factor Analysis International Journal of Engineering Science and Innovative Technology (IJESIT), 2(5), 9. ==参考文献== {{Reflist}} ==外部链接== *[https://sourceforge.net/projects/imdev/ imDEV] {{Wayback|url=https://sourceforge.net/projects/imdev/ |date=20200509125807 }} free Excel add-in for PLS and PLS-DA *[http://www.rotman-baycrest.on.ca/pls PLS in Brain Imaging] *[http://www.vcclab.org/lab/pls on-line PLS] {{Wayback|url=http://www.vcclab.org/lab/pls |date=20200505202521 }} regression (PLSR) at Virtual Computational Chemistry Laboratory *[http://www.chemometry.com/Research/MVC.html Uncertainty estimation for PLS] {{Wayback|url=http://www.chemometry.com/Research/MVC.html |date=20201025155050 }} *[http://www.utd.edu/~herve/Abdi-PLSR2007-pretty.pdf A short introduction to PLS regression and its history] {{Wayback|url=http://www.utd.edu/~herve/Abdi-PLSR2007-pretty.pdf |date=20190713083031 }} {{Authority control}} {{DEFAULTSORT:Partial Least Squares Regression}} [[Category:回归分析]] [[Category:隐变量模型]] [[Category:最小二乘]] [[Category:带有伪代码示例的条目]]
该页面使用的模板:
Template:Authority control
(
查看源代码
)
Template:Citation needed
(
查看源代码
)
Template:Cite book
(
查看源代码
)
Template:Cite journal
(
查看源代码
)
Template:Cite web
(
查看源代码
)
Template:Dead link
(
查看源代码
)
Template:Lang-en
(
查看源代码
)
Template:NoteTA
(
查看源代码
)
Template:Reflist
(
查看源代码
)
Template:Wayback
(
查看源代码
)
Template:回归侧栏
(
查看源代码
)
返回
偏最小二乘回归
。
导航菜单
个人工具
登录
命名空间
页面
讨论
不转换
查看
阅读
查看源代码
查看历史
更多
搜索
导航
首页
最近更改
随机页面
MediaWiki帮助
特殊页面
工具
链入页面
相关更改
页面信息