查看“︁典型相关”︁的源代码

{{NoteTA
| G1 = Math
}}

在[[统计学]]中，'''典型相关分析'''（{{lang-en|Canonical Correlation Analysis}}）是对[[互协方差]]矩阵的一种理解。如果我们有两个[[随机变量]]向量 ''X'' = (''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub>) 和 ''Y'' = (''Y''<sub>1</sub>, ..., ''Y''<sub>''m''</sub>) 并且它们是[[相關 (概率論)|相關]]的，那么典型相关分析会找出 ''X''<sub>''i''</sub> 和 ''Y''<sub>''j''</sub> 的相互相关最大的线性组合。<ref>{{Cite book | doi = 10.1007/978-3-540-72244-1_14 | chapter = Canonical Correlation Analysis | title = Applied Multivariate Statistical Analysis | pages = 321–330 | year = 2007 | isbn = 978-3-540-72243-4 | first1 = Wolfgang | last1 = Härdle| first2 = Léopold | last2 = Simar| pmid =  | pmc = }}</ref>T·R·Knapp指出“几乎所有常见的[[参数测试]]的意义可视为特殊情况的典型相关分析，这是研究两组变量之间关系的一般步骤。”<ref>{{Cite journal | last1 = Knapp | first1 = T. R. | title = Canonical correlation analysis: A general parametric significance-testing system | url = https://archive.org/details/sim_psychological-bulletin_1978-03_85_2/page/410 | doi = 10.1037/0033-2909.85.2.410 | journal = Psychological Bulletin | volume = 85 | issue = 2 | pages = 410–416 | year = 1978 | pmid =  | pmc = }}</ref> 这个方法在1936年由[[哈罗德·霍特林]]首次引入。<ref>{{Cite journal | last1 = Hotelling | first1 = H. | authorlink1 = Harold Hotelling| title = Relations Between Two Sets of Variates | doi = 10.1093/biomet/28.3-4.321 | journal = Biometrika | volume = 28 | issue = 3–4 | pages = 321–377 | year = 1936 | jstor = 2333955| pmid =  | pmc = }}</ref>

给定两个随机向量<math>X = (x_1, \dots, x_n)'</math>和<math>Y = (y_1, \dots, y_m)'</math>，我们可以定义[[互协方差]]矩阵 <math>\Sigma _{XY} = \operatorname{cov}(X, Y) </math> 为 <math> n \times m</math> 的[[矩阵]]，其中 <math>(i, j)</math> 是[[协方差]] <math>\operatorname{cov}(x_i, y_j)</math>。实际上，我们可以基于 <math>X</math> 和 <math>Y</math> 的采样数据来估计协方差矩阵。(如从一对数据矩阵)。

典型相关分析求出向量 <math>a</math> 和 <math>b</math> 使得随机变量 <math>a' X</math> 和 <math>b' Y</math> 的[[相關 (概率論)|相關]]性 <math>\rho = \operatorname{corr}(a' X, b' Y)</math> 最大。随机变量 <math>U = a' X</math> 和 <math>V = b' Y</math> 是 '''''第一对典型变量'''''。然后寻求一个依然最大化相关但与第一对典型变量不相关的向量；这样就得到了 '''''第二对典型变量'''''。 这个步骤会进行 <math>\min\{m,n\}</math> 次。

==计算==
===推导===
设 <math>\Sigma _{XX} = \operatorname{cov}(X, X)</math> 和 <math>\Sigma _{YY} = \operatorname{cov}(Y, Y)</math>。需要最大化的参数为

:<math>
\rho = \frac{a' \Sigma _{XY} b}{\sqrt{a' \Sigma _{XX} a} \sqrt{b' \Sigma _{YY} b}}.
</math>

第一步是定义一个[[基变更]]以及

:<math>
c = \Sigma _{XX} ^{1/2} a,
</math>

:<math>
d = \Sigma _{YY} ^{1/2} b.
</math>

因此我们有

:<math>
\rho = \frac{c' \Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1/2} d}{\sqrt{c' c} \sqrt{d' d}}.
</math>

根据[[柯西-施瓦茨不等式]]，我们有

:<math>
\left(c' \Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1/2} \right) d \leq \left(c' \Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1/2} \Sigma _{YY} ^{-1/2} \Sigma _{YX} \Sigma _{XX} ^{-1/2} c \right)^{1/2} \left(d' d \right)^{1/2},
</math>

:<math>
\rho \leq \frac{\left(c' \Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1} \Sigma _{YX} \Sigma _{XX} ^{-1/2} c \right)^{1/2}}{\left(c' c \right)^{1/2}}.
</math>

如果向量 <math>d</math> 和 <math>\Sigma _{YY} ^{-1/2} \Sigma _{YX} \Sigma _{XX} ^{-1/2} c</math> 共线，那么上式相等。此外，如果 <math>c</math> 是矩阵 <math>\Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1} \Sigma _{YX} \Sigma _{XX} ^{-1/2}</math> (见[[Rayleigh quotient]]) 最大特征值对应的[[特征向量]]，那么就可以得到相关的最大值。随后的典型变量对可以通过减少[[特征值]]的量级来得到。正交性保证了相关矩阵的对称性。

===解法===
因此解法是：
* <math>c</math> 是 <math>\Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1} \Sigma _{YX} \Sigma _{XX} ^{-1/2}</math> 的一个特征向量。
* <math>d</math> 是 <math>\Sigma _{YY} ^{-1/2} \Sigma _{YX} \Sigma _{XX} ^{-1/2} c</math> 的比例项。

相反地，也有：
* <math>d</math> 是 <math>\Sigma _{YY} ^{-1/2} \Sigma _{YX} \Sigma _{XX} ^{-1} \Sigma _{XY} \Sigma _{YY} ^{-1/2}</math> 的一个特征向量。
* <math>c</math> 是 <math>\Sigma _{XX} ^{-1/2} \Sigma _{XY} \Sigma _{YY} ^{-1/2} d</math> 的比例项。

把坐标反过来，我们有
* <math>a</math> 是 <math>\Sigma _{XX} ^{-1} \Sigma _{XY} \Sigma _{YY} ^{-1} \Sigma _{YX}</math> 的一个特征向量。
* <math>b</math> 是 <math>\Sigma _{YY} ^{-1} \Sigma _{YX} \Sigma _{XX} ^{-1} \Sigma _{XY}</math> 的一个特征向量。
* <math>a</math> 是 <math>\Sigma _{XX} ^{-1} \Sigma _{XY} b</math> 的比例项。
* <math>b</math> 是 <math>\Sigma _{YY} ^{-1} \Sigma _{YX} a</math> 的比例项。

那么相关变量定义为：

:<math>U = c' \Sigma _{XX} ^{-1/2} X = a' X</math>

:<math>V = d' \Sigma _{YY} ^{-1/2} Y = b' Y</math>

===实现===
典型相关分析可以用一个相关矩阵的[[奇异值分解]]来解决。<ref>{{Cite journal | last1 = Hsu | first1 = D. | last2 = Kakade | first2 = S. M. | last3 = Zhang | first3 = T. | doi = 10.1016/j.jcss.2011.12.025 | title = A spectral algorithm for learning Hidden Markov Models | journal = Journal of Computer and System Sciences | volume = 78 | issue = 5 | pages = 1460 | year = 2012 | url = http://www.cs.mcgill.ca/~colt2009/papers/011.pdf | arxiv = 0811.4413 | pmid =  | pmc =  | access-date = 2015-09-10 | archive-date = 2020-10-01 | archive-url = https://web.archive.org/web/20201001073457/https://www.cs.mcgill.ca/~colt2009/papers/011.pdf | dead-url = no }}</ref> 以下是它在一些语言中的函数 <ref>{{Cite journal | last1 = Huang | first1 = S. Y. | last2 = Lee | first2 = M. H. | last3 = Hsiao | first3 = C. K. | doi = 10.1016/j.jspi.2008.10.011 | title = Nonlinear measures of association with kernel canonical correlation analysis and applications | journal = Journal of Statistical Planning and Inference | volume = 139 | issue = 7 | pages = 2162 | year = 2009 | url = http://www.stat.sinica.edu.tw/syhuang/papersdownload/KCCA-080906.pdf | pmid =  | pmc =  | access-date = 2015-09-10 | archive-date = 2017-03-13 | archive-url = https://web.archive.org/web/20170313203427/http://www.stat.sinica.edu.tw/syhuang/papersdownload/KCCA-080906.pdf | dead-url = no }}</ref>

*[[MATLAB]] as [http://www.mathworks.co.uk/help/stats/canoncorr.html canoncorr] {{Wayback|url=http://www.mathworks.co.uk/help/stats/canoncorr.html |date=20140830030857 }}
*[[R (programming language)|R]] as [http://stat.ethz.ch/R-manual/R-devel/library/stats/html/cancor.html cancor] {{Wayback|url=http://stat.ethz.ch/R-manual/R-devel/library/stats/html/cancor.html |date=20200917212423 }}  or in [http://factominer.free.fr/ FactoMineR] {{Wayback|url=http://factominer.free.fr/ |date=20201109000911 }}
*[[SAS language|SAS]] as [https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=statug&docsetTarget=statug_cancorr_example.htm&locale=en The CANCORR Procedure] {{Wayback|url=https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=statug&docsetTarget=statug_cancorr_example.htm&locale=en |date=20200915021737 }}
*[[Scikit-Learn]], [[Python]] as [http://scikit-learn.org/stable/modules/cross_decomposition.html Cross decomposition] {{Wayback|url=http://scikit-learn.org/stable/modules/cross_decomposition.html |date=20200918133514 }}

==假设检验==
每一行可以用下面的方法检测其重要性。由于相关是排好序的，也就是说行 <math>i</math> 为 0 意味着所有后续的相关都为 0。如果我们在一个样本中有 <math>p</math> 个独立观测，对 <math>i = 1,\dots, \min\{m,n\}</math>，<math>\widehat{\rho}_i</math> 是其估计相关。对第 <math>i</math> 行，测试统计为：

:<math>\chi ^2 = - \left( p - 1 - \frac{1}{2}(m + n + 1)\right) \ln \prod _ {j = i} ^{\min\{m,n\}} (1 - \widehat{\rho}_j^2),</math>

上面渐近为一个对大 <math>p</math> 有 <math>(m - i + 1)(n - i + 1)</math> 个[[自由度]]的[[卡方分布]]。<ref>{{Cite book
 | author =[[Kanti V. Mardia]], J. T. Kent and J. M. Bibby
 | title = Multivariate Analysis
 | url =https://archive.org/details/multivariateanal0000mard
 | year = 1979
 | publisher =[[Academic Press]]
}}</ref> 由于所有从 <math> \min\{m,n\}</math> 到 <math>p</math> 的相关从逻辑上来说都是 0，所以在这一点之后的乘积都是不相关的。

==实际运用==

==例子==

==与principal angles的连接==

==参见==
*[[Generalized Canonical Correlation]]
*[[Multilinear subspace learning]]
*[[RV coefficient]]
*[[Principal angles]]
*[[主成分分析]]
*[[Regularized canonical correlation analysis]]
*[[奇异值分解]]
*[[Partial least squares regression]]

==参考文献==
{{Reflist|2}}

==外部链接==
* {{Cite journal | last1 = Hardoon | first1 = D. R. | last2 = Szedmak | first2 = S. | last3 = Shawe-Taylor | first3 = J. | doi = 10.1162/0899766042321814 | title = Canonical Correlation Analysis: An Overview with Application to Learning Methods | journal = Neural Computation | volume = 16 | issue = 12 | pages = 2639–2664 | year = 2004 | pmid =  15516276| pmc = }}
*[http://mpra.ub.uni-muenchen.de/12796/ A note on the ordinal canonical-correlation analysis of two sets of ranking scores] {{Wayback|url=http://mpra.ub.uni-muenchen.de/12796/ |date=20200918003744 }} (Also provides a FORTRAN program)- in J. of Quantitative Economics 7(2), 2009, pp. 173-199
*[http://ssrn.com/abstract=1331886 Representation-Constrained Canonical Correlation Analysis: A Hybridization of Canonical Correlation and Principal Component Analyses] (Also provides a FORTRAN program)- in J. of Applied Economic Sciences 4(1), 2009, pp. 115-124

[[Category:协方差与相关性]]
[[Category:多重变量分析]]