查看“︁對偶性 (最佳化)”︁的源代码

在[[最优化]]理論中的'''對偶'''（duality）或'''對偶性原則'''（duality principle）是指[[最佳化問題]]可以用兩種觀點來看待的理論，兩種觀點分別是「原始問題」（primal problem）及「對偶問題」（dual problem）。對偶問題的解提供了原始問題（假設是最小化問題）的下限<ref name="Boyd">{{cite book|title=Convex Optimization|first1=Stephen P.|last1=Boyd|first2=Lieven|last2=Vandenberghe|year=2004|publisher=Cambridge University Press|isbn=978-0-521-83378-3|url=https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf#page=230|format=pdf|access-date=October 15, 2011|page=216|archive-date=2021-05-09|archive-url=https://web.archive.org/web/20210509212522/https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf#page=230|dead-url=no}}</ref>，不過一般而言，原始問題和對偶問題的最佳解不相同。兩個最佳解的差距為[[對偶間隙]]。若是[[凸優化]]問題，對偶間隙也稱為是[[卡鲁什-库恩-塔克条件]]。

==對偶問題==
一般而言「對偶問題」是指「拉格朗日對偶問題」（Lagrangian dual problem），不過也有其他的對偶問題，例如{{le|Wolfe對偶問題|Wolfe dual problem}}和{{le|Fenchel對偶定理|Fenchel's duality theorem|Fenchel對偶問題}}。拉格朗日對偶問題是指在最小化問題上加上[[拉格朗日乘数]]，也就是在目標函數上加上對應限制條件的非負[[拉格朗日乘数]]，再求解可將原目標函數最小的原始變數值。此解會得到以拉格朗日乘数的函數表示的原始變數，稱為是「對偶變數」（dual variables），因此，新的問題就是要衍生對偶變數的限制下（包括非負的限制條件），找到可以使目標函數最大化的對偶變數。

一般而言，給定二個{{le|對偶對|dual pair}}的[[豪斯多夫空间|分隔]]{{le|局部凸空間|locally convex space}} <math>\left(X,X^*\right)</math>及<math>\left(Y,Y^*\right)</math>，以及函數<math>f: X \to \mathbb{R} \cup \{+\infty\}</math>，可以定義原始問題為找到<math>\hat{x}</math>使得<math>f(\hat{x}) = \inf_{x \in X} f(x). \,</math>
換句話說，若<math>\hat{x}</math>存在，<math>f(\hat{x})</math>是函數<math>f</math>的[[极值|最小极值]]（minimum），也可以得到函數的[[最大下界]]（infimum）。

若有限制條件，可以整合到函數<math>f</math>中，方式是令<math>f = f + I_\text{constraints}</math>，其中<math>I_{\mathrm{constraints}}</math>是在<math>X</math>上的適當函數，在限制條件上有最小值0，可以證明<math>\inf_{x \in X} \tilde{f}(x) = \inf_{x \ \mathrm{constrained}} f(x)</math>。後者的條件很明顯，但不一定很方便可以符合[[示性函数 (凸分析)|示性函数]]（也就是說，滿足限制條件的<math> x </math>，<math>I_{\mathrm{constraints}}(x) = 0 </math>，在其他情形，<math>I_{\mathrm{constraints}}(x) = \infty </math>）。因此可以將<math>\tilde{f}</math>延伸為[[擾動函數]]<math>F: X \times Y \to \mathbb{R} \cup \{+\infty\}</math>使得<math>F(x,0) = \tilde{f}(x)</math><ref name="BWG">{{cite book |title=Duality in Vector Optimization |author1=Boţ, Radu Ioan |author2=Wanka, Gert|author3=Grad, Sorin-Mihai |year=2009 |publisher=Springer |isbn=978-3-642-02885-4 }}</ref>。

[[對偶間隙]]就是不等式
:<math>\sup_{y^* \in Y^*} -F^*(0,y^*) \le \inf_{x \in X} F(x,0), \,</math>
左側和右側的差。

其中<math>F^*</math>是二個變數的[[凸共轭]]，而<math>\sup</math>是[[上确界]]<ref name="BWG" /><ref>{{cite book |title=Overcoming the failure of the classical generalized interior-point regularity conditions in convex optimization. Applications of the duality theory to enlargements of maximal monotone operators |author=Csetnek, Ernö Robert |year=2010 |publisher=Logos Verlag Berlin GmbH |isbn=978-3-8325-2503-3 }}</ref><ref name="Zalinescu">{{cite book |last=Zălinescu |first=Constantin |title=Convex analysis in general vector spaces |url=https://archive.org/details/convexanalysisge00zali_934 |url-access=limited |publisher=World Scientific Publishing&nbsp;Co.,&nbsp;Inc. |pages=[https://archive.org/details/convexanalysisge00zali_934/page/n126 106]–113 |isbn=981-238-067-1 |mr=1921556 |issue=J |year=2002 |location=River Edge, NJ }}</ref>

== 線性規劃 ==
{{Main|對偶線性規劃}}
[[线性规划]]問題是指[[损失函数]]和[[約束 (數學)|約束]]都是[[線性關係]]的[[最优化]]問題。原始問題中，目標函數是''n''個變數的線性組合，問題有''m''個約束，每一個都有上限，上限由''n''個變數的線性組合表示。其目的是要在滿足限制條件的情形下，最大化目標函數的值。其解是由''n''個值表示的向量，可以讓目標函數有最大值。

在對偶問題中，目標函數是''m''個值的線性組合，這些是原始問題中''m''個限制條件的上下限。有''n''個對偶限制條件，每一個的下限都是由''m''個對偶變數的線性組合所表示。
=== 原始問題和對偶問題的關係===
針對線性的最佳化問題，在原始問題中每一個符合限制條件的次佳點，都有一個方向（或方向的[[线性子空间|子空间]]），可以往目標函數增加的方向移動。若往這類的方向移動，稱為除去{{le|候選解|candidate solution}}和一個或多個限制之間的鬆弛量（slack）。候選解中不可行的值即為超過一個或多個限制的值。

在對偶問題中，會將對偶向量和限制條件相乘，這些限制條件會決定原始問題中限制條件的位置。在對偶問題中變動對偶向量類似在原始問題中調整上限。要找到最小的上限。也就是說，要將對偶向量最小化，以移除限制的候選位置和實際最佳解之間的鬆弛量（slack）。對偶向量中的不可行值是指哪些太低的值。這會讓候選解的一個或多個限制條件落在排除實際最佳解的位置。

在[[线性规划]]中探討對偶性的方程式中，會對上述直覺有型式化的敘述。

== 非線性規劃 ==
[[非线性规划]]中的限制條件不一定是線性的，不過許多线性规划的原則還是適用。

為了確保可以簡單的識別非线性规划中的全域最大值，問題一般會要求函數要是凸函數，而且有緊緻的lower level sets。

由此可以看出[[卡鲁什-库恩-塔克条件]]的重要性。卡鲁什-库恩-塔克条件可以提供非線性規劃問題中識別局部最佳值的必要條件。也有一些額外的必要條件（約束規範，constraint qualifications）可以用來判斷可能有「最佳解」（是局部最佳解，但不一定是全域最佳解）的方式。
===強拉格朗日原則：拉格朗日對偶{{anchor|拉格朗日對偶}}===
<!-- linked from redirect [[Lagrange duality]] -->
考慮以下形式的非線性規劃：

:<math>\begin{align}
\text{minimize }    &f_0(x) \\
\text{subject to } &f_i(x) \leq 0,\ i \in \left \{1,\ldots,m \right \} \\
                    &h_i(x) = 0,\ i \in \left \{1,\ldots,p \right \}
\end{align}</math>

其定義域<math>\mathcal{D} \subset \mathbb{R}^n</math>有非空的內部。其拉格朗日函數<math>\Lambda: \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R} </math>定義為

: <math>\Lambda(x,\lambda,\nu) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x).</math>

向量<math>\lambda</math>和<math>\nu</math>是有關此問題的「對偶變數」（dual variables）或拉格朗日乘數向量（Lagrange multiplier vectors）。拉格朗日對偶函數（Lagrange dual function）<math>g:\mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R} </math>定義如下

: <math>g(\lambda,\nu) = \inf_{x\in\mathcal{D}} \Lambda(x,\lambda,\nu) = \inf_{x\in\mathcal{D}} \left ( f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \nu_i h_i(x) \right ).</math>

對偶函數''g''是凹函數，就算初始問題不是凸也會如此，因為是仿射函數的點狀最大下界。對偶函數提供初始問題最佳值<math>p^*</math>的下界：針對任意<math>\lambda \geq 0 </math>及任意<math>\nu</math>，可以得到<math>g(\lambda,\nu) \leq p^* </math>。

若滿足約束規範（例如{{le|斯萊特條件|Slater's condition}}），且原問題是凸，則可以得到{{le|強對偶|strong duality}}<math>d^* = \max_{\lambda \ge 0, \nu} g(\lambda,\nu) = \inf f_0 = p^*</math>。
=== 凸問題 ===
針對有不等式限制的凸最小化問題
: <math>\begin{align}
&\underset{x}{\operatorname{minimize}}& & f(x) \\
&\operatorname{subject\;to}
& &g_i(x) \leq 0, \quad i = 1,\ldots,m
\end{align}</math>
拉格朗日對偶問題為
: <math>\begin{align}
&\underset{u}{\operatorname{maximize}}& & \inf_x \left(f(x) + \sum_{j=1}^m u_j g_j(x)\right) \\
&\operatorname{subject\;to}
& &u_i \geq 0, \quad i = 1,\ldots,m
\end{align}</math>
其中目標函數是拉格朗日對偶函數。假設函數<math>f</math> and <math>g_1, \ldots, g_m</math> 連續可微，最大下界出現在梯度等於零的位置。問題  
: <math>\begin{align}
&\underset{x, u}{\operatorname{maximize}}& & f(x) + \sum_{j=1}^m u_j g_j(x) \\
&\operatorname{subject\;to}
& & \nabla f(x) + \sum_{j=1}^m u_j \, \nabla g_j(x) = 0 \\
&&&u_i \geq 0, \quad i = 1,\ldots,m
\end{align}</math>
稱為Wolfe對偶問題。此問題用計算的方式處理可能會很困難，因為目標函數在聯合變數<math>(u,x)</math>上不是凹。而且，一般而言，等式的限制條件<math>\nabla f(x) + \sum_{j=1}^m u_j \, \nabla g_j(x)</math>也是非線性的，因此Wolfe對偶問題一般而言會不會是凸最佳化問題。不論如何，{{le|弱對偶|weak duality}}會成立<ref>{{cite journal |last1=Geoffrion |first1=Arthur M. |title=Duality in Nonlinear Programming: A Simplified Applications-Oriented Development | jstor=2028848 |journal=SIAM Review |volume=13 |year=1971 |pages=1–37 |issue=1 |doi=10.1137/1013001}}</ref>。

== 歷史 ==
依照[[喬治·伯納德·丹齊格]]所述，在丹齊格提出了線性規劃問題後，[[约翰·冯·诺伊曼]]就提出了線性規劃的對偶性定理的猜想。冯·诺伊曼注意到他使用了來自其[[博弈论]]中的資訊，並且猜想兩人零和的矩陣博弈和線性規劃等效。嚴謹的證明最早是由[[阿尔伯特·塔克]]和其團隊發表（丹齊格在Nering和塔克1993年著作中的前言）

==相關條目==
* [[凸共轭]]
* [[对偶 (数学)]]
* {{le|Relaxation (近似)|Relaxation (approximation)}}

==腳註==
{{Reflist}}
== 參考資料 ==
===書籍===
* {{cite book |author1=Ahuja, Ravindra K. |author2=Magnanti, Thomas L.  |author3=Orlin, James B. |title=Network Flows: Theory, Algorithms and Applications |url=https://archive.org/details/networkflowstheo0000ahuj |publisher=Prentice Hall |year=1993 |isbn=0-13-617549-X }}
* {{cite book|title=Convex Analysis and Optimization|last1=Bertsekas|first1=Dimitri|last2=Nedic|first2=Angelia|last3=Ozdaglar|first3=Asuman|year=2003|publisher=Athena Scientific|isbn=1-886529-45-0}}
* {{cite book |last1=Bertsekas |first1=Dimitri P. |year=1999 |title=Nonlinear Programming |edition=2nd |publisher=Athena Scientific |isbn=1-886529-00-0 }}
* {{cite book |last1=Bertsekas |first1=Dimitri P. |year=2009 |title=Convex Optimization Theory |publisher=Athena Scientific |isbn=978-1-886529-31-1 }}
* {{cite book |last1=Bonnans |first1=J.&nbsp;Frédéric |last2=Gilbert |first2=J.&nbsp;Charles |last3=Lemaréchal |first3=Claude |last4=Sagastizábal |first4=Claudia&nbsp;A. |title=Numerical optimization: Theoretical and practical aspects |url=https://www.springer.com/mathematics/applications/book/978-3-540-35445-1 |edition=Second revised ed. of translation of 1997 <!-- ''Optimisation numérique : Aspects théoriques et pratiques'' -->French |series=Universitext |publisher=Springer-Verlag |location=Berlin |year=2006 |pages=xiv+490 |isbn=3-540-35445-X |doi=10.1007/978-3-540-35447-5 |mr=2265882 |access-date=2021-05-15 |archive-date=2013-07-19 |archive-url=https://web.archive.org/web/20130719050442/http://www.springer.com/mathematics/applications/book/978-3-540-35445-1 |dead-url=no }}
*{{cite book |first1=William J. |last1=Cook |first2=William H. |last2=Cunningham |first3=William R. |last3=Pulleyblank |first4=Alexander |last4=Schrijver |title=Combinatorial Optimization |publisher=John Wiley & Sons |edition=1st |date=November 12, 1997 |isbn =0-471-55894-X }}
* {{cite book |last1=Dantzig |first1=George B. | year= 1963 |title=Linear Programming and Extensions |url=https://archive.org/details/linearprogrammin00dant |url-access=registration |publisher=Princeton University Press |location=Princeton, NJ }}
* {{cite book |last1=Hiriart-Urruty |first1=Jean-Baptiste |last2=Lemaréchal |first2=Claude |title=Convex analysis and minimization algorithms, Volume&nbsp;I: Fundamentals |series=Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] |volume=305 |publisher=Springer-Verlag |location=Berlin |year=1993 |pages=xviii+417 |isbn=3-540-56850-6 |mr=1261420 }}
*{{cite book |last1=Hiriart-Urruty |first1=Jean-Baptiste |last2=Lemaréchal |first2=Claude |chapter=14 Duality for Practitioners |title=Convex analysis and minimization algorithms, Volume&nbsp;II: Advanced theory and bundle methods |series=Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] |volume=306 |publisher=Springer-Verlag |location=Berlin |year=1993 |pages=xviii+346 |isbn=3-540-56852-2 |mr=1295240 }}
*{{cite book |last=Lasdon |first=Leon&nbsp;S. |title=Optimization theory for large systems |url=https://archive.org/details/optimizationtheo0000lasd |publisher=Dover Publications, Inc. |location=Mineola, New York |year=2002 |orig-year=Reprint of the 1970 Macmillan |pages=xiii+523 |mr=1888251 |isbn=978-0-486-41999-2 }}
* {{cite book |last=Lawler |first=Eugene |title=Combinatorial Optimization: Networks and Matroids |chapter=4.5. Combinatorial Implications of Max-Flow Min-Cut Theorem, 4.6. Linear Programming Interpretation of Max-Flow Min-Cut Theorem |year=2001 |publisher=Dover |isbn=0-486-41453-1 |pages=117–120 }}
* {{cite book |last=Lemaréchal |first=Claude |chapter=Lagrangian relaxation |pages=112–156 |title=Computational combinatorial optimization: Papers from the Spring School held in Schloß Dagstuhl, May&nbsp;15–19,&nbsp;2000 |editor=Jünger, Michael |editor2=Naddef, Denis |series=Lecture Notes in Computer Science (LNCS) |volume=2241 |publisher=Springer-Verlag |location=Berlin |year=2001 |isbn=3-540-42877-1 |mr=1900016 |doi=10.1007/3-540-45586-8_4 }}
* {{cite book |last=Minoux |first=Michel |title=Mathematical programming: Theory and algorithms |url=https://archive.org/details/mathematicalprog0000mmin |others= Egon Balas (forward); Steven Vajda (trans) from the (1983 Paris: Dunod) French |publisher=A Wiley-Interscience Publication. John Wiley & Sons, Ltd. |location=Chichester |year=1986 |pages=xxviii+489 |isbn=0-471-90170-9 |mr=868279 |id=(2008 Second ed., in French: ''Programmation mathématique : Théorie et algorithmes'', Éditions Tec & Doc, Paris, 2008. xxx+711 pp.  )) }}
* {{cite book |last1=Nering |first1=Evar D. |last2= Tucker |first2=Albert W. |year=1993 |title=Linear Programming and Related Problems |url=https://archive.org/details/linearprogramsre0000neri |url-access=registration |publisher=Academic Press |location=Boston, MA |isbn=978-0-12-515440-6 }}
*{{cite book |first1=Christos H. |last1=Papadimitriou |first2=Kenneth |last2=Steiglitz |title=Combinatorial Optimization : Algorithms and Complexity |publisher=Dover |edition=Unabridged |date=July 1998  |isbn=0-486-40258-4 }}
* {{cite book|last=Ruszczyński|first=Andrzej|title=Nonlinear Optimization|publisher=<!-- [[Princeton University Press]] -->[[普林斯頓大學出版社]]|location=Princeton, NJ|year=2006|pages=xii+454|isbn=978-0691119151 |mr=2199043}}

===論文===
* {{cite journal |first=Hugh, III |last=Everett |url=http://or.journal.informs.org/cgi/reprint/11/3/399 |title=Generalized Lagrange multiplier method for solving problems of optimum allocation of resources |doi=10.1287/opre.11.3.399 |journal=Operations Research |volume=11 |year=1963 |pages=399–417 |mr=152360 |jstor=168028 |issue=3 |url-status=yes |archive-url=https://web.archive.org/web/20110724151508/http://or.journal.informs.org/cgi/reprint/11/3/399 |archive-date=2011-07-24 |access-date=2021-05-15 }}
* {{cite journal |last1=Kiwiel |first1=Krzysztof&nbsp;C. |last2=Larsson |first2=Torbjörn |last3=Lindberg |first3=P.&nbsp;O. |title=Lagrangian relaxation via ballstep subgradient methods |mr=2348241 |journal=Mathematics of Operations Research |volume=32 |date=August 2007 |url=http://mor.journal.informs.org/cgi/content/abstract/32/3/669 |issue=3 |pages=669–686 |doi=10.1287/moor.1070.0261 |access-date=2021-05-15 |archive-date=2011-07-26 |archive-url=https://web.archive.org/web/20110726180807/http://mor.journal.informs.org/cgi/content/abstract/32/3/669 |dead-url=yes }}
* [http://www.civilized.com/files/duality.pdf Duality in Linear Programming] {{Wayback|url=http://www.civilized.com/files/duality.pdf |date=20210423190331 }} Gary D. Knott

[[Category:線性規劃]]
[[Category:凸最佳化]]