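A '''smooth maximum''' is a [[光滑函数|smooth approximation]] to the maximum function <math>\max(x_1,\ldots,x_n)</math>. More precisely, it is a [[参数族|parametric family]] of functions <math>m_\alpha(x_1,\ldots,x_n)</math> such that the function {{tmath|m_\alpha}} is smooth for every value of the parameter {{Mvar|α}}, and the family converges to the maximum function: {{tmath|m_\alpha \to \max}} as {{tmath|\alpha\to\infty}}. The concept of a '''smooth minimum''' is defined analogously. In many cases a single family approximates both: it tends to the maximum as the parameter goes to positive infinity and to the minimum as the parameter goes to negative infinity; in symbols, {{tmath|m_\alpha \to \max}} as {{tmath|\alpha \to \infty}} and {{tmath|m_\alpha \to \min}} as {{tmath|\alpha \to -\infty}}. The term is also used loosely for any smooth function that behaves like the maximum function, without necessarily belonging to such a parametric family.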

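== Examples ==
[[File:Smoothmax.png|thumb|Smooth maximum of (−''x'', ''x'') plotted against ''x'' for several parameter values: very smooth for <math>\alpha</math> = 0.5, sharper for <math>\alpha</math> = 8.]]
For large positive values of the parameter <math>\alpha > 0</math>, the following formula is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum function.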

: <math>
\mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}
</math>

<math>\mathcal{S}_\alpha</math> has the following properties:

# <math>\mathcal{S}_\alpha\to \max</math> as <math>\alpha\to\infty</math>
# <math>\mathcal{S}_0</math> is the [[算术平均数|arithmetic mean]] of its inputs
# <math>\mathcal{S}_\alpha\to \min</math> as <math>\alpha\to -\infty</math>
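
The three properties can be checked numerically. The following is a minimal sketch, not a reference implementation: the function name <code>smooth_max</code> and the sample inputs are chosen here for illustration only. The exponents are shifted by their largest value before exponentiation, which leaves the ratio unchanged but avoids overflow for large <math>|\alpha|</math>.

```python
import math

def smooth_max(xs, alpha):
    """S_alpha(xs): average of xs weighted by exp(alpha * x_i).

    Hypothetical helper name, introduced for this illustration.
    """
    # Subtracting the largest exponent cancels between numerator and
    # denominator, so the result is unchanged but exp() cannot overflow.
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    return sum(x * w for x, w in zip(xs, weights)) / total

xs = [1.0, 2.0, 4.0]
mean_value = smooth_max(xs, 0.0)    # alpha = 0: arithmetic mean of xs
near_max = smooth_max(xs, 50.0)     # large positive alpha: close to max(xs)
near_min = smooth_max(xs, -50.0)    # large negative alpha: close to min(xs)
```

With these inputs, <code>mean_value</code> equals 7/3 up to rounding, while <code>near_max</code> and <code>near_min</code> agree with 4 and 1 to within floating-point precision.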

The gradient of <math>\mathcal{S}_{\alpha}</math> is closely related to the [[Softmax函数|softmax]] function and is given by

: <math>
\nabla_{x_i}\mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} [1 + \alpha(x_i - \mathcal{S}_\alpha (x_1,\ldots,x_n))].
</math>

This makes the softmax function useful for optimization techniques that use [[梯度下降法|gradient descent]].
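
As a sanity check on the gradient formula, the sketch below compares the closed form with central finite differences. All function names, the step size <code>h</code>, and the test point are assumptions made for this example.

```python
import math

def smooth_max(xs, alpha):
    """S_alpha(xs), computed with shifted exponents to avoid overflow."""
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    return sum(x * w for x, w in zip(xs, weights)) / total

def smooth_max_grad(xs, alpha):
    """Closed form: softmax_i(alpha * xs) * (1 + alpha * (x_i - S_alpha))."""
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    s = sum(x * w for x, w in zip(xs, weights)) / total
    return [(w / total) * (1.0 + alpha * (x - s)) for x, w in zip(xs, weights)]

def numeric_grad(xs, alpha, h=1e-6):
    """Central finite differences, used only to cross-check the closed form."""
    grad = []
    for i in range(len(xs)):
        up = list(xs)
        up[i] += h
        down = list(xs)
        down[i] -= h
        grad.append((smooth_max(up, alpha) - smooth_max(down, alpha)) / (2 * h))
    return grad

xs = [0.5, -1.0, 2.0]
analytic = smooth_max_grad(xs, 1.5)
numeric = numeric_grad(xs, 1.5)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
```

The components of the gradient sum to 1, because the softmax weights sum to 1 and the weighted average of <math>x_i - \mathcal{S}_\alpha</math> is zero.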

=== LogSumExp ===
{{Main|LogSumExp}}

Another smooth maximum is [[LogSumExp]]:

: <math>
\mathrm{LSE}(x_1, \ldots, x_n) = \log( \exp(x_1) + \ldots + \exp(x_n))
</math>

If the <math>x_i</math> are all non-negative, this can be normalized to yield a function with domain <math>[0,\infty)^n</math> and range <math>[0, \infty)</math>:

: <math>
g(x_1, \ldots, x_n) = \log( \exp(x_1) + \ldots + \exp(x_n) - (n - 1) )
</math>

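The <math>(n - 1)</math> term corrects for the fact that <math>\exp(0) = 1</math> by cancelling out all but one of the zero exponentials, so that the result is <math>\log 1 = 0</math> when every <math>x_i</math> is zero.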
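
Both functions can be sketched in a few lines. The helper names below are hypothetical, and the shift by the largest input in <code>logsumexp</code> is a common stabilization trick assumed here rather than something the formulas above prescribe. The normalized variant is written naively and can still overflow for very large inputs.

```python
import math

def logsumexp(xs):
    """log(exp(x_1) + ... + exp(x_n)), evaluated without overflow."""
    # Factor out exp(m) with m = max(xs); every remaining exponent is <= 0.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def normalized_lse(xs):
    """g(xs) = log(sum(exp(x_i)) - (n - 1)) for non-negative inputs.

    Naive evaluation: fine for moderate inputs, may overflow for huge ones.
    """
    if any(x < 0 for x in xs):
        raise ValueError("inputs must be non-negative")
    return math.log(sum(math.exp(x) for x in xs) - (len(xs) - 1))

big = logsumexp([1000.0, 1000.0])        # math.exp(1000.0) alone would overflow
zero = normalized_lse([0.0, 0.0, 0.0])   # log(3 - 2) = log(1)
single = normalized_lse([0.0, 2.0, 0.0]) # the two zero inputs contribute nothing
```

LogSumExp always lies between <math>\max_i x_i</math> and <math>\max_i x_i + \log n</math>, so the approximation error is at most <math>\log n</math>.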

=== p-norm ===
Another smooth maximum is the [[Lp空间|p-norm]]:

: <math>
|| (x_1, \ldots, x_n) ||_p = \left( |x_1|^p + \cdots + |x_n|^p \right)^{1/p}
</math>

which converges to <math>|| (x_1, \ldots, x_n) ||_\infty = \max_{1\leq i\leq n} |x_i| </math> as <math>p \to \infty</math>.

An advantage of the p-norm is that it is a [[范数|norm]]. As such, it is scale invariant (homogeneous): <math>|| (\lambda x_1, \ldots, \lambda x_n) ||_p = |\lambda| \times || (x_1, \ldots, x_n) ||_p </math>, and it satisfies the triangle inequality.
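
The convergence and the homogeneity property can be illustrated with a short sketch. The function name <code>p_norm</code> and the sample vector are assumptions for this example; dividing by the largest absolute value before raising to the power <math>p</math> is a rescaling chosen here to keep large exponents from overflowing.

```python
import math

def p_norm(xs, p):
    """(|x_1|^p + ... + |x_n|^p)^(1/p), rescaled to avoid overflow."""
    scale = max(abs(x) for x in xs)
    if scale == 0.0:
        return 0.0
    # Each ratio |x_i| / scale lies in [0, 1], so the powers stay bounded.
    return scale * sum((abs(x) / scale) ** p for x in xs) ** (1.0 / p)

xs = [3.0, -4.0, 1.0]
euclidean = p_norm(xs, 2)                    # sqrt(9 + 16 + 1)
near_max = p_norm(xs, 200)                   # close to max |x_i| = 4
scaled = p_norm([-2.0 * x for x in xs], 3)   # equals 2 * p_norm(xs, 3)
```

Note that the p-norm approximates the maximum of the absolute values <math>|x_i|</math>, so it matches <math>\max_i x_i</math> only when the largest entry in absolute value is non-negative.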

== Numerical methods ==

== Other choices of smoothing function ==

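: <math>
\operatorname{max}_\alpha (x_1,x_2) = 0.5 \left( (x_1+x_2) + \sqrt{ (x_1-x_2)^2 + \alpha } \right)
</math>

Here <math>\alpha \geq 0</math> is a parameter. Setting <math>\alpha = 0</math> recovers the exact maximum, since <math>\max(x_1,x_2) = 0.5 \left( (x_1+x_2) + |x_1-x_2| \right)</math>.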
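
A minimal sketch of this two-argument formula follows; the function name <code>sqrt_smooth_max</code> and the sample values are chosen here for illustration only.

```python
import math

def sqrt_smooth_max(x1, x2, alpha):
    """0.5 * ((x1 + x2) + sqrt((x1 - x2)^2 + alpha)), with alpha >= 0."""
    return 0.5 * ((x1 + x2) + math.sqrt((x1 - x2) ** 2 + alpha))

exact = sqrt_smooth_max(3.0, -1.0, 0.0)    # alpha = 0 reduces to max(3, -1)
smoothed = sqrt_smooth_max(3.0, -1.0, 0.1) # slightly above the true maximum
at_tie = sqrt_smooth_max(2.0, 2.0, 0.04)   # overshoot at a tie is sqrt(alpha) / 2
```

For <math>\alpha > 0</math> the value always exceeds the true maximum, and the gap is largest, namely <math>\sqrt{\alpha}/2</math>, where the two arguments coincide.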

== See also ==

* [[LogSumExp]]
* [[Softmax函数|Softmax function]]
* [[幂平均|Generalized mean]]

== References ==
{{Reflist}} 
M. Lange, D. Zühlke, O. Holz, and T. Villmann, "Applications of lp-norms and their smooth approximations for gradient based learning vector quantization," in Proc. ESANN, Apr. 2014, pp. 271-276. (https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2014-153.pdf {{Wayback|url=https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2014-153.pdf |date=20191017122812 }})
 
[[Category:集合論基本概念]] [[Category:數學表示法]]