View source for "Mel scale"
{{cleanup-jargon|time=2013-04-26T10:06:09+00:00}}
[[Image:Mel-Hz_plot.svg|right|thumb|300px|Plot of mels versus hertz]]
[[Image:A440.png|thumb|A440 {{audio|A440.mid|Play}}. 440 Hz = 549.64 mels]]
The '''mel scale''' (Chinese: '''梅尔刻度''' or '''Mel尺度'''; {{lang-en|Mel scale}}) is a nonlinear scale, defined in terms of [[頻率 (物理學)|frequency]], that represents how the human ear perceives equal steps in [[音高|pitch]]. It was named by {{Link-en|Stanley Smith Stevens|Stanley Smith Stevens|Stevens}}, {{Link-en|John Volkman|John Volkman|Volkmann}}, and Newman in 1937.<ref name=stevens1937> {{cite journal |journal = Journal of the Acoustical Society of America |title = A scale for the measurement of the psychological magnitude pitch |author = Stevens, Stanley Smith; Volkmann, John; & Newman, Edwin B. |volume = 8 |issue = 3 |publisher = |pages = 185–190 |issn = |year = 1937 |url = http://asadl.org/jasa/resource/1/jasman/v8/i3/p185_s1 |deadurl = yes |archiveurl = https://archive.today/20130414065947/http://asadl.org/jasa/resource/1/jasman/v8/i3/p185_s1 |archivedate = 2013-04-14 }}</ref>

The mel scale can be converted approximately to and from the linear frequency scale in hertz (Hz). A commonly used formula for converting <math>f</math> hertz into <math>m</math> mels is:<ref>{{cite book | title = Speech communication: human and machine | author = Douglas O'Shaughnessy | publisher = Addison-Wesley | year = 1987 | isbn = 978-0-201-16520-3 | page = 150 | url = http://books.google.com/books?ei=rhFfSpa-BJOCNsTT4IUG&id=mHFQAAAAMAAJ&dq=Speech+Communications:+Human+and+Machine&q=2595#search_anchor | access-date = 2013-04-26 | archive-date = 2015-03-19 | archive-url = https://web.archive.org/web/20150319154042/http://books.google.com/books?ei=rhFfSpa-BJOCNsTT4IUG&id=mHFQAAAAMAAJ&dq=Speech+Communications:+Human+and+Machine&q=2595#search_anchor | dead-url = no }}</ref>
:<math>m = 2595 \log_{10}\left(1 + \frac{f}{700}\right)</math>

The reference point of the mel scale assigns 1000 mels to a 1000 Hz tone that is 40 [[分贝|dB]] above the listener's [[听阈|hearing threshold]]. Above about 500 Hz, ever larger frequency intervals are needed to produce equal perceived increments in pitch. As a result, four [[八度|octaves]] on the hertz scale above 500 Hz (one octave being a doubling of frequency) correspond to only about two octaves on the mel scale. The name '''mel''' comes from the word ''melody'', indicating that the scale is based on pitch comparisons.

==History and other formulas==
Historically, many different conversion formulas have been in use.<ref> {{cite book | title = Foundations of Modern Auditory Theory | editor = Jerry V. Tobias | publisher = Academic Press | volume = 1 | author = W. Dixon Ward | chapter = Musical Perception | year = 1970 | page = 412 | quote = no one claims yet to have determined 'the' mel scale.}}</ref> The common formula from O'Shaughnessy's book can be written with different logarithm bases:
:<math>m = 2595 \log_{10}\left(1 + \frac{f}{700}\right) = 1127 \log_e\left(1 + \frac{f}{700}\right) \ </math>
The corresponding inverse formulas are:
:<math>f = 700(10^{m/2595} - 1) = 700(e^{m/1127} - 1) \ </math>

After Steinberg published curves and tables of a pitch scale based on the [[最小可覺差|just-noticeable difference]] in 1937,<ref> {{cite journal | journal = Journal of the Acoustical Society of America | title = Positions of stimulation in the cochlea by pure tones | author = John C. Steinberg | volume = 8 | issue = 3 | publisher = | pages = 176–180 | issn = | year = 1937 | url = http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=JASMAN000008000003000176000001 }}</ref> many other curves were proposed using different experimental methods and analyses, including those of Fletcher and Munson (1937),<ref> {{cite journal | journal = Journal of the Acoustical Society of America | title = Relation Between Loudness and Masking | url = https://archive.org/details/sim_journal-of-the-acoustical-society-of-america_1937-07_9_1/page/1 | author = Harvey Fletcher and W. A. Munson | volume = 9 | pages = 1–10 | year = 1937 }}</ref> Fletcher (1938),<ref> {{cite journal | journal = Journal of the Acoustical Society of America | title = Loudness, Masking and Their Relation to the Hearing Process and the Problem of Noise Measurement | author = Harvey Fletcher | volume = 9 | pages = 275–293 | year = 1938 | issue = 4 | url = http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=JASMAN000009000004000275000001 }}</ref> Stevens et al. (1937),<ref name=stevens1937/> and Stevens and Volkmann (1940).<ref> {{cite journal | journal = American Journal of Psychology | title = The Relation of Pitch to Frequency: A Revised Scale | author = Stevens, S., and Volkmann, J. | volume = 53 | issue = 3 | pages = 329–353 | year = 1940 | url = http://www.jstor.org/pss/1417526 }}</ref>

In 1949, Koenig published an approximation built from separate linear and logarithmic segments, with the break between the two at 1000 Hz.<ref> {{cite journal | journal = Bell Telephone Laboratory Record | title = A new frequency scale for acoustic measurements | author = W. Koenig | volume = 27 | pages = 299–301 | year = 1949 }}</ref> Also in 1949, Gunnar Fant proposed the currently popular linear/logarithmic formula, but with a [[截止频率|corner frequency]] of 1000 Hz.<ref> Gunnar Fant (1949) "Analys av de svenska konsonantljuden : talets allmänna svängningsstruktur", LM Ericsson protokoll H/P 1064 </ref> In 1968, Fant published an alternative form of the formula that does not depend on the choice of logarithm [[底数(指数)|base]]:<ref>Fant, Gunnar. (1968). Analysis and synthesis of speech processes. In B. Malmberg (Ed.), ''Manual of phonetics'' (pp. 173-177). Amsterdam: North-Holland.</ref><ref>{{cite book | title = Techniques in speech acoustics | author = Jonathan Harrington and Steve Cassidy | publisher = Springer | year = 1999 | isbn = 978-0-7923-5731-5 | page = 18 | url = http://books.google.com/books?id=E1SyZZN8WQkC&pg=PA18 | access-date = 2013-04-26 | archive-date = 2015-03-19 | archive-url = https://web.archive.org/web/20150319142314/http://books.google.com/books?id=E1SyZZN8WQkC&pg=PA18 | dead-url = no }}</ref>
:<math>m = \frac{1000}{\log(2)} \log \left(1 + \frac{f}{1000}\right) \ </math>

In 1976, Makhoul and Cosell published the now-popular version with a corner frequency of 700 Hz.<ref>{{citation | work = ICASSP 1976 | title = LPCW: An LPC vocoder with linear predictive spectral warping | author = John Makhoul and Lynn Cosell | volume = 1 | publisher = IEEE | pages = 466–469 | year = 1976 | url = http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1170013 | accessdate = 2013-04-26 | archive-date = 2013-07-31 | archive-url = https://web.archive.org/web/20130731004330/http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1170013 | dead-url = no }}</ref> Ganchev et al. observe that, compared with Fant's 1000 Hz formula, the 700 Hz formula approximates the mel scale more closely below 1000 Hz, at the cost of greater error above 1000 Hz.<ref>{{citation | work = Proceedings of the SPECOM-2005 | title = Comparative evaluation of various MFCC implementations on the speaker verification task, | author = T. Ganchev, N. Fakotakis, and G. Kokkinakis | pages = 191–194 | year = 2005 | url = http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.8303 | accessdate = 2013-04-26 | archive-date = 2012-10-15 | archive-url = https://web.archive.org/web/20121015210511/http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.8303 | dead-url = no }}</ref> Above 7 kHz, however, the 700 Hz version again fits better.

The data behind these formulas were tabulated by Beranek in 1949 from the curves of Stevens and Volkmann:<ref>Beranek, Leo L. (1949). ''Acoustic measurements''. New York: McGraw-Hill.</ref>
{{Table
|type = class = "wikitable"
|title = Tabulated by Beranek (1949); data from Stevens and Volkmann (1940)
| row1 = '''Hz''' {{!!}} 20 {{!!}} 160 {{!!}} 394 {{!!}} 670 {{!!}} 1000 {{!!}} 1420 {{!!}} 1900 {{!!}} 2450 {{!!}} 3120 {{!!}} 4000 {{!!}} 5100 {{!!}} 6600 {{!!}} 9000 {{!!}} 14000
| row2 = '''mel''' {{!!}} 0 {{!!}} 250 {{!!}} 500 {{!!}} 750 {{!!}} 1000 {{!!}} 1250 {{!!}} 1500 {{!!}} 1750 {{!!}} 2000 {{!!}} 2250 {{!!}} 2500 {{!!}} 2750 {{!!}} 3000 {{!!}} 3250
}}

A formula with a break frequency of 625 Hz is given by Lindsay and Norman (1977) in ''Human information processing: An introduction to psychology'',<ref>Lindsay, Peter H.; & Norman, Donald A. (1977). ''Human information processing: An introduction to psychology'' (2nd ed.). New York: Academic Press.</ref> although it does not appear in the 1972 first edition of that book:
:<math>m = 2410 \log_{10}(1.6\times10^{-3} f + 1)</math>

Most of these formulas ensure that 1000 mels corresponds to 1000 Hz. The break frequency (for example 700 Hz, 1000 Hz, or 625 Hz) is then the only free parameter in these formulas. Some non-mel auditory-frequency-scale formulas use the same form but with a much lower break frequency, and do not necessarily map 1000 Hz to 1000 mels. For example, the [[Equivalent rectangular bandwidth|ERB-rate]] scale of Glasberg and Moore (1990) uses 228.8 Hz,<ref>B.C.J. Moore and B.R. Glasberg, "Suggested formulae for calculating auditory-filter bandwidths and excitation patterns" Journal of the Acoustical Society of America 74: 750-753, 1983.</ref> and the "cochlear frequency–place map" of Greenwood (1990) uses a break frequency of 165.3 Hz.<ref>Greenwood, D. D. (1990). A cochlear frequency–position function for several species—29 years later. ''The Journal of the Acoustical Society of America'', 87, 2592–2605.</ref>

Umesh et al. have explored other functional forms for the mel scale. Using the following measurements, which they took from Stevens and Volkmann's curves, they point out that the traditional formulas with a logarithmic region and a linear region do not fit those curves as well as some other forms do:<ref> {{citation | journal = Proc. ICASSP 1999 | title = Fitting the mel scale | author = Umesh, S. and Cohen, L. and Nelson, D. | publisher = IEEE | pages = 217–220 | isbn = 0-7803-5041-3 | year = 1999 | url = }}</ref>
{{Table
|type = class = "wikitable"
|title = Tabulated by Umesh et al. (1999); data from Stevens and Volkmann (1940)
| row1 = '''Hz''' {{!!}} 40 {{!!}} 161 {{!!}} 200 {{!!}} 404 {{!!}} 693 {{!!}} 867 {{!!}} 1000 {{!!}} 2022 {{!!}} 3000 {{!!}} 3393 {{!!}} 4109 {{!!}} 5526 {{!!}} 6500 {{!!}} 7743 {{!!}} 12000
| row2 = '''mel''' {{!!}} 43 {{!!}} 257 {{!!}} 300 {{!!}} 514 {{!!}} 771 {{!!}} 928 {{!!}} 1000 {{!!}} 1542 {{!!}} 2000 {{!!}} 2142 {{!!}} 2314 {{!!}} 2600 {{!!}} 2771 {{!!}} 2914 {{!!}} 3228
}}

==References==
{{reflist}}

==External links==
*[http://users.utu.fi/jyrtuoma/speech/Mel2Hz.html Hz–mel, mel–Hz conversion] {{Wayback|url=http://users.utu.fi/jyrtuoma/speech/Mel2Hz.html |date=20121125042523 }} (uses the O'Shaughnessy equation)
*[http://scitation.aip.org/dbt/dbt.jsp?KEY=JASMAN&Volume=8&Issue=3 J. Acoust. Soc. Am. table of contents] for Stevens et al. paper
*[http://www.sfu.ca/sonic-studio/handbook/Mel.html Handbook for Acoustic Ecology] {{Wayback|url=http://www.sfu.ca/sonic-studio/handbook/Mel.html |date=20130312043725 }}

==See also==
*[[巴克刻度|Bark scale]]
*[[梅尔频率倒谱系数|Mel-frequency cepstral coefficients]]
*{{Link-en|Fletcher–Munson curves|Fletcher–Munson curves}}

{{Acoustics}}
[[Category:刻度]]
[[Category:心理声学]]
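The conversion formulas in this article can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names (`hz_to_mel`, `mel_to_hz`, `scale_for_break`, `hz_to_mel_general`) are hypothetical and do not come from any cited source. The generalized form assumes the family of formulas in which 1000 Hz maps to 1000 mels, so that the break frequency is the only free parameter.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Hertz to mels using the common 700 Hz formula (2595 * log10(1 + f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel: f = 700 * (10^(m/2595) - 1)."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def scale_for_break(break_hz: float) -> float:
    """Hypothetical helper: base-10 multiplier that sends 1000 Hz to 1000 mels
    for a given break frequency."""
    return 1000.0 / math.log10(1.0 + 1000.0 / break_hz)


def hz_to_mel_general(f_hz: float, break_hz: float) -> float:
    """Same functional form with an arbitrary break frequency (assumption:
    the 1000 Hz <-> 1000 mel constraint fixes the multiplier)."""
    return scale_for_break(break_hz) * math.log10(1.0 + f_hz / break_hz)


if __name__ == "__main__":
    # A440 from the figure caption: about 549.64 mels with the 700 Hz formula.
    print(round(hz_to_mel(440.0), 2))
    # Multipliers implied by the three break frequencies named in the text.
    for fb in (700.0, 1000.0, 625.0):
        print(fb, round(scale_for_break(fb), 1))
```

Under that assumption, the multipliers for break frequencies of 700 Hz, 1000 Hz, and 625 Hz come out at roughly 2595, 3322 (that is, 1000/log<sub>10</sub> 2), and 2410, which agrees with the constants in the O'Shaughnessy, Fant, and Lindsay–Norman formulas above.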