Home >Backend Development >PHP Tutorial >PHP implements text-based Morse code generator_PHP tutorial
Introduction
I recently encountered a need to generate Morse code audio files based on input text. After a few fruitless searches, I decided to write my own generator.
Download source code – 2.63 KB
Because I want to access my Morse code audio files via the web, I decided to use PHP as my main programming language. The screenshot above shows a web page starting to generate Morse code. The downloaded zip file contains a web page for submitting text and a PHP source file for generating and displaying audio files. If you want to test PHP code, you need to copy the web page and related PHP files to a PHP-enabled server.
For many people, Morse Code is just a sequence of "dots" and "dashes" or a series of beeps as shown in some old movies. Obviously, this understanding is not enough if you want to use computer code to generate Morse code. This article will introduce the elements of generating Morse code, how to generate audio files in WAVE format, and how to use PHP to convert Morse code into audio files.
Morse Code
Morse code is a text encoding method. Its advantage is that it is easy to encode and can be easily decoded by the human ear. Essentially, audio or radio frequencies are turned on and off to create short or long audio pulses, commonly known as dots and dashes, or in radio terminology, clicks and clicks. In modern digital communications terms, Morse code is a type of amplitude shift keying (ASK).
In Morse code, the characters letters, numbers, punctuation marks and special symbols) are encoded into a sequence of "ticks" and "dahs". So in order to convert text into Morse code, we first need to determine how to represent "tick" and "dah". An obvious choice is to use 0 for "tick" and 1 for "dah", or vice versa. Unfortunately, Morse code uses a variable-length encoding scheme. Therefore, we must also use a variable-length sequence, or adopt a method to pack the data into a fixed bit-size (fixed bit-size) format common to computer memory. In addition, it is important to note that Morse code does not distinguish between upper and lower case letters, and cannot encode some special symbols. In our implementation, undefined characters and symbols are ignored.
In this project, memory usage is not a special consideration. Therefore, we propose a simple encoding scheme, that is, use "0" to represent each "tick" and "1" to represent each "dah", and put them in a string associative array. The PHP code that defines the Morse code encoding table is as follows:
<ol class="dp-j"><li class="alt"><span><span>$CWCODE = array (</span><span class="string">'A'</span><span>=></span><span class="string">'01'</span><span>,</span><span class="string">'B'</span><span>=></span><span class="string">'1000'</span><span>,</span><span class="string">'C'</span><span>=></span><span class="string">'1010'</span><span>,</span><span class="string">'D'</span><span>=></span><span class="string">'100'</span><span>,</span><span class="string">'E'</span><span>=></span><span class="string">'0'</span><span>, </span></span></li><li><span> <span class="string">'F'</span><span>=></span><span class="string">'0010'</span><span>,</span><span class="string">'G'</span><span>=></span><span class="string">'110'</span><span>,</span><span class="string">'H'</span><span>=></span><span class="string">'0000'</span><span>,</span><span class="string">'I'</span><span>=></span><span class="string">'00'</span><span>,</span><span class="string">'J'</span><span>=></span><span class="string">'0111'</span><span>, </span></span></li><li class="alt"><span> <span class="string">'K'</span><span>=></span><span class="string">'101'</span><span>,</span><span class="string">'L'</span><span>=></span><span class="string">'0100'</span><span>,</span><span class="string">'M'</span><span>=></span><span class="string">'11'</span><span>,</span><span class="string">'N'</span><span>=></span><span class="string">'10'</span><span>, </span><span class="string">'O'</span><span>=></span><span class="string">'111'</span><span>, </span></span></li><li><span> <span class="string">'P'</span><span>=></span><span class="string">'0110'</span><span>,</span><span class="string">'Q'</span><span>=></span><span class="string">'1101'</span><span>,</span><span class="string">'R'</span><span>=></span><span class="string">'010'</span><span>,</span><span class="string">'S'</span><span>=></span><span class="string">'000'</span><span>,</span><span class="string">'T'</span><span>=></span><span class="string">'1'</span><span>, </span></span></li><li class="alt"><span> <span class="string">'U'</span><span>=></span><span class="string">'001'</span><span>,</span><span class="string">'V'</span><span>=></span><span class="string">'0001'</span><span>,</span><span class="string">'W'</span><span>=></span><span class="string">'011'</span><span>,</span><span class="string">'X'</span><span>=></span><span class="string">'1001'</span><span>,</span><span class="string">'Y'</span><span>=></span><span class="string">'1011'</span><span>, </span></span></li><li><span> <span class="string">'Z'</span><span>=></span><span class="string">'1100'</span><span>, </span><span class="string">'0'</span><span>=></span><span class="string">'11111'</span><span>,</span><span class="string">'1'</span><span>=></span><span class="string">'01111'</span><span>,</span><span class="string">'2'</span><span>=></span><span class="string">'00111'</span><span>, </span></span></li><li class="alt"><span> <span class="string">'3'</span><span>=></span><span class="string">'00011'</span><span>,</span><span class="string">'4'</span><span>=></span><span class="string">'00001'</span><span>,</span><span class="string">'5'</span><span>=></span><span class="string">'00000'</span><span>,</span><span class="string">'6'</span><span>=></span><span class="string">'10000'</span><span>, </span></span></li><li><span> <span class="string">'7'</span><span>=></span><span class="string">'11000'</span><span>,</span><span class="string">'8'</span><span>=></span><span class="string">'11100'</span><span>,</span><span class="string">'9'</span><span>=></span><span class="string">'11110'</span><span>,</span><span class="string">'.'</span><span>=></span><span class="string">'010101'</span><span>, </span></span></li><li class="alt"><span> <span class="string">','</span><span>=></span><span class="string">'110011'</span><span>,</span><span class="string">'/'</span><span>=></span><span class="string">'10010'</span><span>,</span><span class="string">'-'</span><span>=></span><span class="string">'10001'</span><span>,</span><span class="string">'~'</span><span>=></span><span class="string">'01010'</span><span>, </span></span></li><li><span> <span class="string">'?'</span><span>=></span><span class="string">'001100'</span><span>,</span><span class="string">'@'</span><span>=></span><span class="string">'00101'</span><span>); </span></span></li></ol>
It should be noted that if you are particularly concerned about memory usage, the above code can be interpreted as bit). By adding a start bit to each code, a bit pattern can be formed, and each character can be stored in one byte. At the same time, when parsing the final encoding, the bits to the left of the starting bit are deleted) to obtain a true variable-length encoding.
Although many people don’t realize it, the “time interval” is in fact the main factor that defines Morse code, so understanding this is key to generating Morse code. Therefore, the first thing we have to do is to define the internal code of Morse code (ie, the time interval between "tick" and "dah"). For the sake of convenience, we define the length of a "tick" sound as a time unit dt, and the interval between "tick" and "dah" is also a time unit dt; define the length of a "dah" as 3 dt, and the characters letters ) is also 3 dt; the definition word words) is 7 dt apart. So, to sum it up, our time interval table looks like this:
Project |
Length of time |
Di |
dt |
The interval between "tick"/"dah" |
dt |
“Ta” |
3*dt |
The space between characters |
3*dt |
Space between words |
7*dt |
在莫斯代码中,编码声音的“播放速度”通常用 单词数/分钟(WPM) 来表示。由于英文单词有不同的长度,而且字符也有不同数量的“嘀”和“嗒”,所以,从WPM转化成音频)数字采样并不是看上去那样简单。在一份被国际组 织采用的方案中,采用5个字符作为单词的平均长度,同时,一个数字或标点符号被当做2个字符。这样,平均一个单词就是50个时间单位dt。这样,如果你指 定了WPM,那么我们总的播放时间就是 50 * WPM的时间单位/分钟,每个“嘀”即一个时间单位dt)的长度等于1.2/WPM秒。这样,给出一个“嘀”的时间长度,其他元素的时间长度很容易就能 够计算出来。
你可能已经注意到,在上面显示的网页中,对于低于15WPM的选项,我们使用了“Farnsworth spacing”。那么这个“Farnsworth spacing”又是个什么鬼?
当报务员学习用耳朵来解码莫斯代码的时候,他就会意识到,当播放速度变化的时候,字符出现的节奏也会跟着变化。当播放速度低于10WPM的时候,他 能够从容的识别“嘀”和“嗒”,并且知道发送的哪个字符。但是当播放速度超过10WPM的时候,报务员的识别就会出错,他识别出来的字符会多于实际的 “嘀”和“嗒”。当一个学习的时候习惯低速莫斯代码的人,在处理高速播放代码的时候,就会出现问题。因为节奏变了,他潜意识的识别就会出错。
为了解决这个问题,“Farnsworth spacing”就被发明出来了。本质上来讲,字母和符号的播放速度依然采取高于15WPM的速度,同时,通过在字符之间插入更多的空格,来使整体的播放 速度降低。这样,报务员就能够以一个合理的速度和节奏来识别每个字符,一旦所有的字符都学习完毕,就可以增加速度,而接收员只需要加快识别字符的速度就可 以了。本质上来说,“Farnsworth spacing”这个技巧解决了节奏变化这个问题,使接收员能够快速学习。
所以,在整个系统中,对于更低的播放速度,都统一成15WPM。相对应的,一个“嘀”的长度是0.08秒,但是字符之间和单词之间的间隔就不再是3个dit或者7个dit,而是进行的调整以适应整体速度。
生成声音
在PHP代码中,一个字符即前面数组的索引)代表一组由“嘀”、“嗒”和空白间隔组成的莫斯声音。我们用数字采样来组成音频序列,并且将其写入到文件中,同时加上适当的头信息来将其定义成WAVE格式。
生成声音的代码其实相当简单,你可以在项目中PHP文件中找到它们。我发现定义一个“数字振荡器”相当方便。每调用一次osc(),它就会返回一个 从正玄波产生的定时采样。运用声音采样和声频规范,生成WAVE格式的音频已经足够了。在产生的正玄波中的-1到+1之间是被移动和调整过的,这样声音的 字节数据可以用0到255来表示,同时128表示零振幅。
同时,在生成声音方面我们还要考虑另外一个问题。一般来讲,我们是通过正玄波的开关来生成莫斯代码。但是你直接这样来做的话,就会发现你生成的信号会占用非常大的带宽。所以,通常无线电设备会对其加以修正,以减少带宽占用。
在我们的项目中,也会做这样的修正,只不过是用数字的方式。既然我们已经知道了一个最小声音样本“嘀”的时间长度,那么,可以证明,最小带宽的声幅 发生在长度等于“嘀”的正玄波半周期。事实上,我们使用低通滤波器low pass filter)来过滤音频信号也能达到同样的效果。不过,既然我们已经知道所有的信号字符,我们直接简单的过滤一下每一个字符信号就可以了。
生成“嘀”、“嗒”和空白信号的PHP代码就像下面这样:
<ol class="dp-j"><li class="alt"><span><span class="keyword">while</span><span> ($dt < $DitTime) { </span></span></li><li><span> $x = Osc(); </span></li><li class="alt"><span> <span class="keyword">if</span><span> ($dt < (</span><span class="number">0.5</span><span>*$DitTime)) { </span></span></li><li><span> <span class="comment">// Generate the rising part of a dit and dah up to half the dit-time</span><span> </span></span></li><li class="alt"><span> $x = $x*sin((M_PI/<span class="number">2.0</span><span>)*$dt/(</span><span class="number">0.5</span><span>*$DitTime)); </span></span></li><li><span> $ditstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li class="alt"><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li><span> } </span></li><li class="alt"><span> <span class="keyword">else</span><span> </span><span class="keyword">if</span><span> ($dt > (</span><span class="number">0.5</span><span>*$DitTime)) { </span></span></li><li><span> <span class="comment">// For a dah, the second part of the dit-time is constant amplitude</span><span> </span></span></li><li class="alt"><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li><span> <span class="comment">// For a dit, the second half decays with a sine shape</span><span> </span></span></li><li class="alt"><span> $x = $x*sin((M_PI/<span class="number">2.0</span><span>)*($DitTime-$dt)/(</span><span class="number">0.5</span><span>*$DitTime)); </span></span></li><li><span> $ditstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li class="alt"><span> } </span></li><li><span> <span class="keyword">else</span><span> { </span></span></li><li class="alt"><span> $ditstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li class="alt"><span> } </span></li><li><span> <span class="comment">// a space has an amplitude of 0 shifted to 128</span><span> </span></span></li><li class="alt"><span> $spcstr .= chr(<span class="number">128</span><span>); </span></span></li><li><span> $dt += $sampleDT; </span></li><li class="alt"><span> } </span></li><li><span><span class="comment">// At this point the dit sound has been generated</span><span> </span></span></li><li class="alt"><span><span class="comment">// For another dit-time unit the dah sound has a constant amplitude</span><span> </span></span></li><li><span>$dt = <span class="number">0</span><span>; </span></span></li><li class="alt"><span><span class="keyword">while</span><span> ($dt < $DitTime) { </span></span></li><li><span> $x = Osc(); </span></li><li class="alt"><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li><span> $dt += $sampleDT; </span></li><li class="alt"><span> } </span></li><li><span><span class="comment">// Finally during the 3rd dit-time, the dah sound must be completed</span><span> </span></span></li><li class="alt"><span><span class="comment">// and decay during the final half dit-time</span><span> </span></span></li><li><span>$dt = <span class="number">0</span><span>; </span></span></li><li class="alt"><span><span class="keyword">while</span><span> ($dt < $DitTime) { </span></span></li><li><span> $x = Osc(); </span></li><li class="alt"><span> <span class="keyword">if</span><span> ($dt > (</span><span class="number">0.5</span><span>*$DitTime)) { </span></span></li><li><span> $x = $x*sin((M_PI/<span class="number">2.0</span><span>)*($DitTime-$dt)/(</span><span class="number">0.5</span><span>*$DitTime)); </span></span></li><li class="alt"><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li><span> } </span></li><li class="alt"><span> <span class="keyword">else</span><span> { </span></span></li><li><span> $dahstr .= chr(floor(<span class="number">120</span><span>*$x+</span><span class="number">128</span><span>)); </span></span></li><li class="alt"><span> } </span></li><li><span> $dt += $sampleDT; </span></li><li class="alt"><span> } </span></li></ol>
WAVE格式的文件
WAVE是一种通用的音频格式。从最简单的形式来看,WAVE文件通过在头部包含一个整数序列来表示指定采样率的音频振幅。关于WAVE文件的详细信息请查看这里Audio File Format Specifications website。 对于产生莫斯代码,我们并不需要用到WAVE格式的所有参数选项,仅仅需要一个8位的单声道就可以了,所以,so easy。需要注意的是,多字节数据需要采用低位优先little-endian)的字节顺序。WAVE文件使用一种由叫做“块chunks)”的记 录组成的RIFF格式。
WAVE文件由一个ASCII标识符RIFF开始,紧跟着一个4字节的“块”,然后是一个包含ASCII字符WAVE的头信息,最后是定义格式的数据和声音数据。
在我们的程序中,第一个“块”包含了一个格式说明符,它由ASCII字符fmt和一个4倍字节的“块”。在这里,由于我使用的是普通脉冲编码调制 plain vanilla PCM)格式,所以每个“块”都是16字节。然后,我们还需要这些数据:声道数、声音采样/秒、平均字节/秒、一个区块block)对齐指示器、位 bit)/声音采样。另外,由于我们不需要高质量立体声,我们只采用单声道,我们使用 11050采样/秒标准的CD质量音频的采样率是 44200 采样/秒)的采样率来生成声音,并且用8位bit)保存。
最后,真实的音频数据储存在接下来的“块”中。其中包含ASCII字符data,一个4字节的“块”,最后是由字节序列因为我们采用的是8位(bit)/采样)组成的真实音频数据。
在程序中,由8位音频振幅序列组成的声音保存在变量$soundstr中。一旦音频数据生成完毕,就可以计算出所有的“块”大小,然后就可以把它们 合并在一起写入磁盘文件中。下面的代码展示了如何生成头信息和音频“块”。需要注意的是,$riffstr表示RIFF头,$fmtstr表示“块”格 式,$soundstr表示音频数据“块”。
<ol class="dp-j"><li class="alt"><span><span>$riffstr = </span><span class="string">'RIFF'</span><span>.$NSizeStr.</span><span class="string">'WAVE'</span><span>; </span></span></li><li><span>$x = SAMPLERATE; </span></li><li class="alt"><span>$SampRateStr = <span class="string">''</span><span>; </span></span></li><li><span><span class="keyword">for</span><span> ($i=</span><span class="number">0</span><span>; $i<</span><span class="number">4</span><span>; $i++) { </span></span></li><li class="alt"><span> $SampRateStr .= chr($x % <span class="number">256</span><span>); </span></span></li><li><span> $x = floor($x/<span class="number">256</span><span>); </span></span></li><li class="alt"><span> } </span></li><li><span>$fmtstr = <span class="string">'fmt '</span><span>.chr(</span><span class="number">16</span><span>).chr(</span><span class="number">0</span><span>).chr(</span><span class="number">0</span><span>).chr(</span><span class="number">0</span><span>).chr(</span><span class="number">1</span><span>).chr(</span><span class="number">0</span><span>).chr(</span><span class="number">1</span><span>).chr(</span><span class="number">0</span><span>) </span></span></li><li class="alt"><span> .$SampRateStr.$SampRateStr.chr(<span class="number">1</span><span>).chr(</span><span class="number">0</span><span>).chr(</span><span class="number">8</span><span>).chr(</span><span class="number">0</span><span>); </span></span></li><li><span>$x = $n; </span></li><li class="alt"><span>$NSampStr = <span class="string">''</span><span>; </span></span></li><li><span><span class="keyword">for</span><span> ($i=</span><span class="number">0</span><span>; $i<</span><span class="number">4</span><span>; $i++) { </span></span></li><li class="alt"><span> $NSampStr .= chr($x % <span class="number">256</span><span>); </span></span></li><li><span> $x = floor($x/<span class="number">256</span><span>); </span></span></li><li class="alt"><span> } </span></li><li><span>$soundstr = <span class="string">'data'</span><span>.$NSampStr.$soundstr; </span></span></li></ol>
总结和评论
我们的文本莫斯代码生成器目前看起来还不错。当然,我们还可以对它做很多的修改和完善,比如使用其他字符集、直接从文件中读取文本、生成压缩音频等等。因为我们这个项目的目的是使其能够在网络上方便的使用,所以我们这个简单的方案,已经达到我们的目的了。
当然,一如既往的,希望大家对这些简单粗暴的代码提出建议。这些年来虽然一直有人在教我,但我还是缺乏莫斯代码相关背景知识,所以,如果出现任何的错误或遗漏都算是我的错。
译文链接:http://www.codeceo.com/article/php-morse-code-generation.html
英文原文:Morse Code Generation from Text