Examples
Here are some examples of what you can obtain with ARSS 0.2.3. The quality of the results presented here does not reflect that of future releases of the program, which keeps improving.
The two following examples demonstrate the ARSS's capability to reproduce a sound from its spectrogram. Here, the first sound icon is a link to the original sound re-encoded in MP3, the image in the middle is a link to the full image obtained by analysis of the first sound, re-encoded in PNG and possibly slightly edited for the sake of visibility, and the last icon represents the sound obtained by synthesis of the sole aforementioned image.
Units :
- bpo : Bands per octave. This is the frequency resolution. For
example, 24 bpo means there are 24 pixels vertically for each octave,
which implies that the distance between two pixels is half a semitone.
- pps : Pixels per second. This is the time resolution. For example,
150 pps means there are 150 pixels horizontally per second, which
implies that the distance between two pixels is 1/150th of a second.
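The two units can be turned into numbers with a short sketch. The function names and the `min_freq` parameter (the frequency assigned to the lowest pixel row) are illustrative assumptions and not part of the ARSS itself; the logarithmic row-to-frequency mapping simply follows from the definition of bands per octave.

```python
def semitones_per_pixel(bpo):
    """Vertical distance between two adjacent pixel rows, in semitones.

    An octave holds 12 semitones and `bpo` pixel rows.
    """
    return 12.0 / bpo


def seconds_per_pixel(pps):
    """Horizontal distance between two adjacent pixel columns, in seconds."""
    return 1.0 / pps


def row_frequency(row, bpo, min_freq):
    """Frequency in Hz of pixel row `row`, counted upward from the lowest row.

    `min_freq` is a hypothetical parameter: the frequency of row 0.
    Moving up by `bpo` rows doubles the frequency (one octave).
    """
    return min_freq * 2.0 ** (row / bpo)


if __name__ == "__main__":
    print(semitones_per_pixel(24))        # half a semitone per pixel
    print(seconds_per_pixel(150))         # 1/150th of a second per pixel
    print(row_frequency(24, 24, 55.0))    # one octave above 55 Hz
```

With 24 bpo, for instance, row 24 sits exactly one octave above row 0, whatever the lowest frequency is.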
The following examples show what kinds of sounds one can obtain by creating spectrograms.
Caption | Original spectrogram | Synthesised sound
HAL 9000 hand-drawn in Photoshop
This spectrogram was created in about 15 minutes in Photoshop with the brush tool, by following the lines and imitating the other features of the HAL 9000 spectrogram presented previously. We can understand quite distinctly what the voice says, which is almost surprising considering how quickly and carelessly it was drawn. This leads me to think that one could easily learn how to draw every phoneme, and thus create clear speech from scratch.
Parameters :
Roboty tune made from DNA gel
Few real-world pictures fed to the ARSS come out as interesting sounds, and this photograph of a DNA gel (originally taken from this page) is one of them. Thanks to its short horizontal lines, neatly stacked vertically on a black background, this picture turns into a series of short and distinct notes making up a strangely catchy robotic-sounding melody.
Parameters :
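To see why short horizontal lines on a black background become short, distinct notes, here is a minimal sketch of turning an image into sound by additive sine synthesis. This is not the ARSS's actual synthesis algorithm, only a simplified stand-in; the function name, the `min_freq` and `sample_rate` parameters, and the image representation (a list of rows of brightness values between 0 and 1, top row first) are all assumptions made for illustration.

```python
import math


def synthesise(image, bpo, pps, min_freq, sample_rate=8000):
    """Render a greyscale image as sound with one sine wave per pixel row.

    image: list of rows (top row first), each a list of brightness values
           in the range 0..1. Black (0) is silence.
    Returns a list of float samples.
    """
    height = len(image)
    width = len(image[0])
    n_samples = int(width * sample_rate / pps)
    out = [0.0] * n_samples

    for r, row in enumerate(image):
        band = height - 1 - r                      # bottom row is band 0
        freq = min_freq * 2.0 ** (band / bpo)      # log-spaced frequencies
        if freq >= sample_rate / 2:
            continue                               # skip bands above Nyquist
        w = 2.0 * math.pi * freq / sample_rate
        for n in range(n_samples):
            # Column of the image that is "playing" at sample n.
            col = min(int(n * pps / sample_rate), width - 1)
            out[n] += row[col] * math.sin(w * n)
    return out


if __name__ == "__main__":
    # One bright pixel in the bottom row: a single short note, then silence.
    image = [[0.0, 0.0],
             [1.0, 0.0]]
    samples = synthesise(image, bpo=12, pps=10, min_freq=100.0, sample_rate=1000)
    print(len(samples))
```

Each bright horizontal stroke contributes a tone at its row's frequency for as long as the stroke lasts, and every black pixel contributes nothing, which is why well-separated strokes are heard as separate notes.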
The following effects were obtained simply by resynthesising the original sound's unmodified spectrogram with different synthesis parameters.
Caption | Original sound | Produced spectrogram | Resynthesised sound
Time stretching : slowing down
Scatman John's scat slowed down 5 times
Parameters :
Time stretching : speeding up
President Bush's 2008 State of the Union Address sped up 100 times
Parameters :
Interval stretching
Samuel Barber's Adagio for Strings stretched out by a factor of 2
For example, if you stretched out the notes C3-D3-G3 by a factor of 2, you might obtain the notes C3-E3-D4 or, depending on other settings, you might just as well obtain A5-C#6-B6. The important point is that the interval between two notes is doubled, and in our precise example, we stretch our sound from 4.77 octaves to 9.53 octaves. While I chose here to double intervals for harmonic reasons, you can also choose to reduce them. It usually turns anything into eerie-sounding dissonant music.
Parameters :
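The arithmetic behind these resynthesis effects can be sketched as follows. The sketch assumes that time stretching corresponds to synthesising with a different pps than the one used for analysis, and interval stretching to synthesising with a different bpo; the page only says that "different parameters" were used, so treat this as a plausible reading. All function names are hypothetical, and note names use scientific pitch notation, where the octave number changes at C.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def name_to_midi(name):
    """Convert a note name such as 'C#6' to a MIDI note number (C4 = 60)."""
    pitch, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch) + 12 * (octave + 1)


def midi_to_name(midi):
    """Convert a MIDI note number back to a note name."""
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)


def stretch_intervals(notes, factor, new_root=None):
    """Multiply every interval (in semitones) from the first note by `factor`.

    `new_root` optionally moves the first note elsewhere, standing in for
    the "other settings" that decide where the stretched sound ends up.
    """
    midis = [name_to_midi(n) for n in notes]
    root = midis[0]
    base = root if new_root is None else name_to_midi(new_root)
    return [midi_to_name(base + round((m - root) * factor)) for m in midis]


def synthesis_pps(analysis_pps, speed):
    """pps to synthesise with so playback runs `speed` times faster.

    speed > 1 speeds the sound up, speed < 1 slows it down.
    """
    return analysis_pps * speed


def synthesis_bpo(analysis_bpo, interval_factor):
    """bpo to synthesise with so that intervals grow by `interval_factor`."""
    return analysis_bpo / interval_factor


if __name__ == "__main__":
    print(stretch_intervals(["C3", "D3", "G3"], 2))
    print(stretch_intervals(["C3", "D3", "G3"], 2, new_root="A5"))
    print(synthesis_pps(150, 100), synthesis_bpo(24, 2))
```

Doubling the intervals turns the 2 and 5 semitone steps of C3-D3-G3 into steps of 4 and 10 semitones, wherever the first note lands.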
The following example shows how an image editing program can be used to achieve things previously impossible in sound processing.
The following demonstrates how images can be transmitted over sound nearly losslessly under ideal conditions.
Caption | Original image | Transmitted sound | Transmission result
Basic black and white image transmission
Lena transmitted over MP3

There are a few things to note about this example. Because the final image is produced from the actual MP3, as opposed to a lossless reproduction of the synthesised sound, and because such a sound contains much more information than regular music, the MP3 is encoded at a bitrate 4 times higher than the one usually used for music. With a lower bitrate the image would have been very noisy, and with an even lower bitrate entire chunks of the image would have been blacked out, because this type of sound contains a lot more information than the MP3 format was designed for.

It may also be of interest to note that this method of image transmission is as efficient as the method used for analog black and white television transmission, which means that we could theoretically transmit TV programs using this method within the same bandwidth as analog television, and with the same quality, under ideal conditions.

One of the interesting aspects of this technique is that images transmitted like this can be picked up and viewed by anyone with a spectrograph. Given the arguable universality of mathematics and time-frequency analysis, one may go as far as arguing that it would be a good way of transmitting images to potential extraterrestrial civilisations, as we may expect them to be acquainted with such analysis techniques and to use them at some point when analysing strange signals from outer space.

Back on Earth, you could try the following. Ask someone you know to call you and play this sound over the phone. Record it, analyse it (with the following parameters : 300 Hz to 3400 Hz, height 256 pixels, 10 pps, linear) and if you see anything you recognize, please send it to me (I don't have a telephone and I'd like to know how well it works).

Parameters :
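For the telephone experiment, the suggested analysis parameters (300 Hz to 3400 Hz, height 256 pixels, 10 pps, linear) can be unpacked with a small sketch. It assumes that the lowest and highest pixel rows sit exactly on the two band edges, which may not match the ARSS's own convention, and all function names are hypothetical.

```python
def linear_row_frequencies(min_freq, max_freq, height):
    """Centre frequency in Hz of each pixel row on a linear scale.

    Assumption: row 0 sits at `min_freq` and the last row at `max_freq`.
    """
    step = (max_freq - min_freq) / (height - 1)
    return [min_freq + i * step for i in range(height)]


def pixels_per_second(height, pps):
    """Number of pixels conveyed by the sound every second."""
    return height * pps


def transmission_seconds(width, pps):
    """Time needed to send an image that is `width` pixels wide."""
    return width / pps


if __name__ == "__main__":
    freqs = linear_row_frequencies(300.0, 3400.0, 256)
    print(freqs[0], freqs[1] - freqs[0])   # lowest row and row spacing in Hz
    print(pixels_per_second(256, 10))      # pixel throughput
    print(transmission_seconds(512, 10))   # duration for a 512-pixel-wide image
```

Under these assumptions the rows are spaced roughly 12 Hz apart across the telephone band, and the sound carries 2560 pixels per second.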