Photosounder

4 November 2009

New instrument isolation techniques

During the last few weeks, Photosounder has gained new functions that allow instrument isolation to be done entirely within Photosounder, simply and quickly, without the use of any external program.

In this video we see how to extract a continuous synth line graphically using those new tools. This synthesiser is harmonic, which means its image is made of vertically stacked parallel lines that keep constant vertical distances from one another. It is crucial to be able to identify the position of the base frequency, which is the lowest of those lines. It might not always be easy to see, and sometimes it's entirely absent. To help you find that base frequency, select any brush tool with the harmonics modifier (the button with four vertically stacked dots) and hover over the image: an overlay of the first few harmonics appears, and you can move the mouse cursor until the crosshair overlay matches the lines on the screen. This is fairly straightforward, but things can sometimes be a bit confusing. It is best to try what seems like the lowest possibility first in order to avoid confusing the first harmonic (the second line from the bottom) with the base frequency (the first line). In the case of a chord, however, it is best to try to erase the higher notes first.

In this example there are no chords, and the base frequency is easily seen. The synth line is also quite strong compared to the rest of the instruments, meaning we can safely use the magnet modifier so that the cursor will effortlessly snap to the synth's curves.

Using the smart erase tool (represented by a road roller icon), the harmonics modifier (which reproduces the smart erase tool's action on every harmonic) and the magnet modifier (which snaps to the curves), we can now erase the synth by spraying over it from left to right. The 'Tool intensity' should be set to 100% and the 'Spray width' anywhere between 10 and 20 pixels. You'll also most likely want to hold the H key during the erasure. By default this slows down the mouse cursor 32 times to give you more precision.

This gives you an image and sound practically devoid of the removed instrument. We want the opposite, which we obtain by pressing the Mask Invert button. Make sure the lossless mode is turned on for best results.

At this point the result might not be fully satisfactory, but most of the job is done. Further work can be done to clean up or fix the image, including with external programs such as Adobe Photoshop or GIMP.

Horns, such as in this video, offer a different kind of challenge. Identifying the base frequency can be trickier, one reason being that the base frequency can be pretty low in pitch, which gives it a lower graphical resolution to work with. The lines that make up a horn note are also less smooth and regular. This in turn is an advantage, however, as it makes the result more forgiving of irregularities.

Because of these characteristics, it is recommended to change a few parameters in the file config.txt. In this video the min_bpo setting was changed from 24 to 12 to get a better time resolution in the area of the base frequencies of the horns. The pixels_per_second parameter was lowered from 100 to 50 because overall not so much time resolution is needed here. More importantly, the bands_per_octave parameter, which defines the vertical resolution, was increased from 60 to 180. This is because the harmonics of the horns reach quite high, and as harmonics go up they get closer to each other. With a bands_per_octave setting of 60, all harmonics are merged together after the 30th harmonic or so. Increasing that setting allows them to remain distinct and hence more readily separable. For the same reason, the spray_harmonics setting, which defines how many harmonics the harmonics modifier works on, was increased from a default of 20 to 100.
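The effect of bands_per_octave on the higher harmonics can be checked with a bit of arithmetic. The following is only a rough sketch: it assumes a logarithmic frequency axis with bands_per_octave rows per octave and harmonic n sitting at n times the base frequency, and the function names and the 2-row threshold are made up for illustration, not taken from Photosounder.

```python
import math

def harmonic_spacing(n, bands_per_octave):
    """Vertical gap, in rows, between harmonic n and harmonic n + 1."""
    return bands_per_octave * math.log2((n + 1) / n)

def first_harmonic_closer_than(min_rows, bands_per_octave, max_harmonic=1000):
    """Lowest harmonic n whose gap to harmonic n + 1 is under min_rows."""
    for n in range(1, max_harmonic + 1):
        if harmonic_spacing(n, bands_per_octave) < min_rows:
            return n
    return None

for bpo in (60, 180):
    gap_at_30 = harmonic_spacing(30, bpo)
    # The 2-row threshold is an illustrative choice, not a Photosounder value.
    limit = first_harmonic_closer_than(2.0, bpo)
    print(f"{bpo} bands/octave: gap after harmonic 30 = {gap_at_30:.2f} rows, "
          f"gaps drop below 2 rows from harmonic {limit}")
```

Tripling bands_per_octave triples every gap, which is why harmonics that blend together at 60 stay apart at 180.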

The instruments were removed in the same way as in the previous video, with a few exceptions. Firstly, it was a bit harder to identify the base frequency of each note in this sample, and also harder to identify what belonged to which instrument, so it took a bit of trial and error. Also, the magnet modifier was turned off, for two reasons: because the instrument being removed was less strong the magnet modifier was less effective, and because most notes are straight lines they are just as easy to follow without the magnet modifier. The H key was still held down for precision.

After erasing the desired instruments and doing the Mask Invert, we can notice a couple of issues with the image and the sound. We can see and hear remains of the hi-hats, which were caught up in the higher harmonics, and we also notice that the chords are much louder than the parts with single notes. The first issue can simply be solved using the dark spray tool, without any modifier turned on, and with a decreased Tool intensity. This editing is best done by temporarily turning up the Gamma to see better what's being done.

The second issue comes from the fact that when we remove chords we make as many passes as there are notes. Because there is much overlap in the harmonics, this is equivalent to removing the same thing many times, which results in chords that are louder than they should be. This can be solved using the rectangle tool. Dragging an area on the screen with the right mouse button lightens the area by the ratio defined by Tool intensity, while dragging with the left mouse button darkens it by the same ratio. This way you can make the passages devoid of chords brighter and in turn louder.
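To get a feel for how much two passes over a chord can overlap, here is a small sketch. It assumes a logarithmic frequency axis; the reference frequency, the two note frequencies and the function names are hypothetical, and a real spray is 10 to 20 pixels wide, so the true overlap would be larger than what this counts.

```python
import math

def harmonic_rows(base_hz, harmonics, bands_per_octave, ref_hz=27.5):
    """Image rows touched by the harmonics of one note (ref_hz is an assumed bottom frequency)."""
    return {
        round(bands_per_octave * math.log2(n * base_hz / ref_hz))
        for n in range(1, harmonics + 1)
    }

def shared_rows(note_a_hz, note_b_hz, harmonics=100, bands_per_octave=180):
    """Rows that an erase pass over each of the two notes would both cover."""
    rows_a = harmonic_rows(note_a_hz, harmonics, bands_per_octave)
    rows_b = harmonic_rows(note_b_hz, harmonics, bands_per_octave)
    return rows_a & rows_b

# Two notes a fifth apart (frequency ratio 3:2), chosen only as an example.
overlap = shared_rows(220.0, 330.0)
print(f"{len(overlap)} rows are covered by both passes")
```

Every third harmonic of the lower note lands exactly on every second harmonic of the higher one, so those rows get erased twice.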

As said earlier, this type of instrument is more forgiving than the smooth flat synth line in the first example, and it takes less work to obtain a satisfactory result. Again however, the results can be further improved.

Labels: experiments, filtering, lossless, tutorial

13 September 2009

Basic Vocoding with Photoshop

Photosounder shares a lot of principles with vocoders. Vocoders, as used in music to make robotic voices, work by slicing the input voice (the modulator) and the input tone (the carrier signal) into narrow bands of frequency, detecting the envelopes for the modulator's bands, and modulating the bands of the carrier with these envelopes. This makes the vocoded signal inherit the tone of the carrier signal and a number of characteristics of the modulator, which translates into intelligible speech that sounds like anything but a human voice.

If you're familiar with how Photosounder works, you can probably draw the parallels. If not, this is how it works. Both Photosounder and vocoders cut the input signals into narrow bands of frequency, and Photosounder detects their envelopes to form an image that represents the sound. In lossless mode, Photosounder also keeps the filtered bands in memory so that it can modulate them with any image loaded afterwards. Therefore, vocoding can be done by multiplying the image of the modulator with the image of the carrier signal, then using the result in lossless mode with the original carrier signal as the reference sound, so that the modifications made to the carrier image (that is, the multiplication by the modulator image) are applied to the carrier signal.
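The multiplication itself is a plain pixel-by-pixel product. As a minimal sketch, assuming both images are the same size and hold brightness values between 0.0 and 1.0 (the function name and the tiny sample images are made up for illustration):

```python
def multiply_images(modulator, carrier):
    """Pixel-wise product of two equally sized grayscale images (values 0.0 to 1.0)."""
    if len(modulator) != len(carrier) or any(
        len(m_row) != len(c_row) for m_row, c_row in zip(modulator, carrier)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [m * c for m, c in zip(m_row, c_row)]
        for m_row, c_row in zip(modulator, carrier)
    ]

# Rows are frequency bands, columns are time steps.
modulator = [[0.0, 1.0, 0.5],
             [1.0, 0.0, 0.5]]
carrier = [[0.8, 0.8, 0.8],
           [0.4, 0.4, 0.4]]
vocoded = multiply_images(modulator, carrier)
```

Wherever the modulator is black the carrier is silenced, and wherever it is white the carrier passes through unchanged, which is what the Multiply blending mode of an image editor does.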

However, traditional vocoders use a much lower resolution in the frequency domain. Whereas Photosounder uses 571 frequency bands by default, vocoders typically use between 8 and 32 over the same range of frequencies. This means that whereas in Photosounder you can clearly distinguish each harmonic that makes up human speech, to a vocoder these are all fused together. And that's actually what we want, because we don't want to keep any information about the input voice's vocal cords. We want to replace the vocal cords with the carrier signal and apply to it the same treatment that the raw sound from the vocal cords received, which turned it from a meaningless "aaaaaaaaah" into intelligible speech.

This is solved by applying a vertical motion blur to the modulator's image in Photoshop. In the video I used a vertical motion blur of 20 pixels three times. Also, since frequency resolution here is not important whereas time resolution is, it is advised that you edit config.txt in the Photosounder folder and change the value of min_bpo to 0 instead of 24.
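For those who would rather script this step, the blur can be approximated as follows. This is a sketch under the assumption that a vertical motion blur behaves roughly like a box average along each column; Photoshop's actual filter may differ in its details, and the function names are made up.

```python
def vertical_box_blur(image, length):
    """Average each pixel with its neighbours in the same column over about `length` rows."""
    height = len(image)
    width = len(image[0]) if height else 0
    half = length // 2
    blurred = [[0.0] * width for _ in range(height)]
    for x in range(width):
        for y in range(height):
            lo = max(0, y - half)
            hi = min(height, y + half + 1)
            window = [image[r][x] for r in range(lo, hi)]
            blurred[y][x] = sum(window) / len(window)
    return blurred

def smear_harmonics(image, length=20, passes=3):
    """Apply the vertical blur several times, as done in the video (20 pixels, three times)."""
    for _ in range(passes):
        image = vertical_box_blur(image, length)
    return image

# One column of a voice-like image: a bright harmonic every 8 rows.
column = [[1.0 if y % 8 == 0 else 0.0] for y in range(64)]
smeared = smear_harmonics(column)
```

After the blur the individual harmonics are no longer distinguishable, leaving only the broad envelope that a low-resolution vocoder would see.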

This of course is only basic vocoding. One could just stretch the modulator image around in all directions and in all sorts of ways prior to overlaying it with the carrier image. It will be the subject of my next blog entry, if I can find any example sounds that suit me.

Labels: experiments, lossless, tutorial, vocoding

22 July 2009

Graphical sound denoising challenge results

And the winner is Iain Fergusson with the following entry made with GIMP:

Extract of the original song:

Iain's result:

Iain obtained this excellent result and won a free copy of Photosounder worth €99 by following these steps :

Find selection of sound which is just noise, copy and paste to new layer

Pixelate it with pixel height 1, and width as wide as the layer is

Resize noise selection layer to fit entire image

Set noise selection layer to 'subtract' - adjust curves if you need more subtraction

'Copy visible'

Paste this into a mask on the original image

Create black layer below the original

Adjust curves on original layer mask to push light parts to white, and carefully, push the very darkest parts to black

Save image
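The first four steps amount to measuring the average noise level of each frequency band and subtracting it from the whole image. A minimal sketch of that part only, assuming brightness values between 0.0 and 1.0 (the function names and the sample image are made up, and the mask and curves steps are not covered):

```python
def noise_profile(image, first_col, last_col):
    """Mean brightness of each row over a span of columns that contains only noise."""
    width = last_col - first_col + 1
    return [sum(row[first_col:last_col + 1]) / width for row in image]

def subtract_profile(image, profile):
    """'Subtract' blend: remove the per-row noise level everywhere, clamping at black."""
    return [
        [max(0.0, pixel - level) for pixel in row]
        for row, level in zip(image, profile)
    ]

image = [
    [0.25, 0.25, 1.0, 0.25],   # noisy band with a note in column 2
    [0.5, 0.5, 0.5, 0.75],     # noisier band with a faint note in column 3
]
profile = noise_profile(image, 0, 1)     # columns 0 and 1 hold noise only
cleaned = subtract_profile(image, profile)
```

Pixelating the noise selection to a height of 1 and the full layer width is what produces the per-row average, and resizing it to the whole image repeats that average across every column.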

You can download the full denoised song here.

Labels: challenge, denoising, experiments, filtering, lossless

2 July 2009

Graphical sound denoising challenge

Removing noise from recorded sound has always been a difficult problem, requiring the use of specific electronic circuits or dedicated computer algorithms. With the recent advent of image-based processing of sound it is now possible to tackle this problem from a different angle with such simple and ubiquitous tools as image editing programs. This is the object of this challenge, denoising sound using graphical techniques.

The sound chosen for this challenge is an 1894 recording of Daisy Bell by Edward M. Favor. Dating from the early days of sound recording, it suffers from heavy noise and artifacts. The goal of this challenge is to remove these undesirable features in a graphical way while preserving the vocal and musical elements in order to enhance the sound quality of this recording.

An extract from the recording

This is an example of the original extract being denoised graphically. It was done in Photoshop in a few minutes using some very simple operations.

Prizes :
The prizes are two full licenses of Photosounder each worth €99, one for each of the following categories of entries :

The image-editor category : For entries done entirely with an image-editor and reproducible by any user of such a program.

The algorithmic category : For any other entry, but more particularly for entries involving the use of custom-written image-processing code or any process beyond the usage of publicly available user-level tools.

Deadline:
All entries must be sent by e-mail to challenge@photosounder.com before July 16th at noon GMT. The entries will then be reviewed by a panel of listeners and the results will be announced a few days later.

What you'll need :

The demo version of Photosounder for Windows/Mac OS X. The demo version doesn't allow you to save the sound but it allows you to save the image which is all you'll need here.

The original recording which you can download here. A short extract and its image have been included for the sake of convenience during experimentation.

An image editor such as Adobe Photoshop or GIMP, unless of course you wish to write your own algorithm.

Rules :

A valid entry must consist of the resulting image of dimensions 15,379 x 571, preferably in PNG format, as well as an account of how the image was obtained, detailed enough to allow the process to be reproduced, and must be sent by e-mail to challenge@photosounder.com before the deadline.

Your denoising method must be practical to use on long sounds and be reproducible in a few minutes of work on a sound of several minutes. Therefore this excludes the recourse to such techniques as paintbrushing parts of the image out.

Tips :

To hear your results the way they should be heard make sure to use Photosounder's lossless mode. To do so first load the original sound in Photosounder then load the modified image corresponding to that sound and activate the lossless mode.

If in your sound you hear artifacts similar to bubbling, it may be that in your modified image some pixels are much brighter than they originally were. To make sure it doesn't happen you can overlay your modified image with a copy of the original image and set the blending mode of the original image to 'Darken' so that it prevents any single pixel from being brighter than it originally was.
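For entries in the algorithmic category, the same safeguard is a pixel-wise minimum. A minimal sketch, with a made-up function name and sample values on a 0 to 255 scale:

```python
def darken(modified, original):
    """'Darken' blend: keep the darker of the two pixels at every position."""
    return [
        [min(m, o) for m, o in zip(m_row, o_row)]
        for m_row, o_row in zip(modified, original)
    ]

modified = [[10, 200], [50, 50]]
original = [[20, 100], [50, 60]]
safe = darken(modified, original)
```

Only the pixel that had become brighter than the original (200 against 100) is changed.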

Labels: challenge, denoising, experiments, filtering, lossless

15 May 2009

Motion blur sound reverberation

Out of the many possible approaches to processing sounds using Photosounder, there is one that had yet to be tested: additive effects. These consist of processing a sound and then mixing the result with the original sound. The next series of examples will demonstrate how to graphically create such an effect to achieve something somewhat similar to sound reverberation.

One way to do that is to apply a horizontal blur to a sound's image, then shift it to the right so that the blurred sound doesn't play ahead of the notes in the original sound. It is typically done effortlessly in a few minutes. Here is how I operated for the following examples :
-Open a sound in Photosounder
-Save the image
-Open the image in your favourite image editor (Photoshop, GIMP, etc.)
-Duplicate the image's first layer (which we'll keep as a reference)
-Apply a horizontal blur to the new layer (Filter > Blur > Motion Blur... in Photoshop) of about 30 pixels
-Apply the same blur again 2 or 3 times
-Set the layer's blending mode to Lighten so that you can see through it, and move it to the right so that notes in our blurry layer don't start before they hit on the original first layer
-Duplicate that layer
-Blur the copy some more
-Shift it to the right some more so that it doesn't start before the original notes hit
-Optionally adjust the luminosity of that layer so that it can be more intense
-Hide the original layer so that we only see the two blurry layers merged together
-Save the image and open it in Photosounder
-Save the sound it produces to a file
-Open the new "blurry" sound and the original sound in an audio editing program and mix the two sounds together (no timing offset is required)

One of the interesting aspects of these steps is that they involve doing exactly the same thing for every sound, meaning that one could just record an action in Photoshop and reproduce the whole process at the press of a key. Of course this is just one way to do it. The goal here is to make the shorter blur start right after the notes originally hit, and the longer blur hold the notes so they can last much longer and slowly fade.
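Since the process is the same every time, it can also be sketched in code. The following is only an approximation of the steps above: it assumes the motion blur behaves like a horizontal box average, it replaces the by-eye shift with a computed one (the number of passes times half the blur length, which is exactly how far this box blur spreads to the left), and the number of passes for the longer blur is an arbitrary choice. All function names are made up.

```python
def horizontal_box_blur(image, length):
    """Average each pixel with its neighbours in the same row over about `length` columns."""
    half = length // 2
    blurred = []
    for row in image:
        width = len(row)
        new_row = []
        for x in range(width):
            window = row[max(0, x - half):min(width, x + half + 1)]
            new_row.append(sum(window) / len(window))
        blurred.append(new_row)
    return blurred

def shift_right(image, pixels):
    """Move the image to the right, filling the left with black and cropping the right."""
    return [([0.0] * pixels + row)[:len(row)] for row in image]

def lighten(layer_a, layer_b):
    """'Lighten' blend: keep the brighter of the two pixels at every position."""
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(layer_a, layer_b)]

def blurred_layer(image, length, passes):
    """Blur repeatedly, then shift so nothing starts before the original note."""
    layer = image
    for _ in range(passes):
        layer = horizontal_box_blur(layer, length)
    return shift_right(layer, passes * (length // 2))

def reverb_image(image, length=30, short_passes=3, long_passes=6):
    """Merge a short and a long blurred layer; long_passes=6 is an assumed value."""
    return lighten(blurred_layer(image, length, short_passes),
                   blurred_layer(image, length, long_passes))

# One frequency band, 400 columns wide, with a single note onset at column 100.
image = [[0.0] * 400]
image[0][100] = 1.0
wet = reverb_image(image)
```

The resulting image contains only the blurred layers, so the sound it produces still has to be mixed with the original sound in an audio editor.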

A few examples

Stefon Harris' Until in its original form followed by its "blurred" form.

Same thing with this Rhodes piano piece.

Since the part that is added to the original sound is entirely contained in an image, we have the freedom to get creative and do practically anything we want to do with it. Here as an example the image was shifted up by 60 pixels so that the reverberated sound is one octave higher, thus achieving a different effect.
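The relation between a vertical shift and the resulting change in pitch is easy to work out. This sketch assumes a logarithmic frequency axis and the bands_per_octave value of 60 implied by the statement that 60 pixels make one octave; the function names are made up.

```python
def pitch_ratio(rows_shifted, bands_per_octave=60):
    """Frequency ratio produced by moving an image up by a number of rows."""
    return 2.0 ** (rows_shifted / bands_per_octave)

def semitones(rows_shifted, bands_per_octave=60):
    """The same shift expressed in equal-tempered semitones."""
    return 12.0 * rows_shifted / bands_per_octave

print(pitch_ratio(60), semitones(60))   # shift used in the example
print(pitch_ratio(35), semitones(35))   # an arbitrary smaller shift
```

Under these assumptions a shift of 5 rows corresponds to one semitone.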

This of course works not just on musical instruments but on all types of sounds, including speech. Here as an added twist, the blurry image was synthesised twice and the two sound files were put together as one stereo sound, giving the resulting sound great stereophonic qualities.

Labels: additive effect, effect, experiments

12 May 2009

Time pixelation on sound

For this series of examples I chose to experiment with what I call time pixelation of sound. It consists of taking the image of a sound, pixelating it horizontally only, then turning it back into a sound.

It is simply achieved by doing the following :
-Open a sound in Photosounder
-Save the image
-Open the image in your favourite image editor (Photoshop, GIMP, MS Paint, etc.)
-Squeeze your image horizontally to a given width, make sure to choose the Nearest Neighbour method
-Stretch your image horizontally back to its original width, make sure to use the Nearest Neighbour method again so the result looks "blocky"
-Open the modified image in Photosounder

(All of the above can be reproduced using the Photosounder Demo)

This has the effect of taking a short bit of the initial sound at regular intervals, and stretching it in time. For varying results, experiment with the width you squeeze the image to, but also with the horizontal offset in the original image. Moving the image a few pixels to the left or right makes the resizing method pick different columns of pixels.
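The squeeze-and-stretch can be sketched as follows. Image editors differ in exactly which column nearest-neighbour resizing picks; this sketch assumes the column at the centre of each block, and the function names and the offset parameter are made up for illustration.

```python
def nearest_columns(old_width, new_width):
    """Source column picked for each destination column (centre of each block)."""
    return [
        min(old_width - 1, int((x + 0.5) * old_width / new_width))
        for x in range(new_width)
    ]

def resize_width(image, new_width, offset=0):
    """Nearest-neighbour horizontal resize; offset nudges which columns get picked."""
    old_width = len(image[0])
    cols = [
        min(old_width - 1, max(0, c + offset))
        for c in nearest_columns(old_width, new_width)
    ]
    return [[row[c] for c in cols] for row in image]

def time_pixelate(image, blocks, offset=0):
    """Squeeze the image to `blocks` columns, then stretch it back to its width."""
    width = len(image[0])
    squeezed = resize_width(image, blocks, offset)
    return resize_width(squeezed, width)

image = [[0, 1, 2, 3, 4, 5, 6, 7]]
print(time_pixelate(image, 2))
print(time_pixelate(image, 2, offset=1))
```

Each block of the output repeats a single column of the original, and changing the offset changes which column that is.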

A few examples

This is 2001: A Space Odyssey's HAL 9000 speaking with only 10 pixels a second.

Once pixelated to about 3 pixels a second and looped back and forth, speech can turn out to adopt some almost catchy musical qualities.

A strings sample (above) reduced to 16 pixels (below) :

The arpeggio at the beginning of The Animals' House of the Rising Sun, first reduced to 6 pixels, then 12 pixels, 24, 48, 96 and finally the original full 1200 pixel image. It is interesting to note how this process can "de-arpeggiate" an arpeggio by selecting only notes at regular intervals.

Labels: effect, experiments, time

13 April 2009

Tutorial - Instrument Isolation (Funky Worm)

A few weeks ago I posted a video/blog entry showing the results of instrument isolation in Photoshop. Here is a tutorial showing how you can reproduce it.

I. Turning a sound into image
- Cut the piece of sound you want to work on in your favourite audio editor and save it to a file
- Open that file with Photosounder
- Once the image is done loading up on the screen, press the Save button and select "Image file" in the drop-down menu

II. Synth removal
- Open that image file in Photoshop (or similar image editor like GIMP)
- Invert the colours for the sake of visibility (Ctrl+I)
- Select the Clone tool, and set it to a size of 4 pixels, and a hardness about 70%
- Make sure the Aligned box is ticked, hold Alt, click somewhere on the image, release Alt, and click again a dozen pixels right above the point you previously clicked. Also, make sure the Mode is set to Lighten.
- Erase the lowest line that belongs to the synth this way
- Then proceed to erase all the lines above this way. At some point you might choose to have your source above where you spray instead of under, or just make the source closer to where you spray.
- When you're done removing all the synth lines, Invert (Ctrl+I) then Save (Ctrl+S)

III. Listening to the results
- Make sure Photosounder is loaded with the original sound
- Load the image you just edited and saved
- Press the Lossless mode button so that it's ON
- Press Play once the blue progress bar above the image is entirely dark blue

IV. Isolation
- Copy the synth-less image and paste it on a new layer on top of the original image
- Invert both layers (so that their backgrounds are both white)
- Set the image to 16-bit mode
- In Levels, set the Gamma (the central value) of both layers to 2.00
- Set the blending mode to Difference
- Flatten the image
- Invert it
- In Levels set the Gamma to 0.5
- Turn back to 8-bit mode
- Invert again and save the image file
- Reload the image file in Photosounder (press R) and listen
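The arithmetic behind section IV can be followed one pixel at a time. This is a literal transcription of the steps, not a description of Photoshop's internals: it assumes brightness values between 0.0 and 1.0 and that a Levels gamma of g maps a value v to v raised to the power 1/g. Floating-point numbers stand in for the 16-bit mode, and the function names are made up.

```python
def invert(value):
    return 1.0 - value

def levels_gamma(value, gamma):
    """Assumed behaviour of the Levels gamma slider: v -> v ** (1 / gamma)."""
    return value ** (1.0 / gamma)

def isolate_pixel(original, edited):
    """Follow the steps of section IV for one pixel of each layer."""
    a = levels_gamma(invert(original), 2.0)   # invert, then gamma 2.00
    b = levels_gamma(invert(edited), 2.0)
    difference = abs(a - b)                   # Difference blending, then flatten
    result = levels_gamma(invert(difference), 0.5)   # invert, then gamma 0.5
    return invert(result)                     # invert again before saving

def isolate(original_image, edited_image):
    return [
        [isolate_pixel(o, e) for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(original_image, edited_image)
    ]

original = [[0.6, 0.8]]
edited = [[0.6, 0.0]]     # second pixel was erased in section II
print(isolate(original, edited))
```

Under these assumptions, pixels that were left untouched come out black, and pixels that were erased to black come back at their original brightness.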

V. Clean up
- Invert
- Clean the noise around the lines with a tiny (about 4 px) white brush
- Fix the holes in straight lines with the clone tool
- Select the highest line which you could fully fix, then copy it and move it upwards in the place of the incomplete lines
- Change the intensity of each copy of the model line with Levels so they fit the intensity of the underlying line
- Flatten
- Invert
- Reload the image in Photosounder
- Save the audio file by pressing Save

If there's any aspect of this that requires clarification do ask about it in the comments.

Labels: experiments, filtering, lossless, tutorial

22 March 2009

Bongo drum, handmade

While attempting to create a whole drum kit by hand using Photoshop a few months ago I created this bongo drum, whose sound I find quite satisfactory. It was accomplished by loading a few bongo drum samples in Photosounder to see what a bongo drum typically looks like, and then recreating what I saw in Photoshop.

A handmade bongo drum

There are 3 main features, as shown on the picture to the right : a few carefully spaced horizontal lines that give the bongo drum its characteristic metallic ringing sort of sound, two short bright blobs placed at the beginning of the sound under the two lowest lines which give the sound its initial punch, and finally a darker snare-like haze which gives the initial slappy-almost-clicky start and evolves in vertical span and intensity to sound like the ringing from kicking into a barrel. Of course on top of it all we find a black rectangle which overlays the first half of what we drew to give our sound its sharp attack. The 3 separate features as shown on the image sound like this (played at 205 pixels/second) :

Full original multi-layered Photoshop file

Labels: experiments, image synthesis, instrument creation

21 March 2009

Instrument Isolation (Funky Worm)

This is how I isolated the main instrument from Ohio Players' Funky Worm using Photosounder and Photoshop, as shown in this video. I first loaded the original sound's image into Photoshop and using the clone tool I erased the lines belonging to the instrument I wanted to isolate. That new image, once loaded in Photosounder in lossless mode using the original sound, gave me this drums and vocals-only version :

Back in Photoshop, I pasted the new image on top of the original one, switched to 16-bit mode for precision, and corrected the gamma of both so that they match 1:1. Beware, however: Photoshop's Levels makes dark pixels darker than they should be when you increase the gamma, which has disastrous effects on pictures as dark as what we have here. This is why it's best to invert the pictures so that their background turns white before doing such corrections. Once they are inverted, apply a gamma of 2.0 in Levels to both layers, choose the Difference blending mode, flatten the image, invert again, and apply a gamma of 0.5.

We now only have the bits we previously erased, and we can see what has to be done. With the example I chose I had entire missing areas corresponding to where the snare drums used to be, burying the overtones of the instrument of interest in noise and making them disappear. The fact that I used an MP3 as a basis only made matters worse. I also had things that didn't belong, mainly pieces of voice I mistook for parts of my instrument. The rest of the work consisted of cleaning and fixing the image, using my best Photoshopping skills.

Note that when you're done, you might want to double-pass the processing to obtain a result more faithful to the actual image you obtained. To do that, normally load the image in lossless mode in Photosounder with the original sound as a basis, save the resulting sound, then open the very same sound file again, reopen the image, and save the sound.

Isolated main instrument

Edit : In this file you will find the original sample used as well as the two images needed to recreate the results shown above, along with detailed instructions on how to do that using the Photosounder Demo or the full version of Photosounder.

Also you can find a tutorial on how to reproduce this here

Labels: experiments, filtering, lossless