Photosounder

4 November 2009

New instrument isolation techniques

Over the last few weeks, Photosounder gained new functions that allow instrument isolation to be done entirely within Photosounder, simply and quickly, without any external program.

In this video we see how to extract a continuous synth line graphically using those new tools. This synthesiser is harmonic, which means it is composed of vertically stacked parallel lines, separated by vertical distances that stay the same whatever the pitch. It is crucial to be able to identify the position of the base frequency, which is the lowest of those lines. It might not always be easily seen, and sometimes it's entirely absent. To help you find that base frequency, select any brush tool, turn on the harmonics modifier (the button with four vertically stacked dots) and hover over the image: an overlay of the first few harmonics appears, and you can move the mouse cursor to match the crosshair overlay with the lines on the screen. This is fairly straightforward, although things can sometimes get a bit confusing. It is best to try what seems like the lowest possibility first, in order to avoid confusing the first harmonic (the second line from the bottom) with the base frequency (the first line). In the case of a chord, however, it is best to erase the higher notes first.

In this example there are no chords, and the base frequency is easily seen. The synth line is also quite strong compared to the rest of the instruments, meaning we can safely use the magnet modifier so that the cursor will effortlessly snap to the synth's curves.

Using the smart erase tool (represented by a road roller icon), the harmonics modifier that reproduces the smart erase tool's action on every harmonic, and the magnet modifier to snap to the curves, we can now erase the synth by spraying over it from left to right. The 'Tool intensity' should be set to 100% and the 'Spray width' anywhere between 10 and 20 pixels. You'll also most likely want to hold the H key during the erasure. By default this slows the mouse cursor down 32 times, so as to give you more precision.

This gives you an image and sound practically devoid of the removed instrument. We want the opposite, which we obtain by pressing the Mask Invert button. Make sure the lossless mode is turned on for best results.

At this point the result might not be fully satisfactory, but most of the job is done. Further work can be done to clean up and fix the image, including with external programs such as Adobe Photoshop or GIMP.

Horns, such as in this video, offer a different kind of challenge. Identifying the base frequency can be trickier, one reason being that the base frequency can be pretty low in pitch, which gives it a lower graphical resolution to work with. The lines that make up a horn note are also less smooth and regular. This is in turn an advantage, as it makes the result more forgiving of irregularities.

Because of these characteristics, it is recommended to change a few parameters in the file config.txt. In this video the min_bpo setting was changed from 24 to 12 to have a better time resolution in the area of the base frequencies of the horns. The pixels_per_second parameter was lowered from 100 to 50 because overall not so much time resolution is needed here. More importantly, the bands_per_octave parameter, which defines the vertical resolution, was increased from 60 to 180. This is because the harmonics of the horns reach quite high, and as harmonics go up they get closer to each other. With a bands_per_octave setting of 60, after the 30th harmonic or so all harmonics are merged together. Increasing that setting allows them to remain separated and hence more readily separable. For the same reason, the spray_harmonics setting, which defines how many harmonics the harmonics modifier works on, was increased from a default of 20 to 100.
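To see why raising bands_per_octave keeps the high harmonics apart, here is a rough sketch of the arithmetic. It assumes the image has a logarithmic frequency axis with bands_per_octave pixel rows per octave, and it counts partial 1 as the base frequency (so what this post calls the first harmonic is partial 2). The function name harmonic_gap_px is made up for this illustration and is not part of Photosounder.

```python
import math

def harmonic_gap_px(partial, bands_per_octave):
    """Vertical distance in pixels between one partial and the next one up.

    Assumes a logarithmic frequency axis with `bands_per_octave` pixel rows
    per octave. Partial 1 is the base frequency, partial 2 is twice that
    frequency, and so on.
    """
    return bands_per_octave * math.log2((partial + 1) / partial)

# Compare the two bands_per_octave values mentioned in the post.
for bpo in (60, 180):
    gaps = ", ".join(
        f"{p}: {harmonic_gap_px(p, bpo):.1f} px" for p in (1, 2, 10, 30, 100)
    )
    print(f"bands_per_octave={bpo}, gap above partial {gaps}")
```

Under these assumptions the gap above the 30th partial is a little under 3 pixels at 60 bands per octave, which is about as thick as the lines themselves, and roughly 8.5 pixels at 180.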

The instruments were removed in the same way as in the previous video, with a few exceptions. Firstly, it was a bit harder to identify the base frequency of each note in this sample, and it was also harder to identify what belonged to which instrument, so it took a bit of trial and error. Also, the magnet modifier was turned off, for two reasons: because the instrument being removed was less strong the magnet modifier was less effective, and because most notes are straight lines they are just as easy to follow without the magnet modifier. The H key was still held down for precision.

After erasing the desired instruments and doing the Mask Invert, we can notice a couple of issues with the image and the sound. We can see and hear remains of the hi-hats which were caught up in the higher harmonics, and we also notice that the chords are much louder than the parts with single notes. The first issue can simply be solved using the dark spray tool, without any modifier turned on, and with a decreased Tool intensity. This editing is best done by temporarily turning up the Gamma so as to see better what's being done.

The second issue comes from the fact that when we remove chords we make as many passes as there are notes, and because there is much overlap in the harmonics it is equivalent to removing the same thing many times, which results in chords that are louder than they should be. This can be solved using the rectangle tool. Dragging an area on the screen with the right mouse button lightens that area by the ratio defined by Tool intensity, while dragging with the left mouse button darkens it by the same ratio. This way you can make the passages devoid of chords brighter and in turn louder.

As said earlier, this type of instrument is more forgiving than the smooth flat synth line in the first example, and it takes less work to obtain a satisfactory result. Again however, the results can be further improved.

Labels: experiments, filtering, lossless, tutorial

13 September 2009

Basic Vocoding with Photoshop

Photosounder shares a lot of principles with vocoders. Vocoders, as used in music to make robotic voices, work by slicing the input voice (the modulator) and the input tone (the carrier signal) into narrow bands of frequency, detecting the envelopes for the modulator's bands, and modulating the bands of the carrier with these envelopes. This makes the vocoded signal inherit the tone of the carrier signal and a number of characteristics of the modulator, which translates into intelligible speech that sounds like anything but a human voice.

If you're familiar with how Photosounder works, you can probably draw the parallels. If not, this is how it works. Both Photosounder and vocoders cut the input signals into narrow bands of frequency, and Photosounder detects their envelope to form an image that represents the sound. In lossless mode, Photosounder also keeps the filtered bands in memory so that it can modulate them with whatever image is loaded afterwards. Therefore, vocoding can be done by multiplying the image of the modulator with the image of the carrier signal, and using the result in lossless mode with the original carrier signal as the reference sound, so that the modifications done to the carrier image (that is, the multiplication by the modulator image) are applied to the carrier signal.
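As a minimal sketch of that multiplication, assuming both images are greyscale, have the same dimensions and hold brightness values between 0.0 (black) and 1.0 (white), much like a Multiply blending mode in an image editor. The function name and the tiny example images are made up for illustration.

```python
def multiply_images(carrier, modulator):
    """Pixel-wise product of two same-sized greyscale images.

    Each image is a list of rows, each row a list of brightness values
    between 0.0 (black) and 1.0 (white).
    """
    same_height = len(carrier) == len(modulator)
    if not same_height or any(len(c) != len(m) for c, m in zip(carrier, modulator)):
        raise ValueError("images must have the same dimensions")
    return [
        [c * m for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(carrier, modulator)
    ]

# Tiny made-up example: 2 frequency bands (rows) by 3 time steps (columns).
carrier = [[1.0, 1.0, 1.0],
           [0.5, 0.5, 0.5]]
modulator = [[0.0, 0.5, 1.0],
             [1.0, 0.5, 0.0]]
vocoded = multiply_images(carrier, modulator)
```

Wherever the modulator is black the carrier is silenced, and wherever it is white the carrier passes through unchanged.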

However, traditional vocoders use a much lower resolution in the frequency domain. Whereas Photosounder uses 571 frequency bands by default, vocoders typically use between 8 and 32 over the same range of frequencies. This means that whereas in Photosounder you can clearly distinguish each harmonic that makes up human speech, to a vocoder these are all fused together. And that's actually what we want, because we don't want to keep any information about the input voice's vocal cords. We want to replace the vocal cords with the carrier signal and apply to it the same treatment that the raw sound from the vocal cords received, which turned it from a meaningless "aaaaaaaaah" into intelligible speech.

This is solved by applying a vertical motion blur to the modulator's image in Photoshop. In the video I used a vertical motion blur of 20 pixels three times. Also, since frequency resolution here is not important whereas time resolution is, it is advised that you edit config.txt in the Photosounder folder and change the value of min_bpo to 0 instead of 24.
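The effect of that vertical blur can be approximated with a simple moving average along each column. The sketch below is only a stand-in for Photoshop's motion blur filter, whose exact kernel and edge handling may differ. It assumes a greyscale image stored as a list of rows with values between 0.0 and 1.0, and both function names are made up for illustration.

```python
def vertical_box_blur(image, length):
    """Average each pixel with the rows within `length // 2` above and below it.

    Near the top and bottom edges only the rows that exist are averaged.
    """
    height = len(image)
    width = len(image[0])
    half = length // 2
    blurred = []
    for y in range(height):
        rows = image[max(0, y - half):min(height, y + half + 1)]
        blurred.append([sum(row[x] for row in rows) / len(rows) for x in range(width)])
    return blurred

def smear_modulator(image, length=20, passes=3):
    """Apply the vertical blur several times, as done in the video."""
    for _ in range(passes):
        image = vertical_box_blur(image, length)
    return image

# A single bright pixel (one sharp harmonic) in a column 41 rows tall.
column = [[0.0] for _ in range(41)]
column[20][0] = 1.0
smeared = smear_modulator(column)
```

After three passes the single sharp line is spread over dozens of rows, which is the fusing of individual harmonics into a broad envelope that the vocoding needs.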

This of course is only basic vocoding. One could just stretch the modulator image around in all directions and in all sorts of ways prior to overlaying it with the carrier image. It will be the subject of my next blog entry, if I can find any example sounds that suit me.

Labels: experiments, lossless, tutorial, vocoding

22 July 2009

Graphical sound denoising challenge results

And the winner is Iain Fergusson with the following entry made with GIMP:

Extract of the original song:

Iain's result:

Iain obtained this excellent result and won a free copy of Photosounder worth €99 by following these steps :

Find selection of sound which is just noise, copy and paste to new layer

Pixelate it with pixel height 1, and width as wide as the layer is

Resize noise selection layer to fit entire image

Set noise selection layer to 'subtract' - adjust curves if you need more subtraction

'Copy visible'

Paste this into a mask on the original image

Create black layer below the original

Adjust curves on original layer mask to push light parts to white, and carefully, push the very darkest parts to black

Save image
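For readers who would rather script it, Iain's steps can be sketched roughly as follows. This is an interpretation, not his actual GIMP procedure: the image is assumed to be greyscale with values between 0.0 and 1.0, rows being frequency bands and columns time, and the curves adjustment is replaced by a simple linear stretch between two hypothetical thresholds, black_point and white_point.

```python
def denoise(image, noise_start, noise_end, black_point=0.02, white_point=0.5):
    """Subtract a per-row noise profile and use the remainder as a mask.

    `noise_start` and `noise_end` delimit columns that contain only noise.
    `black_point` and `white_point` are hypothetical stand-ins for the
    curves adjustment made on the mask.
    """
    # Pixelate with height 1 and full width: one average noise level per row,
    # which is then stretched across the entire image.
    count = noise_end - noise_start
    profile = [sum(row[noise_start:noise_end]) / count for row in image]

    result = []
    for row, noise in zip(image, profile):
        out_row = []
        for value in row:
            # 'Subtract' layer mode, clamped at black.
            remaining = max(0.0, value - noise)
            # Curves on the mask: push light parts to white, darkest to black.
            mask = (remaining - black_point) / (white_point - black_point)
            mask = min(1.0, max(0.0, mask))
            # Masked original over a black layer.
            out_row.append(value * mask)
        result.append(out_row)
    return result

# Made-up example: a constant noise floor with one loud pixel in the top row.
noisy = [[0.2, 0.2, 0.9, 0.2],
         [0.1, 0.1, 0.1, 0.1]]
cleaned = denoise(noisy, noise_start=0, noise_end=2)
```

In this toy example the noise floor drops to black while the loud pixel keeps its original brightness, because the mask only decides how much of the original image shows through.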

You can download the full denoised song here.

Labels: challenge, denoising, experiments, filtering, lossless

2 July 2009

Graphical sound denoising challenge

Removing noise from recorded sound has always been a difficult problem, requiring the use of specific electronic circuits or dedicated computer algorithms. With the recent advent of image-based processing of sound it is now possible to tackle this problem from a different angle with such simple and ubiquitous tools as image editing programs. This is the object of this challenge, denoising sound using graphical techniques.

The sound chosen for this challenge is an 1894 recording of Daisy Bell by Edward M. Favor. Dating from the early days of sound recording, it suffers from heavy noise and artifacts. The goal of this challenge is to remove these undesirable features in a graphical way while preserving the vocal and musical elements in order to enhance the sound quality of this recording.

An extract from the recording

This is an example of the original extract being denoised graphically. It was done in Photoshop in a few minutes using some very simple operations.

Prizes :
The prizes are two full licenses of Photosounder, each worth €99, one for each of the following categories of entries :

The image-editor category : For entries done entirely with an image-editor and reproducible by any user of such a program.

The algorithmic category : For any other entry, but more particularly for entries involving the use of custom-written image-processing code or any process beyond the usage of publicly available user-level tools.

Deadline:
All entries must be sent by e-mail to challenge@photosounder.com before July 16th at noon GMT. The entries will then be reviewed by a panel of listeners and the results will be announced a few days later.

What you'll need :

The demo version of Photosounder for Windows/Mac OS X. The demo version doesn't allow you to save the sound but it allows you to save the image which is all you'll need here.

The original recording which you can download here. A short extract and its image have been included for the sake of convenience during experimentation.

An image editor such as Adobe Photoshop or GIMP, unless of course you wish to write your own algorithm.

Rules :

A valid entry must consist of the resulting image of dimensions 15,379 x 571, preferably in PNG format, as well as an account of how the image was obtained, detailed enough to allow the process to be reproduced, and must be sent by e-mail to challenge@photosounder.com before the deadline.

Your denoising method must be practical to use on long sounds and be reproducible in a few minutes of work on a sound of several minutes. This excludes techniques such as painting parts of the image out by hand.

Tips :

To hear your results the way they should be heard make sure to use Photosounder's lossless mode. To do so first load the original sound in Photosounder then load the modified image corresponding to that sound and activate the lossless mode.

If you hear artifacts similar to bubbling in your sound, it may be that some pixels in your modified image are much brighter than they originally were. To make sure this doesn't happen, you can overlay your modified image with a copy of the original image and set the blending mode of the original image to 'Darken', which prevents any single pixel from being brighter than it originally was.
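In code terms, the 'Darken' overlay amounts to keeping the darker of the two pixels at every position. A minimal sketch, assuming both images are greyscale, have the same dimensions and hold values between 0.0 and 1.0 (the function name and example values are made up for illustration):

```python
def darken_blend(modified, original):
    """Keep, for every pixel, the darker of the modified and original values."""
    return [
        [min(m, o) for m, o in zip(m_row, o_row)]
        for m_row, o_row in zip(modified, original)
    ]

# Made-up example: the second pixel was accidentally brightened while editing.
original = [[0.2, 0.3, 0.8]]
modified = [[0.0, 0.9, 0.5]]
safe = darken_blend(modified, original)
```

The darkened pixels are kept as edited, while the pixel that had become brighter falls back to its original value.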

Labels: challenge, denoising, experiments, filtering, lossless

13 April 2009

Tutorial - Instrument Isolation (Funky Worm)

A few weeks ago I posted a video/blog entry showing the results of instrument isolation in Photoshop. Here is a tutorial showing how you can reproduce it.

I. Turning a sound into image
- Cut the piece of sound you want to work on in your favourite audio editor and save it to a file
- Open that file with Photosounder
- Once the image is done loading up on the screen, press the Save button and select "Image file" in the drop-down menu

II. Synth removal
- Open that image file in Photoshop (or similar image editor like GIMP)
- Invert the colours for the sake of visibility (Ctrl+I)
- Select the Clone tool, and set it to a size of 4 pixels, and a hardness about 70%
- Make sure the Aligned box is ticked, hold Alt, click somewhere on the image, release Alt, and click again a dozen pixels right above the point you previously clicked. Also, make sure the Mode is set to Lighten.
- Erase the lowest line that belongs to the synth this way
- Then proceed to erase all the lines above this way. At some point you might choose to have your source above where you spray instead of under, or just make the source closer to where you spray.
- When you're done removing all the synth lines, Invert (Ctrl+I) then Save (Ctrl+S)

III. Listening to the results
- Make sure Photosounder is loaded with the original sound
- Load the image you just edited and saved
- Press the Lossless mode button so that it's ON
- Press Play once the blue progress bar above the image is entirely dark blue

IV. Isolation
- Copy the synth-less image and paste it on a new layer on top of the original image
- Invert both layers (so that their backgrounds are both white)
- Set the image to 16-bit mode
- In Levels, set the Gamma (the central value) of both layers to 2.00
- Set the blending mode to Difference
- Flatten the image
- Invert it
- In Levels set the Gamma to 0.5
- Turn back to 8-bit mode
- Invert again and save the image file
- Reload the image file in Photosounder (press R) and listen
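The isolation steps above can be followed literally in code, one pixel at a time. This is only a sketch: it assumes brightness values between 0.0 and 1.0 and assumes that a Levels gamma of g maps a value v to v ** (1 / g). Photoshop's exact arithmetic, especially its rounding in 8-bit and 16-bit modes, may differ. The function names are made up for illustration.

```python
def levels_gamma(value, gamma):
    """Assumed behaviour of the Levels gamma slider: v becomes v ** (1 / gamma)."""
    return value ** (1.0 / gamma)

def isolate_pixel(original, synthless):
    """Follow the isolation steps for one pixel of each layer."""
    a = levels_gamma(1.0 - original, 2.0)    # invert, then gamma 2.00
    b = levels_gamma(1.0 - synthless, 2.0)   # same for the synth-less layer
    difference = abs(a - b)                  # Difference blending, flattened
    inverted = 1.0 - difference              # invert
    corrected = levels_gamma(inverted, 0.5)  # gamma 0.5
    return 1.0 - corrected                   # invert again before saving

def isolate(original_image, synthless_image):
    """Apply isolate_pixel to every pair of pixels of two same-sized images."""
    return [
        [isolate_pixel(o, s) for o, s in zip(o_row, s_row)]
        for o_row, s_row in zip(original_image, synthless_image)
    ]

# Made-up example: the first pixel is untouched, the second had the synth erased.
isolated = isolate([[0.3, 0.8]], [[0.3, 0.1]])
```

Pixels that are identical in both layers come out black, so only what was erased during the synth removal remains in the result.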

V. Clean up
- Invert
- Clean the noise around the lines with a tiny (about 4 px) white brush
- Fix the holes in straight lines with the clone tool
- Select the highest line which you could fully fix, then copy it and move it upwards in the place of the incomplete lines
- Change the intensity of each copy of the model line with Levels so they fit the intensity of the underlying line
- Flatten
- Invert
- Reload the image in Photosounder
- Save the audio file by pressing Save

If there's any aspect of this that requires clarification do ask about it in the comments.

Labels: experiments, filtering, lossless, tutorial

21 March 2009

Instrument Isolation (Funky Worm)

This is how I isolated the main instrument from Ohio Players' Funky Worm using Photosounder and Photoshop, as shown in this video. I first loaded the original sound's image into Photoshop and, using the clone tool, erased the lines corresponding to the instrument I wanted to isolate. That new image, once loaded in Photosounder in lossless mode using the original sound, gave me this drums and vocals-only version :

Back in Photoshop, I pasted the new image on top of the original one, switched to 16-bit mode for precision, and corrected the gamma of both so that they match 1:1. Beware, however: Photoshop's Levels makes dark pixels darker than they should be when you increase the gamma, which has disastrous effects on pictures as dark as the ones we have here. This is why it's best to invert the pictures, so that their background turns white, before doing such corrections. Once they are inverted, apply a value of 2.0 in Levels' gamma on both layers, choose the Difference blending mode, flatten the image, invert again, and apply a gamma of 0.5.

We now only have the bits we previously erased, and we can see what has to be done. With the example I chose, entire areas were missing where the snare drums used to be, as the snare buried the overtones of the instrument of interest in noise and made them disappear. The fact that I used an MP3 as a basis only made matters worse. I also had things that didn't belong, mainly pieces of voice I mistook for parts of my instrument. The rest of the work consisted of cleaning and fixing the image, using my best Photoshopping skills.

Note that when you're done, you might want to double-pass the processing to obtain a result more faithful to the actual image you obtained. To do that, normally load the image in lossless mode in Photosounder with the original sound as a basis, save the resulting sound, then open the very same sound file again, reopen the image, and save the sound.

Isolated main instrument

Edit : In this file you will find the original sample used as well as the two images needed to recreate the results shown above, along with detailed instructions on how to do that using the Photosounder Demo or the full version of Photosounder.

You can also find a tutorial on how to reproduce this here.

Labels: experiments, filtering, lossless