Enter the maze

Simon says: no to autotune


The papers were alive with the sound of autotune when it was suggested that the X Factor had used the digital sound processing technique to improve the performances of the singers on the show.

The story raged for a few days. Was it right to use computer technology to improve singing? The argument in favour was that it's all over the charts already. Singers from T-Pain to JLS to the kids from Glee use it, so what's the problem? But the public was having none of it. Finally Simon announced there would be no more autotune on the X Factor... but the question still remains: what is it and how does it work? So let's explore the most controversial tool in the music producer's arsenal.

Setting the tone

Auto-Tune is in fact the name of a commercial software plug-in developed by Antares Audio Technology in 1997. In theory it looks for off-key notes in either vocal or instrumental tracks, and then slides these bum notes to where they ought to be. The computer science behind this is based on something with a far more interesting name, a phase vocoder. The fundamental idea of this technology is that any sound can be broken down into a sum of pure tones, much like any colour on a screen can be created by adding a mixture of red, green and blue. Sound is a pattern of air compressions, and pure tones are sounds whose pattern repeats regularly as a smooth wave (a sine wave). These waves are defined by their frequency (literally how often a new wave comes along each second), their amplitude (how strong the waves are) and their phase (when they start and finish relative to other waves).
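To make the three defining numbers concrete, here is a minimal Python sketch of a single pure tone. The function name pure_tone, the sample rate of 8000 samples per second and the 440 Hz example are illustrative assumptions, not anything taken from Auto-Tune itself.

```python
import math

def pure_tone(frequency_hz, amplitude, phase_rad, sample_rate=8000, n_samples=8000):
    """Return samples of one pure tone: amplitude * sin(2*pi*f*t + phase).

    sample_rate and n_samples are assumed values chosen for illustration.
    """
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate + phase_rad)
        for n in range(n_samples)
    ]

# One second of a 440 Hz tone at half strength, starting at the beginning of its wave.
tone = pure_tone(frequency_hz=440.0, amplitude=0.5, phase_rad=0.0)
```

Changing frequency_hz alters how often the wave repeats, amplitude scales how far the numbers swing, and phase_rad shifts where in its cycle the wave begins.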

Break it all down

You can take any sound signal, whether it's a voice or an instrument, and sample it in a computer. The microphone detects the changes in air pressure and turns them into a series of numbers that change over time. Then the cunning processing begins. You can take that bit of complex sound and break it down into its pure tones, so you know their frequency, amplitude and relative phases. So now you've got all the pure tones that make up one complex sound - let's call them A, B, and C. You could build the sound back up again, if you wanted, by adding them together: just add A + B + C and you're back to where you were.
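The break-down and build-up steps can be sketched in a few lines of Python using a discrete Fourier transform, one standard way of finding the pure tones in a sampled sound. This is only a toy under stated assumptions: the window of 64 samples, the three made-up tones and the helper names tone and dft are all hypothetical, and each tone is chosen to fit a whole number of times into the window so that it shows up cleanly.

```python
import cmath
import math

N = 64  # number of samples in the analysis window (assumed)

def tone(cycles, amplitude, phase):
    """A pure tone that repeats `cycles` times across the window."""
    return [amplitude * math.sin(2 * math.pi * cycles * n / N + phase) for n in range(N)]

def dft(x):
    """Naive discrete Fourier transform of a list of N samples."""
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

# Three made-up pure tones A, B and C, added together into one complex sound.
A = tone(4, 1.0, 0.0)
B = tone(8, 0.5, 0.3)
C = tone(12, 0.25, 1.0)
sound = [a + b + c for a, b, c in zip(A, B, C)]

# Break the sound down: recover each tone's frequency, amplitude and phase.
spectrum = dft(sound)
found = {}
for k in range(1, N // 2):
    amplitude = 2 * abs(spectrum[k]) / N
    if amplitude > 1e-6:  # ignore bins that hold only rounding noise
        # The DFT measures phase against a cosine; add pi/2 to express it for a sine.
        found[k] = (amplitude, cmath.phase(spectrum[k]) + math.pi / 2)

# Build the sound back up by adding the recovered tones together.
rebuilt = [
    sum(amp * math.sin(2 * math.pi * k * n / N + ph) for k, (amp, ph) in found.items())
    for n in range(N)
]
```

Here found ends up with one entry per tone, keyed by how many times the tone repeats in the window, and rebuilt matches sound to within rounding error.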

The real X Factor

Let's stick with the bits for a moment though. Each of your pure tones, A, B and C, has a particular frequency and phase. The lowest frequency in the sound, called A here, is known as the fundamental frequency and this is what gives the sound its musical pitch. But suppose that segment of music you've processed is a bum note; it should have been a higher or lower pitch. The computer knows what the sound should have been and can change and slide the components A, B and C around to make sure the mix gives the sound you wanted in the first place. So, for example, you could slide A back in time with respect to B and C, shifting the relationship between the frequencies. Or you could replace A with the new, correct fundamental frequency - let's call it X. Then the computer system adds X, B and C to produce the pitch-corrected sound in almost real time.
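The "replace A with X" idea can be sketched in Python as follows. This follows the simplified picture described here and is not Antares' actual algorithm; practical pitch correctors generally also move the related higher tones along with the fundamental. The tuning reference of 440 Hz, the snap-to-nearest-semitone rule and the names nearest_note_hz and correct_fundamental are all assumptions made for illustration.

```python
import math

A4_HZ = 440.0  # assumed tuning reference (concert pitch A)

def nearest_note_hz(freq_hz):
    """Snap a frequency to the nearest note of the 12-note equal-tempered scale."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / A4_HZ))
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

def correct_fundamental(tones):
    """Replace the lowest tone A with a corrected tone X; leave the others alone.

    `tones` is a list of (frequency_hz, amplitude, phase) tuples.
    """
    ordered = sorted(tones)                 # lowest frequency first
    freq_a, amp_a, phase_a = ordered[0]     # A, the fundamental
    x = (nearest_note_hz(freq_a), amp_a, phase_a)
    return [x] + ordered[1:]

# A slightly flat note: the fundamental is 430 Hz where 440 Hz was intended.
flat_note = [(430.0, 1.0, 0.0), (860.0, 0.5, 0.0), (1290.0, 0.25, 0.0)]
corrected = correct_fundamental(flat_note)
```

In this example the 430 Hz fundamental is nudged up to 440 Hz while B and C are passed through untouched, ready to be added back together as X + B + C.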

The numbers game

Whether it's music, images or video, once information enters the digital domain it is just a pile of numbers. As such it can be manipulated, improved or faked. What we decide to do with this power to change reality is up to us. It can make the world better, be wonderfully creative or even let some people cheat. As with any technology, it's the way we use it that's important.