Enter the maze

Picking a conversation out of a crowd

Streams of data converging

Having two ears is definitely better than one. So what would it be like to have 300? Having two allows you to tell where sounds are coming from. A sound over to one side takes slightly longer to reach the far ear than the near one. Your brain can use that tiny time difference to roughly pin down the source and so turn to look at it. Would more ears be better still? Evolution seems to think not - it has tended to favour bigger ears to help animals hear better (think owls, hares and bats) rather than more ears (can you think of any animal with more than two?). Perhaps that means more ears don't help much. On the other hand, it may just be that a brain capable of processing all the extra information isn't worth the cost.
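That two-ear trick can be put into numbers. Here is a minimal sketch of the extra travel time to the far ear, assuming a typical ear spacing of about 20 centimetres and sound travelling at 343 metres per second (both figures are illustrative assumptions, not from the article):

```python
import math

SPEED_OF_SOUND = 343.0   # metres per second in air at room temperature
EAR_SPACING = 0.2        # metres between the ears - an assumed typical value

def extra_delay(angle_deg):
    """Extra time (seconds) sound takes to reach the far ear, for a
    source at angle_deg from straight ahead (90 = directly to one side)."""
    return EAR_SPACING * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
```

A sound directly to one side arrives at the far ear only around half a millisecond late - and your brain routinely picks up on differences that small.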

Ear, Ear

We can now find out, of course, by using audio technology to give ourselves the equivalent of more ears. That is essentially what Norwegian physicists Morgan Kjolerbakken and Vibeke Jahr have done with their invention, the AudioScope, which uses 300 mikes to do its listening. It turns out that 300 ears really do help. What they give is the ability to pick out a single conversation from the roar of a stadium crowd!

Of course, just having 300 mikes isn't, on its own, enough to improve things, any more than having two ears but no brain would help. Your brain does some clever processing to make sense of the signals from your two ears, and similarly with 300 mikes you need some serious computing power to sort the conversation from the babble.

The way the AudioScope does this is just a more complicated version of what we do. First of all it needs to know where the sound source of interest is. One way to do that is to use a camera to locate it - a person, perhaps. Some clever software then uses the information about what the camera is focussed on to work out the exact distance from the sound of interest to each of the mikes.
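In code, that distance-to-delay calculation might look something like this sketch. The coordinates and the speed of sound here are illustrative assumptions; the AudioScope's real software is far more sophisticated:

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second

def delays_to_mikes(source, mikes):
    """Time (seconds) for sound from the source position to reach
    each mike, given (x, y, z) coordinates in metres."""
    return [math.dist(source, mike) / SPEED_OF_SOUND for mike in mikes]
```

Once every mike's delay is known, the recording from each one can be shifted back by exactly that amount - which is what makes the next step possible.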

Line them up

The stream of sound at each mike is recorded separately. That means that once the distances are known, all the streams can be synchronised, so that the sound from the point of interest as heard at each mike is lined up, even though it arrived at each mike at a different time. Since the person's words were one and the same sound when they set off, any sounds that don't match across all the mikes can be filtered out - they must be coming from somewhere else. What is left is a clear recording of that one conversation.
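This lining-up-and-combining step is known in signal processing as delay-and-sum beamforming. Here is a minimal sketch of the idea, with each stream as a plain list of samples and the delays already worked out from the distances (the function name and simple averaging are illustrative, not the AudioScope's actual method):

```python
def delay_and_sum(streams, delays, sample_rate):
    """Align each mike's recording by the delay (in seconds) from the
    chosen source to that mike, then average. Sound from the chosen
    spot lines up and reinforces; sound from elsewhere averages away."""
    # Convert each delay to a whole number of samples and drop that
    # many samples from the start, so the chosen sound lines up.
    aligned = [stream[round(d * sample_rate):]
               for stream, d in zip(streams, delays)]
    n = min(len(a) for a in aligned)  # keep only the overlapping part
    return [sum(a[i] for a in aligned) / len(aligned) for i in range(n)]
```

In a toy example where a second mike hears the same voice two samples later, aligning and averaging recovers the voice exactly, while sounds that arrive with the wrong timing get smeared out across the average.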

Look it's me!

Pick out a single conversation from the roar of a stadium crowd

An early use has been to create a conference system for large audiences. Anyone anywhere in a conference hall can now have the floor without a mike having to be passed to them first. There are plenty of more glitzy uses though, from allowing TV companies to pick up the conversations of sportsmen on a pitch to giving governments another way to spy on each other. It's also easy to imagine what paparazzi reporters will do with the technology. At the moment, TV producers at football matches like to home in on interesting people in the crowd (or on people just because they are pretty and blond). In future, the producers will be able to pick up what those people are saying too. Let's just hope it won't always be "Hey, look, we're on the big screen!"

If we really did all have 300 ears, nothing much would be secret anymore. Perhaps from now on it won't be.