A magazine where the digital world meets the real world.

Picture This? JPEG it!

Look inside a picture from your digital camera, or a digital movie, and it's all just 11001100011. Hardly inspiring, and it's hard to see what it means!

[Image: a sequence of images getting worse and worse]

The human brain is thought to have around half its volume given over to making sense of vision. A surprising fact perhaps, but it just goes to show how hard it is to understand the world we see around us. Scientists the world over are interested in vision. We can try to understand it by looking at the biology of the brain. We can do experiments to try to measure how we go from the image in our eyes to understanding what we are looking at. Computer scientists can also try to build machines that can 'see', to give insight into the way humans do it.

If half your brain is needed to see, then you can be sure that some fairly hefty calculations are going on in your 'little grey cells', and that they are making use of lots and lots of information. Information, or data, is something that computer scientists respect. The amount of data needed to accomplish a task determines the amount of calculation needed, and calculations cost, both in the time taken and in the hardware used. The brain obviously does it pretty well. So when computer scientists looked at the problem of making a movie or TV show take up the least possible space on your computer, or of transmitting it using the least possible amount of data, it's not surprising that they looked to their brains for help.

See it the psychologists' way

Psychologists had discovered that human observers are very sensitive to changes in the amount of light in an image (called the luminance), but less so to changes in colour. This is because the retina at the back of each eye, which turns the light from what we are looking at into nerve signals, has two sets of detectors. One set measures the amount of light and a separate set helps measure the colour of the light. It turns out there are fewer colour detectors. So when we look at what data we can remove from the image, represented as a stream of ones and zeros, we choose changes in colour. If we make this reduction, by putting in less colour information, our brains don't miss it. Meddle with the luminance and we pick it up easily.

We can throw out some colour and our brains don't notice, but psychologists tell us there are some other things we can remove too. We often hear that people don't bother to read the 'small print' in contracts, or that a 'small detail' was easily overlooked. Well, our brains do the same with everything we see. Our brains can't read the 'small print' in images. We can take any image and, through some clever maths, turn it into a 'top ten' of detail. At number one is the pattern of big changes of light over the image, and way down the list are the patterns of smaller changes in light. This 'top ten' is called the spatial frequency spectrum of the image. It tells us what patterns, at different levels of detail, add together to make the original. So with this knowledge we can decide that our image only needs, say, the top five, and remove the other, lower chart (spectrum) entries. It turns out that again our brains won't miss the data. We don't notice it much, so, like colour, some levels of detail can be reduced.
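
To make the colour trick concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code a real camera runs. It assumes the standard JPEG (JFIF) recipe for splitting red, green and blue into one luminance value (Y) and two colour values (Cb and Cr), and it assumes we simply average the colour over 2x2 squares of pixels while leaving the luminance alone. The function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Split one RGB pixel (0-255 each) into luminance Y and colour Cb, Cr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample(channel, factor=2):
    """Replace every factor x factor square of a 2D list by its average."""
    height, width = len(channel), len(channel[0])
    small = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            square = [channel[y][x]
                      for y in range(top, min(top + factor, height))
                      for x in range(left, min(left + factor, width))]
            row.append(sum(square) / len(square))
        small.append(row)
    return small


def compress_colour(pixels, factor=2):
    """pixels is a 2D list of (r, g, b) tuples.

    Luminance is kept at full size; only the two colour
    channels are shrunk (an assumption of this sketch).
    """
    ycc = [[rgb_to_ycbcr(*p) for p in row] for row in pixels]
    y = [[v[0] for v in row] for row in ycc]
    cb = [[v[1] for v in row] for row in ycc]
    cr = [[v[2] for v in row] for row in ycc]
    return y, subsample(cb, factor), subsample(cr, factor)


def value_count(*channels):
    """How many numbers we would have to store or send."""
    return sum(len(row) for channel in channels for row in channel)


# A tiny 4x4 all-red test image (hypothetical data).
pixels = [[(200, 30, 30)] * 4 for _ in range(4)]
y, cb, cr = compress_colour(pixels)
print(value_count(y, cb, cr), "values instead of", 4 * 4 * 3)
```

For the 4x4 picture, that is 24 numbers instead of 48: half the data has gone, and all of it was colour.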

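
The 'top ten' of detail can be sketched in a few lines of Python too. This is a simplified illustration, assuming we work on a single row of brightness values and use the discrete cosine transform (the kind of clever maths JPEG uses) to build the chart. The names `dct`, `idct`, `keep_top` and `squared_error`, and the sample row of numbers, are invented for the example.

```python
import math


def dct(signal):
    """Turn a row of brightness values into its spatial frequency chart.

    Entry 0 is the overall light level; later entries describe
    finer and finer detail.
    """
    n = len(signal)
    chart = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        chart.append(scale * total)
    return chart


def idct(chart):
    """Add the patterns in the chart back together to rebuild the row."""
    n = len(chart)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(chart):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal


def keep_top(chart, keep):
    """Keep the first `keep` chart entries and throw away the 'small print'."""
    return [c if k < keep else 0.0 for k, c in enumerate(chart)]


def squared_error(a, b):
    """A rough measure of how far the rebuilt row is from the original."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# A hypothetical row of 8 pixels: brightness rising gently from left to right.
row = [50, 54, 61, 66, 70, 73, 82, 85]
for keep in (8, 4, 2):
    rebuilt = idct(keep_top(dct(row), keep))
    print(keep, [round(v) for v in rebuilt], round(squared_error(row, rebuilt), 2))
```

Keeping all eight entries rebuilds the row exactly. Keeping fewer gives a smoother approximation, and the error can only grow as more entries are dropped.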

Leave it Out!

This 'removing things we won't notice' idea is what makes JPEG images work. We can reduce the data for an image by keeping less information about colour changes and about fine levels of detail. We can apply these ideas to little blocks of the image. So we take the whole image and break it into bits, and we cut down the data in each bit using our understanding of the brain. What we end up with is not an image but a set of instructions on how to build the image. We send the instructions, and when the computer receives them it uses a 'codec', a small program that knows how to turn the instructions for each block into a picture, to recreate the original (well, not quite the original, but our brains are sufficiently fooled).

We can take this removal to the extreme if we want really small amounts of data, that is, high data compression, but eventually our brains will notice. So it's about understanding what level of removal our brains won't miss, and so settling on the smallest amount of data the computer has to handle. As always it's a trade-off, but this trade-off is smart.
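
Here is a rough Python sketch of the block idea, under some simplifying assumptions. It cuts a greyscale image into 8x8 blocks (the block size JPEG uses), turns each block into a chart of patterns with a two-dimensional cosine transform, and then rounds every chart entry to a multiple of a single `step` value. Real JPEG uses a whole table of different step sizes plus further packing tricks, so treat this as an illustration only. All the function names and the test image are invented for the example.

```python
import math

BLOCK = 8  # JPEG works on 8x8 blocks of pixels


def _basis(n):
    """Cosine patterns: row k is the k-th pattern, from coarse to fine."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (i + 0.5) * k / n)
             for i in range(n)] for k in range(n)]


C = _basis(BLOCK)


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def _transpose(m):
    return [list(row) for row in zip(*m)]


def split_blocks(image):
    """Yield (top, left, block); assumes width and height are multiples of 8."""
    for top in range(0, len(image), BLOCK):
        for left in range(0, len(image[0]), BLOCK):
            yield top, left, [row[left:left + BLOCK]
                              for row in image[top:top + BLOCK]]


def encode_block(block, step):
    """Make the 'instructions': chart entries rounded to multiples of step."""
    chart = _matmul(_matmul(C, block), _transpose(C))
    return [[round(c / step) for c in row] for row in chart]


def decode_block(instructions, step):
    """What the codec does on arrival: rebuild a block from its instructions."""
    chart = [[q * step for q in row] for row in instructions]
    return _matmul(_matmul(_transpose(C), chart), C)


def nonzero(instructions):
    """Instructions that are zero cost almost nothing to send."""
    return sum(1 for row in instructions for q in row if q != 0)


# A hypothetical 16x16 greyscale image: a smooth diagonal gradient.
image = [[100 + 5 * x + 3 * y for x in range(16)] for y in range(16)]
for step in (1, 10, 50):
    kept = sum(nonzero(encode_block(b, step)) for _, _, b in split_blocks(image))
    print("step", step, "->", kept, "non-zero instructions out of", 16 * 16)
```

A bigger `step` is the 'take it to the extreme' end of the trade-off: more instructions round to zero, so there is less to send, but the rebuilt blocks drift further from the original.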

So the next time you're looking at a digital image think how JPEG is playing tricks on you to create the illusion. What you see is all just 11001100011. The same tricks and more are played when you watch a movie.

cs4fn : Computer Science for fun