# Hiding in Elizabethan Binary

## by Paul Curzon, Queen Mary University of London

The great Tudor and Stuart philosopher Sir Francis Bacon was a scientist, a statesman and an author. He was also a pretty decent computer scientist. In the year of the Gunpowder plot, he published a new form of cipher, now called Bacon's Cipher, invented when he was a teenager. Its core idea is the foundation for the way all messages are stored in computers today.

The Tudor and Stuart eras were a time of plot and intrigue. Perhaps the most famous is the 1605 Gunpowder plot where Guy Fawkes tried to assassinate King James I by blowing up the Houses of Parliament. Secrets mattered! In his youth Bacon had worked as a secret agent for Elizabeth I's spy chief, Walsingham, so knew all about ciphers. Not content with using those that existed he invented his own. The one he is best remembered for was actually both a cipher and a form of steganography. While a cipher aims to make a message unreadable, steganography is the science of secret writing: disguising messages so no one but the recipient knows there is a message there at all.

## A Cipher ...

Bacon's method came in two parts. The first was a substitution cipher, where different symbols are substituted for each letter of the alphabet in the message. This idea dates back to Roman times. Julius Caesar used a version, substituting each letter for a letter from a fixed number of places down the alphabet (so A becomes E, B becomes F, and so on). Bacon's key idea was to replace each letter of the alphabet with, not a number or letter, but it's own series of a's and b's (see the cipher table). The Elizabethan alphabet actually had only 24 letters so I and J have the same code as do U and V as they were interchangeable (J was the capital letter version of i and similarly for U and v).

In Bacon's cipher everything is encoded in two symbols, so it is a binary encoding. The letters a and b are arbitrary. Today we would use 0 and 1. This is the first use of binary as a way to encode letters (in the West at least). Today all text stored in computers is represented in this way - though the codes are different - it is all Unicode is. It allocates each character in the alphabet with a binary pattern used to represent it in the computer. When the characters are to be displayed, the computer program just looks up which graphic pattern (the actual symbol as drawn) is linked to that binary pattern in the code being used. Unicode gives a binary pattern for every symbol in every human language (and some alien ones like Klingon).

## Steganography

The second part of Bacon's cipher system was Steganography. Steganography dates back to at least the Greeks, who supposedly tattooed messages on the shaved heads of slaves, then let their hair grow back before sending them as both messenger and message. The binary encoding of Bacon's cipher was vital to make his steganography algorithm possible. However, the message was not actually written as a's and b's. Bacon realised that two symbols could stand for any two things. If you could make the difference hard to spot, you could hide the messages. Bacon invented two ways of handwriting each letter of the alphabet - two fonts. An 'a' in the encoded message meant use one font and a 'b' meant use the other. The secret message could then be hidden inside an innocent one. The letters written were no longer the message, the message was in the font used. As Bacon noted, once you have the message in binary you could think of other ways to hide it. One way used was with capital and lower-case letters, though only using the first letter of words to make it less obvious.

Suppose you wanted to hide the message "no" in the innocuous message 'hello world'. The message 'no' becomes 'abbaa abbab'. So far this is just a substitution cipher. Next we hide it in, 'hello world'. Two different kinds of fonts are those with curls on the tails of letters known as serif fonts and like this one and those without curls known as sans serif fonts and like this one. We can use a sans serif font to represent an 'a' in the coded message, and a serif font to represent 'b'. We just alternate the fonts following the pattern of the a's and b's: 'abbaa abbab'. The message becomes

sans serif, serif, serif, sans serif, sans serif,
sans serif, serif, serif, sans serif, serif.

Using those fonts for our message we get the final mixed font message to send: