Tuesday, December 11, 2018

The Algorithm That Saved the Union | Civil War Encryption

OK, maybe that title is a bit clickbaity. But the algorithm that powered the North's encryption during the US Civil War arguably had a significant impact on the outcome of the war. At the core of this algorithm was a route cipher, and this cipher is brilliant in its simplicity.

Start with a grid of spaces and a non-obvious route to visit each cell in the grid:

Fill the grid with your message, putting one word per cell.

To encrypt the message, traverse the prescribed route, writing down each word as it is encountered:

Note the delightfully jumbled message.

To decode the message, traverse the route again, filling in words as you move from cell to cell. When the grid is fully traversed, the original message will be restored in the grid and can be read as clear text.
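The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation discussed in this post: the function names, the row-by-row fill, and the simple "snake" route (down one column, up the next) are all assumptions chosen to keep the example short.

```python
def snake_route(rows, cols):
    """Hypothetical route: down the first column, up the second, and so on."""
    route = []
    for c in range(cols):
        row_order = range(rows) if c % 2 == 0 else range(rows - 1, -1, -1)
        route.extend((r, c) for r in row_order)
    return route


def encrypt(words, rows, cols, route):
    """Fill the grid row by row, then read words off in route order."""
    if len(words) != rows * cols or len(route) != rows * cols:
        raise ValueError("message and route must both cover the whole grid")
    grid = [words[r * cols:(r + 1) * cols] for r in range(rows)]
    return [grid[r][c] for r, c in route]


def decrypt(cipher_words, rows, cols, route):
    """Walk the same route, dropping each word into its cell, then read row by row."""
    if len(cipher_words) != rows * cols or len(route) != rows * cols:
        raise ValueError("message and route must both cover the whole grid")
    grid = [[None] * cols for _ in range(rows)]
    for word, (r, c) in zip(cipher_words, route):
        grid[r][c] = word
    return [word for row in grid for word in row]


message = "the enemy will attack at dawn".split()
route = snake_route(2, 3)
jumbled = encrypt(message, 2, 3, route)
print(" ".join(jumbled))
print(" ".join(decrypt(jumbled, 2, 3, route)))
```

Because decryption just replays the route, both sides only need to agree on the grid size and the route itself.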

The Union Army strengthened this basic cipher with a number of enhancements:

  • Critical words were exchanged with code words. For example, any time the word enemy was to be used, wiley was substituted.
  • Each route had a corresponding keyword that itself was included in the message. This allowed lengthy messages to be encrypted using a variety of routes, with the results being concatenated together.
  • Nonsense words could be added in the first and last row of the grid. This resulted in noise being mixed in with the message. When decrypted, the meaningless words were easy to discard.
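Two of these enhancements, code words and nonsense rows, are easy to sketch as pre- and post-processing steps around the transposition. The Python below is a rough illustration under stated assumptions: only the enemy/wiley pair comes from the example above, while the helper names and the particular nonsense words are made up for the demonstration.

```python
# Only the enemy -> wiley pair is from the post; a real codebook would be much larger.
CODEBOOK = {"enemy": "wiley"}


def to_code(words, codebook=CODEBOOK):
    """Swap sensitive words for their code words before transposing."""
    return [codebook.get(word, word) for word in words]


def from_code(words, codebook=CODEBOOK):
    """Reverse the substitution after the grid has been restored."""
    reverse = {code: plain for plain, code in codebook.items()}
    return [reverse.get(word, word) for word in words]


def pad_with_nulls(words, cols, top_nulls, bottom_nulls):
    """Add one row of nonsense words above and one below the message rows."""
    if len(top_nulls) != cols or len(bottom_nulls) != cols:
        raise ValueError("each null row must be exactly one grid row wide")
    if len(words) % cols != 0:
        raise ValueError("message must fill whole rows")
    return list(top_nulls) + list(words) + list(bottom_nulls)


def strip_nulls(words, cols):
    """Discard the first and last rows once the grid is back in reading order."""
    return words[cols:-cols]


plain = "the enemy will attack at dawn".split()
coded = to_code(plain)
# Hypothetical nonsense words for a grid that is 3 columns wide.
padded = pad_with_nulls(coded, 3, ["apple", "river", "stone"], ["candle", "horse", "blue"])
print(" ".join(padded))
print(" ".join(from_code(strip_nulls(padded, 3))))
```

The padded word list is what would then be written into the grid and read off along the route.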

You can see all these features at work in the example from the article Internal Struggle: The Civil War. The screenshots below show my coding of the routes, dictionary, clear text and coded message.

My implementation of the Union cipher is on the clunky side. For one thing, I wanted to support more complex routes than simply walking up and down columns. The staton and mcdowell routes, for example, call for traversing diagonally from the bottom left-hand corner to the top right-hand corner, and then proceeding column-wise. To support this, I describe routes by numbering each cell in the order it is to be visited. This makes for tedious route definition, but it is also quite flexible. You could imagine routes that had a checkerboard shape or other unusual patterns. I've also explicitly included the rows that will contain nonsense words in the grid.
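To make the numbering idea concrete, here is a small Python sketch. It is an assumption-laden illustration rather than the code behind this post: the 3x3 numbering is an invented route that walks the diagonal from bottom left to top right and then the remaining cells column by column, and the function names are hypothetical.

```python
def route_from_numbering(order):
    """Turn a grid of visit numbers (1..N) into a list of (row, col) cells."""
    cells = {}
    for r, row in enumerate(order):
        for c, step in enumerate(row):
            if step in cells:
                raise ValueError(f"step {step} appears more than once")
            cells[step] = (r, c)
    expected = list(range(1, len(cells) + 1))
    if sorted(cells) != expected:
        raise ValueError("steps must be numbered 1..N with no gaps")
    return [cells[step] for step in expected]


def encrypt_with_numbering(words, order):
    """Fill the grid row by row, then read it in the numbered order."""
    cols = len(order[0])
    route = route_from_numbering(order)
    if len(words) != len(route):
        raise ValueError("message must fill the grid exactly")
    return [words[r * cols + c] for r, c in route]


# Invented example route: diagonal first (1, 2, 3), then the rest column-wise.
DIAGONAL_THEN_COLUMNS = [
    [4, 6, 3],
    [5, 2, 8],
    [1, 7, 9],
]

message = "send two regiments to the north bridge by noon".split()
print(" ".join(encrypt_with_numbering(message, DIAGONAL_THEN_COLUMNS)))
```

Typing out the numbers is tedious, but any visiting order that covers every cell exactly once becomes a valid route with no extra code.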

These programming complications are great examples of steps in an algorithm that a human can trivially process but a machine needs to explicitly account for.

Just as interesting as the use of the cipher is the context in which it developed. The Civil War posed a unique military challenge:

The contending forces spoke the same language, shared the same social institutions, including an un-muzzled press and a tendency to express oneself freely on any subject. Military knowledge was shared in common--former classmates at the service academies and peacetime friends would meet in battle. They knew each other’s strengths and weaknesses, and they eagerly devoured reports, in the press and through intelligence sources, of the names of opposing commanders. Each harbored sympathizers with the other side, the basis for espionage and a potential fifth column. Neither inherited any competence in information security nor an appreciation for operational security. Those things would be learned the hard way--the American way--accompanied by bloodshed.

While the Union and Confederacy started from the same point, they made technological choices that ultimately drove how successful they would be in the world of information security. For example, the North's choice of word jumbling over a letter-based cipher would have a profound impact:

In the North, as telegraphers (frequently little more than teenage boys) were pressed into service and formed into the U.S. Military Telegraph (USMT), a rival of Myer’s signal corps, a word, or route, transposition system was adopted and became widespread. It gave the telegraphers recognizable words, an asset in this early stage of copying Morse “by ear,” that helped to reduce garbles. Code names or code words replaced sensitive plain text before it was transposed, and nulls disrupted the sense of the underlying message. Only USMT telegraphers were permitted to hold the system, thereby becoming cipher clerks as well as communicators for their principals, and the entire organization was rigidly controlled personally by the secretary of war. In the War Department telegraph office near the secretary, President Lincoln was a frequent figure from the nearby White House, anxiously hovering over the young operators as they went about their work.

In the South, although a Confederate States Military Telegraph was organized (in European fashion, under the Postmaster General), it was limited to supplementing the commercial telegraph lines. (“System” would not convey the proper idea, for the Southern lines were in reality a number of independent operations, some recently cut off from their northern ties by the division of the nation and reorganized as Southern companies.) Throughout the war, the Confederate government paid for the transmission of its official telegrams over commercial lines. Initially the Southern operator found peculiar digital texts coming his way (the dictionary system), then scrambled, meaningless letters, begging to be garbled. The poly-alphabetical cipher used for official cryptograms offered none of the easily recognizable words that provided a crutch for his Northern brother.

This is an interesting example of how an apparently less secure system (one where words are kept intact) can ultimately prove to be more valuable than a seemingly more secure one. Or, put another way: never underestimate the impact of human error.
