25 Years of Programming
An open source source for C, C++, OWL, BASIC, MDB, XLS, DOT, and more...
Home   Projects   Up   Sitemap   Search   Blog   Forum+Chat   About Us   Privacy   Terms of Use   Feedback   FAQ   Images   Services   Payments   Humor   Music

Solve substitution ciphers, GNU g++ C++ (Decipher)

This program provides interactive assistance with solving a substitution cipher. In its interface you can quickly swap letter translations and see the results immediately in the text. There is a more complete description on the page for the Visual C++ versions of the cipher/decipher programs, which has the companion enciphering program. One version of it works in either Windows or Linux.

One version of Decipher could also serve both operating systems, except that I wanted to preserve the use of color in the display, and the methods in Windows (.NET) and Linux are quite different. The Linux method used here (raw ANSI escape sequences sent to the terminal) is basic and not very portable. It would be better to use the ncurses library, but I don't know how yet, so the current method must suffice for now.  

In addition to the source code on this page, the program also requires my.h and mylib.cpp.

My most recent version, which I think is better than the C++ ones, is on a puzzle page where you can create and solve Caesar and other substitution ciphers online. It uses JavaScript versions of the encipher and decipher programs.

Screenshot of the decipher program

Screenshot (from the Windows version) showing letter frequencies and the text being decoded. Locked letters are green.

Screenshot of Decipher.cpp console.

DECIPHER

/*	DECIPHER.CPP			1-26-2010		GNU GCC g++
	Copyright (C)1994-1996, 2007, 2010 Steven Whitney.
	Initially published by http://25yearsofprogramming.com.

	This program is free software; you can redistribute it and/or
	modify it under the terms of the GNU General Public License (GPL)
	Version 3 as published by the Free Software Foundation.
	This program is distributed in the hope that it will be useful,
	but WITHOUT ANY WARRANTY; without even the implied warranty of
	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
	GNU General Public License for more details.
	You should have received a copy of the GNU General Public License
	along with this program; if not, write to the Free Software
	Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.

Assists with solving substitution cipher puzzles such as are sometimes published in newspapers and
game magazines. A substitution cipher is text in which every letter A-Z has been transposed to some other
letter A-Z. The text looks like nonsense, but if you can determine which letter in the
ciphered text stands for which letter in the alphabet, you can restore the text to its
original state.

The program loads the source text file you specify on the command line, then optionally does a quick
preliminary decode in which the most common letters in the text are translated to the most
commonly occurring letters in most English texts. It may or may not be correct, but it's
a start. Then it provides an interface where you can easily and quickly swap letter
translations and see the results immediately in the text. This is a lot easier than doing
it on paper, but of course if you use this program you are cheating!

When swapping letters, enter S (for Swap), then the letter that you think is wrong,
then the letter you think it should be. (Order of the swapped letters doesn't matter.)

-----
TO DO

Revised letter frequencies, from a TV show: ETAONI....JKXQZ.
TH is the most common letter pair. EA is the most common vowel pair.

Could be automated, checking the resulting words against a dictionary,
and swapping letters until they are all known words.

*/
#include "../include/mylib.cpp"

// The color usage is experimental and only works on an ANSI terminal.
// See code comments for an alternative that uses upper/lower case instead of color.
// As-is, the program uses both.
// Colors are documented in 'man screen'. Program should use ncurses instead, however.
// BLACK didn't work well, so the code 49 is actually for "background = default".
#define WHITEONBLACK	"\033[22;37;49m"
#define GREENONBLACK	"\033[22;32;49m"
#define REDONBLACK		"\033[22;31;49m"

using namespace std;

//////////////////////////////////////////////////////////////////////////////
// GLOBAL VARIABLES
//----------------------------------------------------------------------------
struct translate		// TRANSLATION TABLE
{
	char orig;			// LETTER FROM INPUT TEXT
	char trans;			// LETTER IT'S TRANSLATED TO
	int freq;			// FREQUENCY OF ORIG IN INPUT TEXT
	char lock;			// PREVENTS TRANS FROM BEING CHANGED
} letter[26];

string intext;			// INPUT TEXT

//----------------------------------------------------------------------------
// DETERMINE LETTER FREQUENCIES
void freqtally()
{
for(uint i = 0 ; i < 26 ; i++)					// FOR EACH LETTER,
	for(uint j = 0 ; j < intext.length() ; j++)	// SEARCH FOR IT,
		if(intext[j] == letter[i].orig)			// AND, WHEN FOUND,
			letter[i].freq++;					// INCREMENT COUNT
}
//----------------------------------------------------------------------------
// INITIALIZES TRANSLATION TABLE
void initable()
{
string source("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
for(int i = 0 ; i < 26 ; i++)
{
	letter[i].orig = source[i];
	letter[i].trans = source[i];
	letter[i].freq = 0;
	letter[i].lock = false;
}
}
//----------------------------------------------------------------------------
// GETS INPUT TEXT FROM DISK FILE
// RETURNS true IF OK, false if not
bool readtext(const char *filename)
{
ifstream infile(filename);
if(!infile)
{
	cerr << "Error reading file, or not found." << endl;
	return(false);	
}
intext.clear();
stringfromfile(intext, infile);
toupper(intext);
return(true);
}
//----------------------------------------------------------------------------
// DISPLAY TEXT, AS TRANSLATED, and optionally also write to disk file.
// This looks kind of messy, but putting it into 2 different functions
// didn't look much better.  It would probably be ok to just use one
// function, with the constream manipulators, and let them be ignored
// when writing to a disk file.  Could also break some of the sections here
// into separate functions (showtranslations(), showfreq(), showastranslated())
void showtrans(bool tofile)  // FLAG WHETHER TO also WRITE TO DISK FILE
{
ofstream outfile;
if(tofile)
	outfile.open("deciphed.txt");

char ch;
uint i;
string color;
for(i = 0 ; i < 26 ; i++)				// SHOW CURRENT CODE SCHEME
{
	ch = toupper(letter[i].trans);

	// UPPER/LOWERCASE IS MORE PORTABLE THAN COLOR
	if(!letter[i].lock)
		ch = tolower(ch);

	// AND/OR DIFFERENT COLOR IF LOCKED. 
	color = (letter[i].lock ? GREENONBLACK : WHITEONBLACK);
	cerr << color;

	cerr << setw(3) << setiosflags(ios::left) << ch;
	if(tofile)
		outfile	<< setw(3) << setiosflags(ios::left) << ch;
}
cerr << endl;
if(tofile)
	outfile << endl;

cerr << REDONBLACK;

for(i = 0 ; i < 26 ; i++)					// SHOW LETTER FREQUENCIES
{
	cerr << setw(3) << setiosflags(ios::left) << letter[i].freq;
	if(tofile)
		outfile	<< setw(3) << setiosflags(ios::left) << letter[i].freq;
}
cerr << endl << endl;
if(tofile)
	outfile << endl << endl;

for(i = 0 ; i < intext.length() ; i++)		// FOR EACH CHAR IN INTEXT,
{
	bool found = false;
	for(uint j = 0 ; j < 26 ; j++)			// LOOK IT UP IN TABLE
		if(intext[i] == letter[j].orig)		// IF FOUND, IT'S A LETTER
		{                                  	// SHOW ITS TRANSLATION
			ch = toupper(letter[j].trans);

			// UPPER/LOWERCASE IS MORE PORTABLE THAN COLOR
			if(!letter[j].lock)
				ch = tolower(ch);

			// AND/OR DIFFERENT COLOR IF LOCKED. 
			color = (letter[i].lock ? GREENONBLACK : WHITEONBLACK);
			cerr << color;

			cerr << ch;
			if(tofile)
				outfile << ch;
			found = true;
			break;							// BREAK LOOP TO DO NEXT CHAR
		}
	if(!found)								// OUTPUT NON-LETTER AS-IS
	{
		if(intext[i] == '\t')				// expand tabs to spaces
			cerr << "    ";				
		else
		{
			cerr << WHITEONBLACK;
			cerr << intext[i];
		}
		if(tofile)
			outfile << intext[i];
	}
}
cerr << endl;
if(tofile)
	outfile << endl;
}                      // end showtrans()
//----------------------------------------------------------------------------
// SWAP TRANSLATIONS FOR 2 LETTERS
// BEFORE: ZBRUET   AFTER: ZBRUET  <- LETTER[].ORIG, NEVER CHANGES
//         ZBRUET          ETRUZB  <- LETTER[].TRANS
// RETURNS true IF OK, false IF SWAP DISALLOWED
int swaplets(char a, char b)	// LETTERS WHOSE TRANSLATIONS ARE TO BE SWAPPED
{
a = toupper(a);
b = toupper(b);

int adex, bdex;				// adex and bdex are INDEXES OF A AND B IN LETTER[].TRANS
for(adex = 0 ; (adex < 26) && (letter[adex].trans != a) ; adex++); // LOCATE A
for(bdex = 0 ; (bdex < 26) && (letter[bdex].trans != b) ; bdex++); // LOCATE B

if((letter[adex].lock) || (letter[bdex].lock))	// TEST FOR LOCKED
	return(false);

char ch = letter[adex].trans;					// SWAP THEIR TRANSLATIONS
letter[adex].trans = letter[bdex].trans;
letter[bdex].trans = ch;
return(true);
}
//----------------------------------------------------------------------------
// INITIALLY ASSIGN TRANSLATIONS FOR MOST FREQUENT LETTERS
// This doesn't always result in the assignments I expect, maybe because of the letter locking.
void etaoin()
{
// ASSIGN MOST FREQUENT LETTER TO 'E', TEMPORARILY LOCK IN PLACE,
// 2ND MOST FREQUENT = 'T', ETC.
swaplets(letter[0].orig,'E'); letter[0].lock = true;
swaplets(letter[1].orig,'T'); letter[1].lock = true;
swaplets(letter[2].orig,'A'); letter[2].lock = true;
swaplets(letter[3].orig,'O'); letter[3].lock = true;
swaplets(letter[4].orig,'I'); letter[4].lock = true;
swaplets(letter[5].orig,'N'); letter[5].lock = true;
swaplets(letter[6].orig,'S'); letter[6].lock = true;
swaplets(letter[7].orig,'H'); letter[7].lock = true;
swaplets(letter[8].orig,'R'); letter[8].lock = true;
swaplets(letter[9].orig,'D'); letter[9].lock = true;
swaplets(letter[10].orig,'L'); letter[10].lock = true;
swaplets(letter[11].orig,'U'); letter[11].lock = true;
					// SHUFFLE THESE SELDOM-USED TO END OF ARRAY, IF POSSIBLE
swaplets(letter[25].orig,'Q'); letter[25].lock = true;
swaplets(letter[24].orig,'J'); letter[24].lock = true;
swaplets(letter[23].orig,'X'); letter[23].lock = true;
swaplets(letter[22].orig,'Z');

for(int i = 0 ; i < 26 ; i++)			// RE-UNLOCK THEM ALL
	letter[i].lock = false;
}
//----------------------------------------------------------------------------
// SORT ARRAY IN ORDER OF DECREASING FREQUENCY
int sortlets(struct translate *l, struct translate *r)
{
	return(r->freq - l->freq);		// a reverse-sort (backwards)
}
//----------------------------------------------------------------------------
//	MAIN
int main(int argc, char** argv)
{
cerr << "Interactive assistance with solving a substitution cipher." << endl;
cerr << "Usage: decipher infile" << endl;
cerr << "Output file is always deciphed.txt" << endl;

if((argc < 2) || !readtext(argv[1]))
{
	cerr << "No text to decipher." << endl;
	return(0);
}
toupper(intext);
initable();							// PUT LETTERS IN THE TRANS. TABLE
freqtally();						// TALLY LETTER FREQUENCIES
									// SORT BY DECREASING FREQUENCY
qsort(letter,26,sizeof(letter[0]),(int (*)(const void *,const void *)) sortlets);

if(yesno("Automatically assign ETAOIN... to most frequent letters?",cerr))
	etaoin();						// ASSIGN ETAOIN TO MOST FREQ.

uint i, j;
string buf;
char ch;
while(1)							// START PROCESSING
{
	showtrans(false);				// DISPLAY TEXT
	cerr << WHITEONBLACK;

	cerr << "<S>wap, <L>ock, <U>nlock, <D>one: ";
	cin >> ch;
	ch = tolower(ch);
	getline(cin,buf);	// must discard remainder including \n or it causes next getline() to fail
	cerr << endl; 
	switch(ch)
	{
		case 's':							// SWAP LETTERS
			do
			{
				cerr << "Swap: ";
				getline(cin,buf,'\n');
			}
			while(buf.length() < 2);		// C/R = ABORT
			if(!swaplets(buf[0],buf[1]))
				putchar(7);					// A LETTER IS LOCKED - BEEP
			cerr << endl;
			break;
		case 'u':                           // UNLOCK LETTERS
		case 'l':							// LOCK LETTERS
			cerr << ((ch == 'l' ? "Lock letter(s): " : "Unlock letter(s): "));
			getline(cin,buf,'\n');
			if(!buf.length())				// C/R = ABORT
				break;
			toupper(buf);
			for(j = 0 ; j < buf.length() ; j++)
				for(i = 0 ; i < 26 ; i++)
					if(letter[i].trans == buf[j])
					{
						letter[i].lock = (ch == 'l' ? true : false);
						break;
					}
			cerr << endl;
			break;
		case 'd':              				// write to disk and optionally quit
			showtrans(true);
			cerr << "Buffer written to deciphed.txt." << endl;
			cerr << "Press Q to quit, anything else to resume." << endl;
			if(tolower(getchar()) == 'q')
				return(0);
			cerr << endl;
			break;
		default:
			break;
	}					// END SWITCH
}						// END WHILE(1)
return(0);
}							// end main

CIPHERED.TXT

A sample CIPHERED.TXT file for testing.

BZAE AE GSEB U BOEB
BD EOO ZDW BZO
VCDLCUI WDCQE.

 

Valid HTML 4.01 Transitional Valid CSS
Yahoo! Search
Search the web Search this site
View content labeling at ICRA.