Project Euler: Problem 59 – XOR decryption

Problem 59:

Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.
A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.
For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both “halves”, it is impossible to decrypt the message.
Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.
Your task has been made easy, as the encryption key consists of three lower case characters. Using cipher1.txt (right click and ‘Save Link/Target As…’), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.


Since the problem says the plain text contains common English words, I had two options after running each decryption: look for words, or analyze letter frequency. Not knowing which words the text would contain, I didn’t think it was feasible to search the decrypted text for a list of words and hope for a match. Instead I opted to analyze letter frequency, using the table from the Wikipedia page on letter frequency.
I figured, however, that some wrong keys could yield letter frequencies close to English while still producing garbled text. So for each key I compared the frequency of each letter in the decrypted text against the table and added up the per-letter differences to get an overall discrepancy. I stored each overall discrepancy along with the key that produced it, just in case the key with the smallest overall discrepancy did not produce readable text.
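The scoring step described above could look roughly like the following. This is only a sketch under assumptions: the class name, the method name discrepancy, and the choice to divide by the full text length (so non-letters drag the score up) are illustrative and not taken from EulerUtils; the frequency values are the commonly quoted English letter frequencies.

```java
class LetterFrequencySketch {
    // Approximate relative frequencies of a..z in English text
    // (commonly quoted values; treat them as an assumption of this sketch).
    static final double[] ENGLISH = {
        0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015,
        0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
        0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758,
        0.00978, 0.02360, 0.00150, 0.01974, 0.00074
    };

    // Sum over a..z of |observed share - expected share|.
    // Shares are taken over the whole text, so a decryption that is mostly
    // punctuation or control characters gets a large (bad) score.
    static double discrepancy(int[] text) {
        if (text.length == 0) {
            return Double.MAX_VALUE; // nothing to analyze
        }
        int[] counts = new int[26];
        for (int c : text) {
            if (c >= 'A' && c <= 'Z') {
                c += 'a' - 'A'; // fold upper case onto lower case
            }
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        double total = 0.0;
        for (int i = 0; i < 26; i++) {
            total += Math.abs((double) counts[i] / text.length - ENGLISH[i]);
        }
        return total;
    }
}
```

A lower score means the text looks more like English; the score can never exceed 2.0, which it approaches only when the text shares almost no letters with typical English.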
Once I had the key that produced the smallest overall discrepancy, I decrypted the text again and printed it to the console to see if that was the proper key, or if I needed to pick the 2nd smallest, or 3rd smallest, and so on.
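Picking the 2nd or 3rd smallest by hand gets easier if the stored discrepancies are sorted first. A minimal sketch, assuming the results live in a Map from key string to discrepancy; the class and method names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class KeyRankingSketch {
    // Returns up to n keys ordered by ascending discrepancy (best first).
    static List<String> bestKeys(Map<String, Double> analysis, int n) {
        List<Map.Entry<String, Double>> entries =
                new ArrayList<>(analysis.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            keys.add(entries.get(i).getKey());
        }
        return keys;
    }
}
```

Calling bestKeys(analysis, 5) would give the five most promising keys to try in order, so each candidate decryption can be printed and checked by eye.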

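// Needs java.util.HashMap, java.util.List and java.util.Map.
int answer = 0;

// Read the encrypted ASCII codes.
List<String> tmp = EulerUtils.readFile22("Problem_59.txt");
int[] encrypted = new int[tmp.size()];
for(int i = 0; i < tmp.size(); i++) {
   encrypted[i] = Integer.parseInt(tmp.get(i));
}

int[] cypher = {'a','a','a'};

// Overall discrepancy for every key tried, so the runners-up can be inspected.
Map<String, Double> analysis = new HashMap<String, Double>();
String easyKey = "";
int[] key = new int[3];
// Start high so the first key tried always becomes the current best.
double variance = Double.MAX_VALUE;
for(int a = 0; a < 26; a++) {
   cypher[0] = 'a'+a;
   for(int b = 0; b < 26; b++) {
      cypher[1] = 'a'+b;
      for(int c = 0; c < 26; c++) {
         cypher[2] = 'a'+c;
         int[] decrypted = EulerUtils.decrypt(encrypted,cypher);
         double analyzed = EulerUtils.analyzeLetterFrequency(decrypted);
         // Readable key such as "abc" rather than concatenated ASCII codes.
         String candidate = ""+(char)cypher[0]+(char)cypher[1]+(char)cypher[2];
         if(analyzed < variance) {
            variance = analyzed;
            easyKey = candidate;
            key[0] = cypher[0];
            key[1] = cypher[1];
            key[2] = cypher[2];
         }
         analysis.put(candidate, analyzed);
      }
   }
}

System.out.println("EasyKey = "+easyKey+
            ";Total Frequency Difference = "+variance*100.0+"%");
System.out.println("Key: "+key[0]+","+key[1]+","+key[2]);
int[] decrypted = EulerUtils.decrypt(encrypted,key);
// Print the text to check by eye that this really is the right key.
System.out.println(EulerUtils.asciiToString(decrypted));

// The answer is the sum of the ASCII values of the decrypted text.
for(int i : decrypted) {
   answer += i;
}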

decrypt() takes the encrypted ASCII codes and XORs them with the supplied cypher key, repeating the key cyclically.
analyzeLetterFrequency() takes ASCII-coded text (an int array) and returns its overall discrepancy from the expected English letter frequencies.
asciiToString() simply converts ASCII-coded text (an int array) to a String.
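The EulerUtils source is not shown here, so the following is only a guess at how decrypt() and asciiToString() might be written, matching the descriptions above. The class name and the sample key are hypothetical.

```java
class XorHelpersSketch {
    // XOR each code with the key, cycling through the key as needed.
    // Because XOR is its own inverse, the same method also encrypts.
    static int[] decrypt(int[] encrypted, int[] key) {
        int[] out = new int[encrypted.length];
        for (int i = 0; i < encrypted.length; i++) {
            out[i] = encrypted[i] ^ key[i % key.length];
        }
        return out;
    }

    // Turn an array of ASCII codes back into a String.
    static String asciiToString(int[] codes) {
        StringBuilder sb = new StringBuilder(codes.length);
        for (int c : codes) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] key = {'a', 'b', 'c'}; // hypothetical three-letter key
        int[] plain = "Hello, world".chars().toArray();
        int[] cipher = decrypt(plain, key); // encrypting is the same operation
        System.out.println(asciiToString(decrypt(cipher, key))); // prints "Hello, world"
    }
}
```

The round trip in main mirrors the example from the problem statement: 65 XOR 42 = 107, and 107 XOR 42 = 65.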