Cracking Viginere Cipher

Vigenere Cipher

Viginere cipher is a polyalphabetic cipher that encrypts by setting a period based on the key length. Therefore, when the cipher was first developed it was assumed that it would be safe from frequency attacks. However, when an attacker can guess the key length, it is no longer safe from frequency attacks.

Guessing the Key Length

Since Vigenere CIpher is a polyalphabetic cipher, it normally creates different cipher text for the same plain text. For instance, the period of the picture is 6, so the key will be reused once it reaches 6. Therefore, the critical problem is that if the text is located at the same index and it has the same index, it will create the same cipher char.

For instance, in the case of the picture, the first T was ciphered to V, and T from “TMESSA” was also ciphered to “V”.

By these characteristics, an attacker can predict the key length and perform frequency attacks.

Based on this, I wrote a Python code that automatically predicts the key length:

def guess_key_length(encrypted_text):
    text_list_org = list(encrypted_text)
    text_list_shifted = text_list_org
    coincidence_list = list()
    coincidence_temp = 0
    for i in range(1, 20): # Calculate coincidence for each key length
         text_list_shifted = np.roll(text_list_shifted, 1)  # Shift the text by 1
         for i, j in zip(text_list_org, text_list_shifted):
              if (i == j): # If the letters match, increment coincidence
                coincidence_temp += 1
         coincidence_list.append(coincidence_temp)
         coincidence_temp = 0
    
    sorted_list = sorted(coincidence_list, reverse = True)
    result_list = list()
    result_list.append(coincidence_list.index(sorted_list[0]) + 1)
    result_list.append(coincidence_list.index(sorted_list[1]) + 1)
    result_list.append(coincidence_list.index(sorted_list[2]) + 1)
    result_list.append(coincidence_list.index(sorted_list[3]) + 1)
    result_list.append(coincidence_list.index(sorted_list[4]) + 1)
    result_list.append(coincidence_list.index(sorted_list[5]) + 1)
    return result_list

Guessing the Real Key

Now we know the possible key lengths, the only thing left to crack vigenere cipher is getting the key. This is possible by frequency attack.

def get_possible_keys(encrypted_text, key_length):
    num = {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j', 10: 'k', 11: 'l', 12: 'm', 13: 'n', 14: 'o', 15: 'p', 16: 'q', 17: 'r', 18: 's', 19: 't', 20: 'u', 21: 'v', 22: 'w', 23: 'x', 24: 'y', 25: 'z'}
    alphabets = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    alphabets_freq = [0.08167, 0.01492, 0.02782, 0.04253, 0.12702, 0.02228, 0.02015, 0.06094, 0.06966, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749, 0.07507, 0.01929, 0.00095, 0.05987, 0.06327, 0.09056, 0.02758, 0.00978, 0.0236, 0.0015, 0.01974, 0.00074]
    encrypted_text_list = list(encrypted_text.lower())
    #print(encrypted_text_list)
    v1 = 0
    blocks = [[] for x1 in range(0, key_length)] 
    while v1 < key_length:
        for i2 in range(v1, len(encrypted_text_list), key_length):
            blocks[v1].append(encrypted_text_list[i2]) # i2 += key_length
        v1 += 1
   # print(blocks) # I made a blocks that groups the ciphers that were encrypted with same key
    v1 = 0
    key_array = ""
    while v1 < key_length:
        frequnecy_list = list()
        for alphabet in alphabets:
            freq_temp = blocks[v1].count(alphabet)
            freq_temp = freq_temp / 26 # converting frequency to 0 ~ 1
            freq_temp = round(freq_temp, 7)
            frequnecy_list.append(freq_temp)
    
        maximum_shift = 24
        correlation_scores = list()
        t = 0
        while maximum_shift >= 0:
            shifted_alphabtes_freq = shift_left(alphabets_freq, t)
            correlation_score = np.dot(frequnecy_list, shifted_alphabtes_freq) #calculating similarity by dot product
            #https://math.stackexchange.com/questions/689022/how-does-the-dot-product-determine-similarity
            correlation_score = round(correlation_score, 6)
            correlation_scores.append(correlation_score)
            maximum_shift -= 1
            t += 1
            
        max_correlation_score = max(correlation_scores)
        F = [D for D, E in enumerate(correlation_scores) if E == max_correlation_score]
        F[0] = ((26 - F[0]) % 26)
        key = num[F[0]].upper() 
        key_array += key
        v1 += 1
return key_array

The code conducts frequency attack and gives possible keys. The similarity calculation is based on dot productivity.

Result

Successfully cracked the cipher!

https://github.com/dslee2022/Viginere-Cipher-Cracker