By evaluating all 25 candidate decryptions, we can identify the correct one by checking which of them reads as English text rather than random gibberish. Many Unix/Linux systems come with a dictionary file. I am using Ubuntu, where the dictionary is located at /usr/share/dict/cracklib-small and contains an extensive collection of English words (54763 entries). To automate the process, a bash script named “dictionary_compare” tries all 25 shift values and compares the decoded words against the dictionary. Incorrect decryptions yield few or no matches, whereas the correct decryption has most, if not all, of its words in the dictionary. The script tallies the number of dictionary words found for each shift and selects the shift with the highest count.
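#!/bin/bash

# Print the word list used to score each candidate decryption.
load_dictionary() {
    dictionary_file="/usr/share/dict/cracklib-small"
    if [[ -f "$dictionary_file" ]]; then
        cat "$dictionary_file"
    else
        # Report on stderr so the message is not captured as dictionary content.
        echo "No dictionary found" >&2
        exit 1
    fi
}

# The function runs in a subshell here, so the script itself must also exit on failure.
dictionary=$(load_dictionary) || exit 1
echo "dictionary loaded"

shift_pattern_script="./shift_pattern"
max_word_count=0
best_shift=0
best_matched_words=""
best_matched_words_with_punctuation=""

# Exactly one argument is expected: the file holding the ciphertext.
if [ $# -ne 1 ]; then
    echo "incorrect filename"
    exit 1
fi
filename=$1
if [ ! -f "$filename" ]; then
    echo "File '$filename' not found."
    exit 1
fi
echo "$filename loaded"

for ((shift=1; shift<=25; shift++)); do
    matched_words=0
    decoded_text=""
    decoded_text_with_punctuation=""
    echo "Shift $shift"
    # Build the uppercase and lowercase translation patterns for this shift.
    pattern_U=$(bash "$shift_pattern_script" -k "$shift" -u)
    pattern_L=$(bash "$shift_pattern_script" -k "$shift")
    while IFS= read -r line; do
        decoded_line1=$(echo "$line" | tr '[A-Z]' "$pattern_U")
        decoded_line2=$(echo "$decoded_line1" | tr '[a-z]' "$pattern_L")
        decoded_line3=$(echo "$decoded_line2" | tr -d '[:punct:]')
        # Keep every decoded line, not only the last one read.
        decoded_text+="$decoded_line3"$'\n'
        decoded_text_with_punctuation+="$decoded_line2"$'\n'
        # Count the decoded words that appear in the dictionary.
        for word in $decoded_line3; do
            if grep -q -w "$word" <<< "$dictionary"; then
                ((matched_words++))
            fi
        done
    done < "$filename"
    echo "matched_words found: $matched_words"
    # Remember the shift that produced the most dictionary words so far.
    if ((matched_words > max_word_count)); then
        max_word_count=$matched_words
        best_shift=$shift
        best_matched_words=$decoded_text
        best_matched_words_with_punctuation=$decoded_text_with_punctuation
    fi
done

echo "Best shift: $best_shift"
echo "Matched words: $max_word_count"
echo "Decoded words:"
printf '%s' "$best_matched_words_with_punctuation"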
Detailed explanation:
- The load_dictionary function loads the dictionary file. In this script, the dictionary file path is set to /usr/share/dict/cracklib-small. If the file exists, the function reads its contents using the cat command. If the file doesn’t exist, it prints an error message and exits with a non-zero exit code.
- The dictionary variable is assigned the contents of the dictionary file by calling the load_dictionary function. This variable is used to check whether words in the decoded lines exist in the dictionary.
- The script checks that exactly one command-line argument (the filename) is provided. If the argument count is not equal to 1, it prints an error message and exits with a non-zero exit code.
- The filename provided as the command-line argument is assigned to the filename variable. The script then checks whether the file exists. If it doesn’t, the script prints an error message and exits with a non-zero exit code.
- A loop iterates through the possible shift values from 1 to 25. Each iteration represents a different shift value.
- Inside the loop, the matched_words variable is initialized to 0. It keeps track of the number of words matched for the current shift value.
- The script executes the shift_pattern_script (another script from an earlier blog post) with the -k option set to the current shift value ($shift) to generate the uppercase (with -u) and lowercase shift patterns. The patterns are stored in the pattern_U and pattern_L variables, respectively.
- The script reads the lines of the file specified by $filename in a while loop. For each line, it performs the following steps:
  - It decodes the line by replacing uppercase letters with the corresponding letters from the pattern_U variable using the tr command, resulting in decoded_line1.
  - It further decodes decoded_line1 by replacing lowercase letters with the corresponding letters from the pattern_L variable using the tr command, resulting in decoded_line2.
  - It removes punctuation from decoded_line2, resulting in decoded_line3.
  - It iterates over each word in decoded_line3 using a for loop.
  - For each word, it checks whether the word exists in the loaded dictionary by using grep with the -q (quiet) and -w (match whole word) options. If the word is found in the dictionary, it increments the matched_words counter.
- After processing all the lines in the file for the current shift value, the script prints the number of matched words found for that shift value.
- It compares the matched_words count with the previous maximum count (max_word_count). If the current count is greater, it updates max_word_count, best_shift, best_matched_words and best_matched_words_with_punctuation with the current values.
- The loop continues until all shift values from 1 to 25 are processed.
- Finally, the script prints the best shift value (best_shift), the number of matched words (max_word_count), and the decoded words (best_matched_words_with_punctuation).
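The decoding step relies on tr mapping each letter to its shifted counterpart. Since the shift_pattern script is not shown in this post, here is a minimal self-contained sketch of the same idea. The function name rotate_text is hypothetical and is not part of the original scripts; it builds the rotated alphabets itself instead of calling shift_pattern.

```shell
#!/bin/bash
# Hypothetical helper (not the shift_pattern script from the earlier post):
# rotate every ASCII letter on stdin forward by k positions, wrapping at Z/z.
rotate_text() {
    local k=$(( $1 % 26 ))
    local upper="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    local lower="abcdefghijklmnopqrstuvwxyz"
    # The second set is each alphabet rotated by k, so A maps to the letter k places later.
    tr "${upper}${lower}" "${upper:k}${upper:0:k}${lower:k}${lower:0:k}"
}

# The first two words of cipher.txt, decoded with the shift the script reports as best.
echo "Zjpluapzaz bzl" | rotate_text 19
# prints: Scientists use
```

Digits, spaces, and punctuation are not listed in the first set, so tr passes them through unchanged.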
cipher.txt
Zjpluapzaz bzl vizlychapvuz myvt aol nyvbuk, hpy, huk zwhjl, hsvun dpao jvtwbaly tvklsz, av tvupavy huk zabkf whza, wylzlua, huk mbabyl jspthal johunl. Jspthal khah yljvykz wyvcpkl lcpklujl vm jspthal johunl rlf pukpjhavyz, zbjo hz nsvihs shuk huk vjlhu altwlyhabyl pujylhzlz; ypzpun zlh slclsz; pjl svzz ha Lhyao’z wvslz huk pu tvbuahpu nshjplyz; mylxblujf huk zlclypaf johunlz pu leayltl dlhaoly zbjo hz obyypjhulz, olhadhclz, dpskmpylz, kyvbnoaz, msvvkz, huk wyljpwpahapvu; huk jsvbk huk clnlahapvu jvcly johunlz.
Result:
./dictionary_compare cipher.txt
dictionary loaded
cipher.txt loaded
Shift 1
matched_words found: 4
Shift 2
matched_words found: 0
Shift 3
matched_words found: 0
Shift 4
matched_words found: 0
Shift 5
matched_words found: 4
Shift 6
matched_words found: 2
Shift 7
matched_words found: 2
Shift 8
matched_words found: 5
Shift 9
matched_words found: 0
Shift 10
matched_words found: 0
Shift 11
matched_words found: 0
Shift 12
matched_words found: 1
Shift 13
matched_words found: 1
Shift 14
matched_words found: 0
Shift 15
matched_words found: 0
Shift 16
matched_words found: 0
Shift 17
matched_words found: 1
Shift 18
matched_words found: 1
Shift 19
matched_words found: 69
Shift 20
matched_words found: 3
Shift 21
matched_words found: 1
Shift 22
matched_words found: 1
Shift 23
matched_words found: 3
Shift 24
matched_words found: 0
Shift 25
matched_words found: 0
Best shift: 19
Matched words: 69
Decoded words:
Scientists use observations from the ground, air, and space, along with computer models, to monitor and study past, present, and future climate change. Climate data records provide evidence of climate change key indicators, such as global land and ocean temperature increases; rising sea levels; ice loss at Earth’s poles and in mountain glaciers; frequency and severity changes in extreme weather such as hurricanes, heatwaves, wildfires, droughts, floods, and precipitation; and cloud and vegetation cover changes.
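One possible refinement, offered as a sketch rather than as part of the original script: grep is case-sensitive by default, so capitalized words such as “Scientists” only count when the dictionary lists them in that exact case, and running grep once per word rescans the whole dictionary every time. The version below assumes bash 4 or later (for associative arrays and the ${var,,} lowercase expansion). The names known_words, load_words, and count_matches are hypothetical, and the four-word inline list stands in for the real dictionary file.

```shell
#!/bin/bash
# Sketch only: known_words, load_words and count_matches are illustrative names.
declare -A known_words

# Read one word per line from stdin and store it, lowercased, as a lookup key.
load_words() {
    local w
    while IFS= read -r w; do
        if [[ -n "$w" ]]; then
            known_words["${w,,}"]=1
        fi
    done
}

# Print how many whitespace-separated words of a single line are known,
# ignoring case and ASCII punctuation.
count_matches() {
    local text=$1 word count=0
    local -a words
    read -ra words <<< "$text"
    for word in "${words[@]}"; do
        word=${word//[[:punct:]]/}
        if [[ -n "$word" && -n "${known_words[${word,,}]:-}" ]]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Tiny stand-in word list so the example is self-contained.
load_words <<< $'climate\nchange\nscientists\nuse'
count_matches "Scientists use observations, Climate change."
# prints: 4
```

To try this with the real dictionary, the word list could be loaded with load_words < /usr/share/dict/cracklib-small. Counts would then differ from the results shown above, because capitalized words would also be matched.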