CryptoPals is a popular set of problems that can be used as an alternative way to learn about cryptography, how to implement and even how to break it. There are 8 sets of problems to get through and each set has approximately 8 challenges to solve.
From the website… ‘This is a different way to learn about crypto than taking a class or reading a book. We give you problems to solve. They’re derived from weaknesses in real-world systems and modern cryptographic constructions. We give you enough info to learn about the underlying crypto concepts yourself. When you’re finished, you’ll not only have learned a good deal about how cryptosystems are built, but you’ll also understand how they’re attacked.’
I’ve decided to have a go at solving these challenges and have recently completed the first set, so here are my solutions to CryptoPals Set 1 in Rust.
Solutions to Set 1 – Basics
Here are the solutions to the first 8 CryptoPals challenges which are a part of Set 1. Set 1 covers some basic foundational topics that are essential knowledge in order to implement and code cryptography. I’ve provided a brief explanation of each problem and the solution to give you some background and context. You can find all my solutions to these problems on Github here.
Challenge 1 – Convert hex to base64
The goal for the first challenge was to implement a solution to convert a hex encoded string to a base64 encoded string. The intention is that the hex string should first be converted to raw bytes and then the bytes converted to base64. This is important because generally cryptography algorithms operate on raw bytes and not strings. The hex and base64 string formats are mainly used for transfer or storage of data in a human readable format or for pretty printing.
It’s important to understand how each data type is stored in memory to better understand what it means to convert hex to bytes, bytes to base64 and vice versa. Strings in Rust are encoded in UTF-8 which is equivalent to ASCII for the alpha numeric characters. Using this format strings are encoded in a way where each character is represented by a designated number in memory. Converting the hex string to raw bytes means that we are decoding the hex characters and converting them to the number which the hex characters represent and then returning the number in binary format as a byte array. The reverse is to encode the binary number/data in the byte array as a hex string which represents the number.
See my full solution to this problem on Github here.
Challenge 2 – Fixed XOR
The goal of this challenge was to create a function to XOR two byte arrays together (of the same length) and return the result as another byte array. XOR is used everywhere in cryptography because of its security properties. Basically it can be used to encrypt data is a secure way as long as the key material is never used more than once. This form of encryption is called the one time pad.
In the code example I first convert the hex strings to bytes and then run the two input byte arrays through the XOR function which builds a new byte array by doing a bitwise XOR on each byte of two input arrays and returning the new array.
See my solution to this problem on Github here.
Challenge 3 – Single-byte XOR cipher
In this challenge we are given a ciphertext in hex and are told that it has been encrypted using XOR with a single character. In other words the key used is just a single character repeating. The challenge suggests looking at the frequency of characters used in the English language. It’s also worth noting that because the key is a single character then we know that the key space is 256 (2^8) because a character uses 8 bits in memory (assuming ASCII).
The approach used in my solution loops through each of the 256 possible keys, decrypts the ciphertext with the given key and then assesses the probability that the result is English text. We can use XOR to decrypt a ciphertext that was encrypted with XOR so we use the solution from Challenge 2 for this part. Each candidate plaintext is scored by comparing the expected frequency of the most common English characters to the actual frequency in the plaintext and a score is calculated by summing the difference of the actual minus the expect frequency for each letter. The lower the score the better and the plaintext candidate with the lowest score is returned and selected as the result which turns out to be the correct plaintext input. We also print the repeating key that was used to encrypt and decrypt along with the plaintext.
See my solution to this problem on Github here.
Challenge 4 – Detect single-character XOR
This problem is similar to Challenge 3 except that we are given a list of ciphertexts rather than just one and we need to determine which of the ciphertexts was encrypted with the single character repeating key XOR.
To solve this we load in the hex strings as a list and the map each to raw bytes using the hex to bytes function from Challenge 1. Then for each ciphertext we run the code from Challenge 3 which returns a score for that ciphertext and the best candidate key and plaintext. We then find the lowest score over all the ciphertexts and return it as the solution along with the key and the decrypted plaintext.
See my solution to this problem on Github here.
Challenge 5 – Implement repeating-key XOR
The goal of this challenge is to encrypt a plaintext using repeating key XOR. We are given a plaintext and told to encrypt it using a given key that repeats.
In my solution I have created a function that creates a repeating key which can then be used to XOR against the plaintext. The function takes the input key which needs to repeat and an output array which is filled with the key multiple times up to the length of the output array. After generating the repeating key using the function I then XOR the input plaintext given in the problem with the key to produce the ciphertext.
See my solution to this problem on Github here.
Challenge 6 – Break repeating-key XOR
The goal of this challenge is to break repeating key XOR where we don’t know the length of the key used. We are given a ciphertext and told to decrypt it. The steps needed to get closer to solving the problem are given in the question and these are summarised below.
We start by creating a function that can calculate the edit distance (also known as hamming distance) between two byte arrays which is basically the number of differing bits. To implement this function we XOR the two input arrays together and then count the bits set to one in the resulting array.
Next we calculate a normalised edit distance for each possible key size (between 2 to 40 bytes). The normalised edit distance is calculated by dividing the edit distance by the key size. I eventually used first 4 key size slices of the ciphertext as input into this calculation and averaged the normalised edit distances to get a more actuate result. The key size that gives the lowest average normalised edit distance is the key size we use with the rest of the solution.
Now we use the selected key size to divide the ciphertext into blocks with length equal to the key size. Then we transpose the blocks by putting the first byte of each block into an array and then the second byte of each block into another array and so on. Then we use the solution from Challenge 3 to find the best candidate key for each of the transposed blocks. The idea here is that because the key is repeating, each of the bytes of the transposed blocks will have been encrypted with the same key.
After finding the key for each transposed block we concatenate each (partial key) together to produce the full key. We then use the repeating key function from Challenge 5 to build the full repeating key and then XOR it against the ciphertext to get the plaintext and solve the challenge.
See my solution to this problem on Github here.
Challenge 7 – AES in ECB mode
In this challenge we are asked to decrypt a given ciphertext that has been encrypted with AES ECB. The encryption key used is provided in the question and it is suggested to use OpenSSL in the solution.
For my solution I simply imported the Rust bindings to OpenSSL and then used the openssl::symm::decrypt
function passing in the Cipher::aes_128_ecb()
cipher to decrypt the ciphertext. The plaintext bytes are then converted to a string and printed.
See my solution to this problem on Github here.
Challenge 8 – Detect AES in ECB mode
For this problem we are simply asked to detect which ciphertext in a list has been encrypted with AES ECB. Note that we don’t need to actually decrypt it. We are given the hint that AES ECB is stateless and deterministic so we know that a given plaintext input will always produce the same ciphertext. We also know that AES uses a block size of 16 bytes for encryption and decryption and so we can use this fact to detect ECB by looking for repeating blocks of ciphertext given that it is statistically likely for blocks of the ciphertext to repeat given that the ciphertext is long enough.
My solution to this problem works by looping through each ciphertext and breaking them into blocks of 16 bytes. We then insert each block into a set and stop when we detect the first duplicate block. We use a set type in Rust that returns false when inserting an element into the set that already exists. We then print the line number and the cipher text when the duplicate block is detected. It turns out that this method correctly identifies the ciphertext which was encrypted with AES ECB.
See my solution to this problem on Github here.
Conclusion
So there you have it, that covers the solutions to CryptoPals Set 1 with explanations. These problems cover some foundational cryptography concepts. Most of the questions are straight forward to solve but still are a valuable exercise to ground your knowledge in the basics of coding cryptography.