Reconstructing the Cryptanalytic Attack behind the Flame Malware Maximilian Johannes Fillinger Abstract: Flame was an advanced malware, used for espionage, which infected computers running a Microsoft Windows operating system. Once a computer in a local network was infected, Flame could spread to the other computers in the network via Windows Update, disguised as a security patch from Microsoft. Windows Update relies on digital signatures to ensure that updates originate from Microsoft. Using an attack on the cryptographic hash function MD5, the attackers forged a Microsoft signature for a certificate which allowed them to produce signed updates accepted by Windows Update. Using a technique dubbed counter-cryptanalysis, it was found that the certificate was generated by a chosen-prefix collision attack on MD5, i.e., an attack that extends two prefixes P and P with suffixes S and S such that P S and P S have the same hash value: a collision. Although the attack seems to be based on the same principles as published collision attacks, such as the attack by Wang et al. and the attack by Stevens et al., it has not been published before. The hash function MD5 splits its input into 512-bit blocks. MD5 processes these blocks with a so-called compression function while updating an intermediate hash value. The intermediate hash value after the whole input has been processed is the output of MD5. The Flame chosen-prefix collision attack, like other attacks of this type, consists of the following steps: It begins with a so-called Birthday Search which extends the chosen prefixes such that the difference in the intermediate hash values has a specific form. Then, a series of near-collision blocks is constructed such that after these blocks have been processed, the intermediate hash value difference is 0. Thus, a collision is achieved. The construction of these blocks is based on differential paths, systems of equations that specify ways in which input differences in the compression function can propagate. In chosen-prefix attacks, it is necessary to algorithmically generate such differential paths. The goal of this thesis is to work towards a reconstruction of the collision attack that generated the Flame authors~ certificate, based on the differential paths of the near-collision blocks in the certificate, which were extracted by Stevens. Our main contribution is an attempt at reconstructing the possible end-segments of the differential paths used in the attack. These end-segments allow us to infer the cost of constructing the near-collision blocks. Also, we can infer what intermediate hash value differences the attackers looked for in the Birthday Search. This allows us to give an estimate of the Birthday Search cost as well. We prove that, assuming our reconstruction is correct, the expected cost of the attack was equivalent to at least 246.6 calls to the MD5-compression function. We also show that, using the right parameters, this cost can be achieved. In comparison, the attack by Stevens et al. has an expected cost of 244.55 for suitably chosen parameters when we require that, as in the Flame attack, only four near-collision blocks can be used. We argue that it is very likely that the expected cost of the Flame attack exceeds our lower bound. A likely reason for this higher cost is that the attackers optimized the attack for speed on massively parallel architectures, not for theoretical cost. The Birthday Search is more cost-effective to parallelize than the construction of the near-collision blocks and it is possible to decrease the cost of constructing near-collision blocks by increasing the Birthday Search cost. Furthermore, we analyzed the connection step of the Flame attack: The most expensive part of the differential path construction algorithm is the connection of an upper and a lower differential path. Not all pairs of upper and lower paths can be connected. Compared to the attack by Stevens et al., the number of pairs that have to be examined increases by a factor of 213. So-called tunnels are an important technique for increasing the speed of the near-collision block construction algorithm. The speed-up that a tunnel offers is determined by its strength. As Stevens shows, tunnels were used in the collision attack, but their strengths were not maximized. As an explanation, he proposes that the attackers used tunnels when they were available, but did not actively try to increase their strength. We show that this explanation does not account for the observed tunnel strengths and offer an alternative hypothesis. Unfortunately, this hypothesis is less straightforward and impossible to test conclusively with only one output sample of the attack.