Modular Design of Secure yet Practical Cryptographic Protocols

Ronald Cramer

Abstract

Historically, cryptology was primarily concerned with devising methods of secret writing (cryptography) and analyzing their strength (cryptanalysis). After WWII, largely under the influence of Claude Shannon's information theory, cryptology became a science founded on mathematical principles. A second major breakthrough occurred in 1976 with the publication of the article New Directions in Cryptography by Whitfield Diffie and Martin Hellman, in which they formulated the idea of public key cryptography. Reflecting on public key cryptography as it has since developed, Oded Goldreich described the field as follows: "Cryptography can be considered the science that deals with limiting the effects that can be brought about by dishonest or misbehaving parties." If a comparison with legal practice suggests itself to the reader, I point out that cryptography aims to make the successful deployment of such activities impossible or unfeasible in advance. Strictly speaking, the terms "cryptography," "cryptanalysis," and "cryptology" still refer to design, the study of security, and the combination of the two, respectively (albeit now in a much broader context), but for convenience I will use "cryptography" here as the collective term.

Important concepts in cryptography are authenticity and secrecy of information. The latter term has a very broad meaning here: data of any substantive nature, such as data that can be sent and received over a computer network, counts as "information." In this discussion, a computer network is conceived as a collection of computers or "processors" that can function individually. Communication, that is, the exchange of information, is made possible by a network of connections between these processors. Let us assume that this network is complete, meaning that a connection exists between every pair of processors.

Suppose a certain processor A receives a "message" m, i.e., a packet of information sent over the connection between processor A and another processor B. Processor A considers m authentic if A can determine with certainty that m indeed originates from B. If the connections between processors are not secured in any way, it is incorrect to establish authenticity on the basis of the "physical" origin of the message, namely the connection between A and B. After all, the message could also originate from a potentially malicious party C who has gained access to the unsecured connection between A and B and inserted a message there as if it originated from B. Let us now assume, regardless of whether A has established the authenticity of m, that B indeed sent the message, and did so as "clear text." Then we must assume that party C could have intercepted the message. Secrecy of m, meaning that the content of the message is known only to the sender and the receiver, can then a priori not be guaranteed.

Public key cryptography, the principles of which were formulated in 1976 by Diffie and Hellman, offers an elegant mathematical solution to the problems of authenticity and secrecy. Mathematical methods to guarantee authenticity and secrecy already existed before then, essentially based on bilateral "secret agreements" between every pair of parties deemed to require secure mutual communication. I will first briefly explain these. A practical example of such an encryption method, still in use today, that enables secret communication is the Data Encryption Standard (DES), developed by IBM in the early seventies.
It consists of a publicly known encryption and decryption algorithm (an algorithm is a calculation instruction). Every pair of parties in the network together chooses a random (arbitrary) secret key. To do this, they must have a secure channel at their disposal for the duration of the establishment of these secret, shared keys, for example a reliable courier, sealed envelopes, or otherwise physically secured channels. Such a secret key represents so much random information that a third party guessing it correctly has practically no chance of success. Due to the size of the key, it is also practically impossible for the same key to be chosen by more than one pair of parties. A party A encrypts a clear text message m intended for B by evaluating the DES function on m and their shared secret key k(A, B). Having received the encryption c, B applies the DES function to k(A, B) and c to obtain the clear text m. The security of this method rests on the size of the keys and on the fact that the DES algorithm is designed so that, although encryption and decryption can be performed quickly given the key, "cracking" encryptions without knowledge of the secret key used to encrypt the underlying clear text is practically unfeasible. Message authentication codes, a solution to the authenticity problem, can also be based on DES and other such shared key cryptographic methods.

Diffie and Hellman's public key cryptography originates from their attempts to design cryptographic systems that make secret agreements between every pair of parties (as required in shared key cryptography) redundant: not only as a saving on the amount of work needed to set up the system, but especially because the phase in which these agreements are made can carry great security risks. After all, in that phase the parties do not yet have a reliable cryptographic method at their disposal. The key to the solution is provided by so-called trapdoor one-way functions: functions that are easy to evaluate on every argument in their domain, whereas calculating the inverse of these functions on an arbitrary value in their range is unfeasible unless one knows the trapdoor, i.e., the secret key. Provided that sufficiently many such trapdoor one-way functions exist, in Diffie and Hellman's vision each party individually chooses an arbitrary function with a corresponding trapdoor. Say party A has chosen the function f_A with s_A as the trapdoor. Then A places a description of f_A, A's public key, in a publicly accessible file that is maintained in a reliable manner, and A keeps s_A as the secret key. When B wants to send a secret message m to A, B looks up A's public key in the file, calculates the encryption c as f_A(m), and sends c over the (unsecured) connection between A and B. The properties of the functions used then guarantee that only A can decrypt the message c, by calculating m = f_A^{-1}(c) using the secret key s_A. So much for the secrecy problem.

Another very important contribution of Diffie and Hellman is the digital signature: the digital equivalent of the handwritten signature. A signature must meet two requirements: the signature accompanying a message must guarantee the authenticity of the message, and furthermore this authenticity must be verifiable by every party. Nowadays, in everyday pen-and-paper practice, the last requirement is not always met in the strictest sense. Digital signatures, on the other hand, demonstrably satisfy both requirements.
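To make the public key principle concrete, here is a minimal sketch that uses textbook RSA (the realization introduced further on in this summary) as the trapdoor one-way function f_A. The primes are toy-sized, the public exponent and parameter names are purely illustrative, and there is no padding or hashing, so the sketch is not a secure implementation; it also previews the signing principle discussed next.

```python
# A minimal, purely illustrative sketch of the public key principle, with
# textbook RSA (introduced below) as the trapdoor one-way function f_A.
# Toy primes, no padding or hashing: NOT secure as written.

def keygen():
    p, q = 1000003, 1000033            # A's secret primes; real keys use ~100-digit primes
    N = p * q                          # their product N is made public
    e = 65537                          # public exponent: f_A(m) = m^e mod N
    d = pow(e, -1, (p - 1) * (q - 1))  # the trapdoor s_A: the inverse exponent
    return (N, e), d                   # (public key "f_A", secret key "s_A")

def encrypt(pub, m):                   # anyone can evaluate f_A on a message m < N
    N, e = pub
    return pow(m, e, N)

def decrypt(pub, d, c):                # only A, who knows the trapdoor, can invert it
    N, _ = pub
    return pow(c, d, N)

def sign(pub, d, m):                   # signing uses the trapdoor: sigma = f_A^{-1}(m)
    N, _ = pub
    return pow(m, d, N)

def verify(pub, m, sigma):             # anyone can check a signature with the public key
    N, e = pub
    return pow(sigma, e, N) == m

pub_A, sec_A = keygen()
c = encrypt(pub_A, 42)                 # B looks up A's public key and sends c to A
assert decrypt(pub_A, sec_A, c) == 42
assert verify(pub_A, 42, sign(pub_A, sec_A, 42))
```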
A digital signature system also works according to the public key principle. Let f be a function that is unfeasible to evaluate on any argument from its domain unless one knows a secret trapdoor s. Let the signature on a message m be the value \sigma = f(s, m), and let g be a function that determines, for given \sigma, m, and f, whether \sigma is indeed the f-image of m. The verification of a signature then consists of applying g to \sigma, m, and f. Given a large set of such functions f and g, every party A chooses an arbitrary pair (f, g) with the associated trapdoor s. A's public key then consists of f and g, while the trapdoor s forms the secret key. Today, digital signatures play a crucial role in practice, particularly in those electronic information systems where the integrity of information is of great importance.

After Ron Rivest, Adi Shamir, and Leonard Adleman provided a realization of trapdoor one-way functions (the "RSA functions"), based on the computational difficulty of factoring composite integers with only large prime factors, cryptographic research took off tremendously. Without going into technical details here, the crux of their idea is that it is relatively easy to select two arbitrary, distinct prime numbers of, say, 100 decimal digits each and to calculate their product, whereas, given only that product, it is unfeasible to calculate the prime factors.

In public key cryptography, the degree of security of a system is derived from the computational difficulty of number-theoretic problems such as factoring and computing discrete logarithms, or from the existence of functions with properties relevant to cryptography. A security analysis must demonstrate that if the cryptographic system could be effectively cracked, one would also have an effective method for solving those very difficult number-theoretic problems. The current state of affairs is that, if the problem instances are chosen large enough (for example, factoring arbitrary composite numbers of, say, more than 200 decimal digits is currently unfeasible), every existing mathematical algorithm takes an excessive amount of time to solve them. Moreover, it appears that these problems are intrinsically difficult.

In addition to research into public key encryption/decryption and public key digital signatures, for which the theoretical foundations were laid in the eighties, (theoretical) cryptography developed in breadth with research into various classes of one-way functions (functions that are easy to compute but unfeasible to invert), pseudo-randomness (functions for which it is unfeasible to distinguish the produced output from truly random output, even though the size of the input is only a fraction of the size of the output), anonymity in electronic transaction systems (for example, anonymity of the payer in a digital payment system), and so on.

Other fundamental research areas within cryptography are secure multi-party computation and zero knowledge protocols. I will now explain these last two concepts further, as they play a major role in this thesis alongside digital signatures. Loosely speaking, in a multi-party computation a number of parties jointly compute a given function (each party contributing its own input) in such a way that the outcome becomes known to everyone, the individual input of each party remains hidden from the other parties, and the method of computation is resistant to participating parties who "cheat." A toy sketch of this idea is given below; an important example follows after it.
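The sketch below illustrates only the basic multi-party idea, under strong simplifying assumptions: three honest-but-curious parties jointly compute the sum of their private inputs via additive secret sharing. It is not one of the protocols of this thesis, and a real protocol must in addition withstand cheating parties.

```python
# Toy illustration of the multi-party idea: three parties jointly compute the
# sum of their private inputs via additive secret sharing modulo a public prime.
# (Purely illustrative; a real protocol must also withstand cheating parties.)
import random

P = 2_147_483_647                       # public prime modulus

def share(secret, n):
    """Split `secret` into n random shares that sum to it modulo P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

inputs = {"A": 1, "B": 0, "C": 1}       # private inputs, e.g. yes/no votes
n = len(inputs)

# Each party splits its input into n shares and hands share i to party i.
dealt = {name: share(v, n) for name, v in inputs.items()}

# Party i adds up the i-th shares it received; each such sum looks random ...
partial = [sum(dealt[name][i] for name in inputs) % P for i in range(n)]

# ... but together the partial sums reveal exactly the total, and nothing more.
assert sum(partial) % P == sum(inputs.values())
```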
An important example of such a multi-party computation is an electronic election. In such a system, for example, all eligible Dutch citizens could fill out an electronic "ballot" in the House of Representatives elections and send it to the authorities, but in such a way that individual votes remain hidden from the authorities. Only the total election result can be published, and it can even be verified for correctness by everyone. Moreover, due to the cryptographic method used, falsifying the election result can be made unfeasible or impossible.

A zero knowledge protocol is an interactive question-and-answer game between a "prover" (a party who proves) and a "verifier" (a party who verifies). The prover claims that a certain mathematical theorem is valid but does not wish to show the proof of that theorem to the verifier, perhaps because the prover is afraid that the verifier will take the credit. In any case, the prover wants to convince the verifier that the theorem is correct, but wants to do so in such a way that the verifier obtains no information that would allow the verifier, in turn, to convince a third party of the truth of the theorem. And of course, the verifier does not want to be convinced of a theorem that is incorrect. Zero knowledge protocols offer the possibility of convincing a skeptical verifier without the verifier obtaining any information about the proof.

These zero knowledge protocols play an important role in complex cryptographic systems, especially where one party must convince another party that, for example, a certain given value was computed in a pre-arranged way, without revealing the precise values underlying the computation. Zero knowledge, along with the related notion of witness hiding, also finds direct practical application in access control for, for example, computer networks. (Note that an approach based simply on passwords does not hold water when access control must be established over an insecure line, and that the procedure followed must in any case be interactive to eliminate the possibility of replay attacks.)

This thesis attempts to bring theoretical cryptography and cryptographic practice closer together. In theory, very elegant solutions are available for the fundamental cryptographic questions mentioned above. These solutions have the property that their security can be reduced, by mathematical proof, to the security of underlying cryptographic primitives, such as certain types of one-way functions or specific, highly complex number-theoretic problems. A disadvantage, however, is that the resulting methods often require too much work (computation, time, storage, or communication) to be practical. On the other hand, methods directly tailored to practice (and thus convenient) more than once lack clarity regarding their security: the mathematical reductions that must show that a given practical system is secure if, for example, factoring or computing discrete logarithms are indeed fundamentally difficult problems are often omitted, simply because it is completely unclear in those cases how such a reduction should proceed.

In order to eliminate this discrepancy in a number of important cases, so-called \Sigma-protocols are introduced in Chapter 2: Merlin-Arthur "games" with three moves, in which the verifier is only expected to send random coin flips ("random numbers"). Typical cryptographic properties required of these will be (a certain form of) zero knowledge and collision-intractability.
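A standard example of a protocol with exactly this three-move shape is Schnorr's proof of knowledge of a discrete logarithm, sketched below with toy parameters (real instances use primes of hundreds of digits); the parameter values are chosen here only for readability.

```python
# Schnorr's proof of knowledge of a discrete logarithm: a standard example of a
# three-move protocol in which the verifier only contributes a random challenge.
# Toy parameters; real instances use primes of hundreds of digits.
import random

p, q, g = 2039, 1019, 4        # g generates the subgroup of prime order q mod p
x = random.randrange(1, q)     # the prover's secret (the "witness")
y = pow(g, x, p)               # the corresponding public value: y = g^x mod p

# Move 1 (prover): commit to a random exponent.
r = random.randrange(q)
a = pow(g, r, p)

# Move 2 (verifier): send random coin flips as the challenge.
e = random.randrange(q)

# Move 3 (prover): the answer ties witness, challenge, and commitment together.
z = (r + e * x) % q

# Verification: a single equation over public values only.
assert pow(g, z, p) == (a * pow(y, e, p)) % p
```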
This last property means that, for certain functions derived (under certain conditions) from the \Sigma-protocols, it must be unfeasible to compute two different input values with the same image. \Sigma-protocols will form the building blocks for the results in later chapters, but not before a theory has been developed for them that shows how given building blocks can be combined ("stacked") in a practical and reliable way, and how these abstract building blocks can be realized in practice.

Based on \Sigma-protocols, partial proofs are first introduced in Chapter 3. As an example: with a partial proof, a prover can convince a verifier that of, say, 100 given theorems, at least 50 are true, without giving the verifier any information as to which 50 are true. Much more generally, it is possible to transform zero knowledge, collision-intractable \Sigma-protocols concerning one theorem (randomly chosen from a certain set of theorems) into \Sigma-protocols (with the same cryptographic properties) concerning n arbitrary theorems on which an arbitrary monotone condition (i.e., a rule that can be expressed in terms of logical AND and OR operators) has been placed. If a family of such (polynomially bounded) conditions is given, the transformation still goes through, but under the assumption that a certain type of efficient secret sharing scheme is available. If the problem instances (the "theorems") to which the \Sigma-protocols relate are chosen in the right way from a collection of "very difficult problems," then the resulting scheme is witness hiding: a malicious verifier cannot extract the proofs from the prover.

There is a subtle difference between zero knowledge and witness hiding: in the latter case, the verifier cannot extract information that would later help this verifier act as the prover, while in the former case the verifier learns no information at all (beyond the correctness of the theorem). In many cryptographic protocols, however, witness hiding suffices as a security guarantee. Moreover, witness hiding protocols are generally more efficient than zero knowledge protocols.

These partial proofs then form the basis of a zero knowledge proof system for the circuit satisfiability problem. The result achieves the lowest known communication complexity for this problem, namely a number of bit commitments that is linear in the size of the circuit in question. (A bit commitment is computed by a function whose image, the commitment, hides the underlying original, while it is unfeasible to find two originals that are mapped to the same image by this function.) Previous solutions required at least a logarithmic factor (in the size of the circuit) more communication.

Since circuit satisfiability is an NP-complete problem (in an NP problem, one must decide for a given set L and a given element z whether that element is in the set, while it is guaranteed that, if z \in L, a short proof of this exists), this method yields a communication-efficient zero knowledge proof system for all "mathematical" theorems that can be phrased as an NP statement. NP statements include, for example, all theorems of the type "Let y be a given value and f an efficiently computable function. There exists an x such that f(x) = y." In complex cryptographic systems, it often happens that a zero knowledge proof of such a theorem must be given (where f then represents a one-way function).
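The Schnorr protocol sketched above proves a statement of precisely this form, with f(x) = g^x mod p. To give the flavor of a partial proof, the simplified sketch below treats the smallest case, 1 out of 2: the prover shows that it knows the secret behind y1 or behind y2 without revealing which, by simulating the branch it cannot answer and letting the verifier's single challenge bind the two sub-challenges together. This is an illustration of the idea only, not the general construction of Chapter 3.

```python
# Simplest partial proof, 1 out of 2: the prover knows the secret behind y1 or
# behind y2 and convinces the verifier of this without revealing which one.
# Simplified sketch built on the Schnorr protocol above (same toy parameters).
import random

p, q, g = 2039, 1019, 4
x1 = random.randrange(1, q)                   # witness for y1 is known ...
y1 = pow(g, x1, p)
y2 = pow(g, random.randrange(1, q), p)        # ... witness for y2 is not

# Prover, move 1: a real commitment for the y1-branch, and a simulated
# transcript for the y2-branch (its challenge and response are chosen first).
r1 = random.randrange(q)
a1 = pow(g, r1, p)
e2, z2 = random.randrange(q), random.randrange(q)
a2 = (pow(g, z2, p) * pow(pow(y2, e2, p), -1, p)) % p

# Verifier, move 2: a single random challenge s for the combined statement.
s = random.randrange(q)

# Prover, move 3: the sub-challenges must add up to s, so only one of them
# (the simulated one) could be chosen freely in advance.
e1 = (s - e2) % q
z1 = (r1 + e1 * x1) % q

# Verifier: both Schnorr equations must hold and the challenges must split s.
assert (e1 + e2) % q == s
assert pow(g, z1, p) == (a1 * pow(y1, e1, p)) % p
assert pow(g, z2, p) == (a2 * pow(y2, e2, p)) % p
```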
In practical situations, the efficient method presented in this thesis can yield significant savings on the amount of communication required.

Further results in Chapter 3 concern efficient zero knowledge proofs for monotone closures of languages (sets of bitstrings) that admit a membership proof (a proof that a given element is in the set) of the \Sigma-protocol type. Building on some of the previous results, Chapter 3 concludes with the presentation of a class of digital identification schemes resistant to so-called adaptive impersonation attacks, the most powerful attack a malicious party can mount on an identification scheme (within the polynomial-time computation model used in public key cryptography), and of the most efficient electronic voting system to date.

Chapter 4 is exclusively about digital signatures. \Sigma-protocols are used here to develop digital signatures resistant to existential forgery under adaptively chosen message attacks, the highest security level for digital signatures (again within the standard polynomial-time computation model). First, signature schemes are developed that have the same communication complexity as previous methods with relatively low communication complexity, but under weaker "intractability" hypotheses (assumptions regarding the difficulty of certain computational problems). The thesis concludes with a digital signature system based on RSA. Such systems already existed, but the distinguishing feature of this scheme is that it is the first that is both provably secure (in the strict sense of resistance to adaptively chosen message attacks, under the assumption that the RSA functions are secure) and practical enough to run on, for example, a smart card (a plastic card with an "intelligent" chip on it).
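For illustration only: the basic route from a three-move protocol to a digital signature can be sketched with the well-known Fiat-Shamir idea applied to Schnorr's protocol, in which the verifier's random challenge is replaced by a hash of the commitment and the message. This generic textbook construction is not the scheme analyzed in Chapter 4; it merely shows how removing the interaction from a \Sigma-protocol yields a signature.

```python
# The Fiat-Shamir idea applied to Schnorr's protocol: the verifier's random
# challenge is replaced by a hash of the commitment and the message, turning
# the three-move protocol into a (textbook, random-oracle style) signature.
# This generic construction is NOT the scheme analyzed in Chapter 4.
import hashlib
import random

p, q, g = 2039, 1019, 4                # the same toy parameters as above
x = random.randrange(1, q)             # signer's secret key
y = pow(g, x, p)                       # signer's public key

def H(*parts):
    data = "|".join(str(v) for v in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(msg):
    r = random.randrange(q)
    a = pow(g, r, p)                   # move 1, exactly as in the interactive protocol
    e = H(a, msg)                      # move 2, now computed by the signer itself
    z = (r + e * x) % q                # move 3
    return a, z

def verify(msg, sig):
    a, z = sig
    return pow(g, z, p) == (a * pow(y, H(a, msg), p)) % p

assert verify("an example message", sign("an example message"))
```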