Advanced cryptography: Symbolic methods for security analysis

The Dolev-Yao model

[DY] Dolev, Yao, On the security of public key protocols, IEEE TIT 29(2):198-208 (1983)

The need for a formal for security protocol analysis is apparent in the Needham and Schroeder paper, and other papers of the time proposing cryptographic protocols to achieve various security tasks. In [DY] Dolev and Yao introduced such a formal model. The model of [DY] is very constrained, and does not allow to describe many interesting protocols, but it is nevertheless a seminal contribution for the following reasons:

It is the first paper providing a formal model.
The restrictions are mostly on the honest protocol partecipants and security goal. The adversarial model is still quite general. The implication is that, provided the protocol and security goal fits their model, one can use the model to prove strong security guarantees against general (symbolic) adversaries.
They show that by restricting the class of target protocols one can obtain interesting algorithmic results, i.e., the security of a protocol becomes decidable in polynomial time and it can be automated.

Before describing the model of Dolev and Yao, let us pin point some of the important features of what we mean by formal model. A properly defined formal model for security analysis should include the following:

A precise langauge for the description of protocols.
A formal execution model that describes how the protocol is run, possibly in the presence of an adversary. This includes a description of the adversary's capabilities, e.g., the capability of starting the execution of arbitrary instances of the protocol, either among honest players, or between honest players and the adversary.
A formal language for specifying desired security protocols.

Notice that all these features are important to unambiguously interpret questions like: is this protocol secure or not? Without a precise description of the protocol, how the protocol is executed, and what constitute a security violation, the question is ambiguous. Notice that computational cryptography has similar features, but as a result of the complexity of the computational model, cryptographers typically (or, at best) can give only an informal description of a formal execution model, ambiguous at times, and certainly not suitable for automatic analysis. If you want to give the protocol and security goal as input to a computer program, then it is of primary importance to have an appropriate language for the description of both.

Now let us get to the model of Dolev and Yao. The protocols and security properties are not as general as those considered in [NS], but a clear execution model and security property specification language is given. Moreover, the protocol model is sufficiently general to capture interesting examples, like the public key protocol of [NS] (but not the security property of interest in [NS]). The focus in [DY] is on two party protocols and secrecy properties. Here are some salient features of the Dolev-Yao model:

Secrecy properties: Alice is given a message M as input, and starts exchanging messages with Bob, with the goal of sending message M to Bob in a secret way. The security property is that an adversary cannot recover M, even if actively interfering with the protocol. No other security properties are considered.
Stateless parties: The main limitation on the honest parties is that they are stateless: the messages transmitted by a party at every step of the protocol are a function of theier initial knowledge and the message they just received. In particular, parties cannot use information collected from previous messages. For this reason, these protocols have been named "ping-pong" protocols. We observe that the stateless restriction is only put on the honest parties, and the adversary can mainain state, record communications, and store values that are subsequently used in the construction of messages. The stateless restriction is not too bad for the honest parties and corresponds to many practical situations and real protocols: e.g., a server that uses cookies to mainain the state of a user session without the need to store state information for every client.
Concurrent execution: The adversary can start an arbitrary number of protocol executions, involving different sets of parties, where each player can partecipate in several concurrent executions. In this respect, the model considered here is more general than the computational model considered at the time, which focused on single protocol execution. The computational cryptography community started addressing the important issue of concurrency only in the 90's.
Public key cryptography and infrastructure: It is assumed that a public table (X,Ex) containing the name and public key of every user is publicly available. The initial knowledge of each user consists of this table, plus the user secret decryption key Dx.

Examples

Before describing the actual model, we consider a few simple example protocols.

Example 1: Alice sends and encrypted message to Bob, and waits for an echo message sent back in acknowledgment.

A -> B: {M}_B
B -> A: {M}_A

Above we are using the following notational conventions for the description of protocols. The protocol is specified as a sequence of message exchanged during a normal execution. Each message includes the name of a sender and recepient, but these can be altered/spoofed by the adversary, i.e., the adversary can intercept messages before they reach their intended destination, it can modify and reroute them, possibly with invalid sender fields. The protocol informally specifies two programs, executed by the two players.

The above protocol is insecure because it is subject to the following attack:

The adversary Z starts a protocol execution between A and B with unknown message M, i.e., A wants to send M to B. B does not know about A and he is ready to respond to any protocol message.
Z intercepts the first message {M}_B from A to B.
Z sends this message to B. Z -> B: {M}_B
Following the protocol, B replies with B -> Z: {M}_Z
Z recovers M, violating the secrecy property
Z can even reencrypt M (Z -> A: {M}_A) in such a way that A does not notice the protocol attack.

Example 2: To avoid the above attack, we include the sender identity in the encryption. Remember that encryption is binding.

A -> B: {M;A}_B
B -> A: {M;B}_A

Is the protocol secure now?

Example 3: Consider the following even more secure version of the protocol that double encrypt the message.

A -> B: {{M}_B ; A}_B
B -> A: {{M}_A ; B}_A

Is the protocol secure? It turns out that this protocol is not secure as demostrated by the following attack:

Z starts a protocol execution between A and B with message M, and intercepts the last message {M'}_A where M' = {M}_A ; B.
Z starts another protocol betweeh Z and A with message M', using its knowledge of {M'}_A
1. Z -> A : {{M'}_A;Z}_A
2. A -> Z : {{M'}_Z; A}_Z
Now Z can decrypt and recover M' = {M}_A ; B. Dropping the last B, this gives {M}_A..
Z starts another interaction with A:
1. Z -> A : {{M}_A; Z}_A
2. A -> Z : {{M}_Z ; A}_Z
At this point, Z can decrypt and recover the original message M which was intented for B only

We observe that the above attack highlights some important points that were addressed by computational cryptography in the 90's: an active attacker can use honest players as "decryption boxes". Security requires form of non-malleability and resistance to chosen ciphertext attacks.

Formal model for describing protocols

The model considered in [DY] focuses on 2 party protocols, executed concurrently in a network with an arbitrary number of partecipants. Let us first describe a single protocol instance:

The protocol involves two parties: S (the sender) and R (the receiver)
S(M,R) takes an input message M, and an identity R of the party S wants to send the message M to.
The receiver is ready to engage in a protocol execution with any sender.
Each protocol step is modeled as a function mapping the last received message to a new message to be transmitted. These functions can be the composition of any number of basic functions chosen from a given set Fx of basic functions available to user X.

[DY] considers two kinds of protocols (corresponding to two sets of basic functions) called cascade protocols and namestamp protocols. The latter is a generalization of the first one, so we concentrate on namestamp protocols. The basic operations available to party X are:

Dx (decryption under X's secret key)
Ey (encryption under any user Y's public key)
iy (append identifier y to the message)
dy (delete identifier y from the end of the message). If input message does not end in y, then abort.
d (delete identifier at end of message)

A two party protocol is formally described as a sequence of strings f[1],f[2],...,f[k] where for any i, f[2i+1] is a string over the function symbols available to S, and f[2i] is a string over the function symbols available to R.

f[1] is the function applied by the sender to the input message M to determine the first message sent to R
f[2i] is the function applied by R to the ith received message to determine the next message to be transmitted to S
f[2i+1] is the function applied by S to the ith received message to determine the next message to be transmitted to R

S and R in the above description are two generic party names, and the protocol can be instantiated replacing S and R with any other pair of parties. Replacing S and R in f[i] with A and B is denoted f[i]{A,B}. The important property is that the protocol definition is symmetric, i.e., applying any permutation to the set of names still gives a valid protocol.

For any i, let F[i](M) = f[i](f[i-1](...f[2](f[1](M))...) be the composition of the first i functions. The sequence of message trasnmitted during the execution of protocol on input M are F[1](M),...F[k](M).

Strings of function symbols are interpreted modulo the following cancellation rules:

Dx Ex = epsilon
Ex Dx = epsilon
dx ix = epsilon
d ix = epsilon

where epsilon is the empty string, representing the identity function. This set of operations can be easily generalized, provided certain properties are satisfied. E.g., strings are taken to represent functions, and in particular, the set of cancellation rules should satisfy the property that if fw = gw for any string w, then f and g are the same function (symbol). The above rules satisfy these properties. An immediate consequence is that if f has both a left and right inverse l f = f r = id, then l = r and this inverse is unique.

Using this formalism the above example protocols are represented as follows:

Example 1: The protocol

S -> R: {M}_R
R -> S: {M}_S

is modeled by the sequence of strings:

Er
Es Dr

Example 2: The protocol

S -> R: {M;S}_R
R -> S: {M;R}_A

is modeled by the sequence of strings:

Er is
Es ir ds Dr

Example 3:The protocol

S -> R: {{M}_R ; S}_R
R -> S: {{M}_S ; R}_S

is modeled by the sequence of strings:

Er is Er
Es ir Es Dr ds Dr

Formal execution model

[DY] considers a model where an active attacker can interfere with the concurrent execution of an arbitrary number of protocol executions. Let U be a potentially infinite pool of user names. Some of the users in U are honest (H) and some are corrupted (C). The attacker can start an arbitrary number of protocol executions between parties in U, honest and disonest ones. The goal of the adversary is to recover the message M underlying a protocol execution between two honest parties A and B. The attacker is assumed to have total control of the network: in other words, the adversary IS the network. It is easy to see that under this execution model the adversary has access to the following functions:

f[i] where i>=1 and S,R are replace by any pair of distinct parties in U
Ex, ix, dx and d for any party X in U
Dx for any dishonest party X in C

Moreover, the adversary can obtain the value f[1]{A,B}(M) for any honest parties A,B. The goal of the adversary is to recover M. Equivalently, the adversary's goal is to find a sequence of functions [g1,....,gk] as above such that gk o ... g2 o g1 o f[1]{A,B} = id for some honest parties A and B.

Notice that given access to the above functions and initial messages f[1]{A,B}(M), one can immediately simulate the concurrent execution of an arbitrary number of protocols in the presence of an adversary that knows the decryption keys of all corrupted parties. So, a protocol is secure in the Dolev-Yao symbolic execution model if and only if the following definition is satisfied:

Definition: Let f[1],...,f[r] be a two party protocol between a sender S and receiver R as defined above. The protocol is insecure if and only if for some honest parties A, B, the adversary has access to a sequence of functions g1,...,gk such that gk o ... g2 o g1 o f[1]{A,B} = id.

The question now is: can security be decided? Can it be decided efficiently? The answer to both question is yes, and the first step toward automating the security analysis is to simplify the definition. The current definition involves infinite choices for many parameters: even after the protocol is fixed (giving a bound in the number and length of the messages in the protocol), we can still have an undecidable problem because the number of parties and concurrent executions is unbounded. We now show that we can always restrict the number of parties to 3: 2 honest parties A, B and the adversary Z.

Theorem: Let f[1],...,f[r] be a Dolev-Yao protocol. If the protocol is insecure, then there is a sequence of functions [g1,...,gk] and pair of parties A,B demonstrating the insecurity, where all the parties involved in the functions are from A,B and Z.

Proof: Assume gk o ... g2 o g1 o f[1]{A,B} = id is an attack (with no restriction on the involved parties). We obtain an attack involving only A, B and Z by replacing all identifiers different from A and B with Z. Since the substitution can only give more cancellations, we still have gk' o ... o g1' o f[1]{A,B} = id. We claim that each gk' is the composition of allowed functions.

If gk is Dx, Ex, ix, dx or d, for some X different from A,B, then the resulting function is Dz, Ez,iz, dz, d and adversary Z is allowed to use this function.
If gk is f[i]{A,B} or f[i]{B,A}, then gk' = gk is an allowed function
If gk is f[i]{A,C}, f[i]{B,C}, f[i]{C,A} or f[i]{C,B} for some C different from A and B, then the new function gk' is identical to gk, except for replacing C with Z
If gk is f[i]{C,D} for two parties C,D not in {A,B}, then gk' = f[i]{Z,Z} is the composition of functions of the form Dz, Ez,iz, dz, d, which are all allowed

This completes the proof of the theorem. [QED]

We use the Theorem to redute the problem of testing the security of a protocol to a special case of the same problem where the number of parties is bounded by 3 and the adversary has access only to a finite numner of functions. The length of the attack is still potentially unbounded, so it is not clear if the problem can be solved algorithmically. [DY] shows that the problem is indeed decidable, and moreover, there is an efficient (polynomial time) decision procedure. The running time of the decision procedure of [DY] is n^8. Here we present a simpler and more efficient algorithm running in time n^3 due to

[DEK] Dolev, Even, Karp, On the security of ping-pong protocols, Inform.and Control 55:57-68 (1982)

The main idea is the following: Consider the set of all words over the alphabet {Ea,Eb,Ez, Da,Db,Dz,ia,ib,iz,da,db,dz,d} that simplify to the empty string using the cancellation rules Dx Ex = Ex Dx =dx ix = d ix = epsilon. This set of words is context free and can be generated by a context free grammar with rules S -> epsilon | Dx S Ex S | ... and so on for all cancellation rules. The grammar can be easily converted into an equivalent Push Down Automaton. Notice that the size of this automaton is constant because it does not depend on the protocol. Next we build a nondeterministic finite automaton accepting all the strings of the form gk o ... o g1 o f[1]{A,B} where each gk is one of the finitely many functions the adversary has access to. The resulting automaton has a number of states proportional to n. Finally, we combine the PDA and NFA using a cartesian product construction to obtain a new PDA that accepts the intersection of the two languages. At this point we are left with the problem of deciding if the language of a PDA is empty or not. This can be done in time O(n^3) using standard techniques. (E.g., convert the PDA into a CFG, and then perform a fixpoint computation to determine if the start symbol in the grammar generates a nonempty language.

The above algorithm takes time and space n^3. [DEK] give an adhoc algorithm that runs in time n^3, but uses only n^2 space. The algorithm is a specialization of the above two procedures and can be applied to any input CFG and NFA.

[DY, DEK] apply only to two party protocols. Two party protocols are easier to analyze because, as shown in [DEK], one can assume without loss of generality that there is only one corrupt user (beside the two running the protocol). So, one need to analyse only scenarios where there are a total of three parties. [EG] shows that as we move to multiparty protocols, things get much harder: for an n-party protocol, one need to consider approximately 3n adversarial users. For fixed n, this still yields a n^3 decision procedure, but if the number of parties is variable, then the problem becomes NP-complete. Also, if we allow for operators that act on the left and right half of the input independently, then the problem becomes undecidable.

[EGS] is an interesting early example of work trying to bridge the symbolic (Dolev-Yao) model and the computational one. It considers a generalization of the Dolev-Yao model where the encryption function is homomorphic (like RSA). Homorphic encryption is interesting because there are many protocols that make use of the homomorphic property of RSA. But this is not addressed in [EGS]. In this paper, homomorphic properties are used only to increase the power of the symbolic adversary. The main result in the paper shows that for ping-pong protocols (like those of [DY], [DEK], which do not use homomorphic properties), the homomorphic properties of RSA do not give the adversary any extra power: if the protocol can be broken exploiting the homorphic properties, then it can be broken by the standard Dolev-Yao adversary. The main contribution of this paper is to show that in order for the Dolev-Yao analysis to make sense, the encryption scheme does not have to be perfect.