Instructor: Daniele Micciancio (UCSD)

Program Obfuscation

Program obfuscation is the task of transforming a program `f` into an equivalent program `O(f)` which is hard to understand or reverse engineer in some technical sense to be specified. For simplicity, we focus on the obfuscation of programs implementing simple functions, from an input value to an output value, and represent these programs as boolean circuits.

Obfuscators should satisfy the correctness requirement

``(O(f))(x) = f(x)``

i.e., the obfuscated program defines the same function as the original one. There are two measures of efficiency for obfuscation:

• The running time of the obfuscator `O`
• The running tiem or circuit size of the obfuscated program `O(f)`

Usually, the output size `|O(f)|` is always required to be polynomial (or at least subexponential) in `|f|`, since without this requirement any circuit (or finite function) can be trivially obfuscated by outputting its truth table.

Clearly, obfuscator efficiency implies output efficiency, as the size of the output `|O(f)|` cannot be larger than the running time of `O(f)`. But it will be interesting to consider inefficient obfuscators that produce a small circuit `O(f)`, without putting a bound on the running time of `O`.

Security Definitions

Black-box obfuscation: For every adversary A, there is a simulator S such that for every program f and distinguisher D

Pr(D(A(O(f)))) ≈ Pr(D(Sf(|f|)))

One may consider several variants of this definition

1. Allow D to take f as input, or have oracle to f. This gives not additional power to D because D may depend on C.

2. Fix A to a dummy adversary A(O(f)) = O(f). This is equivalent to the original definition.

3. Restrict A (and S) to a predicate, i.e., a function with a single output bit. In this case, there is no need for a distinguisher D, and one may simply require Pr(A(O(f)))≈Pr(Sf(|f|)). This can be understood as computing a specific (single bit) property of f.

Proving the equivalence of (1) and (2) to the original definition is a simple exercise. A more interesting problem is to determine if (3) is also equivalent, i.e., is there a (black-box) reduction from general obfuscation to obfuscation against predicates?

By the correctness requirement of obfuscation, anybody with access to `O(f)` can evaluate the original program `f` on any input. The black-box security definition captures the intuition that an adversary should not be able to do anything more than that. But, in fact, having `O(f)` gives more than just the ability of running `f`: it gives a program that computes the same function as `f`. So, a more liberal (and still optimal) definition of obfuscation may be the following.

Best possible obfuscation: For any adversary A, there is a simulator S such that for any two equivalent programs f, f, auxiliary input z and distinguisher D

Pr(D(A(z, O(f)))) ≈ Pr(D(S(z, f′)))

Intuitively, the obfuscated program O(f) does not provide any more information/advantage than any other program f computing the same function as f.

As usual, we may consider several variants:

1. Allow D to take the function f and auxiliary input z. This makes not difference because D may depend arbitrarily on them

2. Fix A to a dummy adversary A(O(f)) = O(f). This requires a universal simulator such that S(z, f′) is indistinguishable from (z, O(f)).

3. Same as above, but without auxiliary input: S(O(f′)) and O(f) are indistinguishable.

4. Restrict A to a predicate. This requires Pr(A(z, O(f)))≈Pr(S(z, f′))

One can also given an indistinguishability based definition of obfuscation, which requires that obfuscation makes equivalent programs indistinguishable.

Indistinguishability obfuscation: For any predicate A and pair of equivalent programs f0, f1,

Pr(A(O(f0)))≈Pr(A(O(f1)))

All definitions of obfuscations can be parametrized with respect to the set of programs that can be obfuscated. If the program f ranges over a set F, then we refer to O as an obfuscator for F. One may also restrict the output of O to be a program in F, in which case, by analogy with the terminology used in learning theory, we refer to O as a proper obfuscator for F.

Relation between the definitions

Theorem 1: If O is a (predicate) black-box obfuscator for F, then O is also a best possible obfuscator for F

Theorem 2: If O is a best possible obfuscator for F, then O is also an indistinguishability obfuscator for F

Theorem 3: If O is an efficient indistinguishability obfuscator for F, then it is also an efficient best possible obfuscator for F

Theorem 4: If F has a best possible obfuscator, then it also has an efficient best possible obfuscator

Possibility and impossibility

Theorem 5: Any F has an inefficient indistinguishability obfuscator

Theorem 6: If F has an efficient back-box obfuscator, then it is learnable

Theorem 7: Even inefficient (predicate) black-box obfuscation for some F does not exist

Theorem 8: If P = NP, then there is a computational best possible obfuscator

Theorem 9: Statistical best possible obfuscation (with computationally unbounded distinguisher) implies PH collapse

Theorem 10: The class of ordered binary decition diagrams (OBDD) has a proper best possible obfuscator, but not a proper black-box obfuscator