Project: Complexity of coding problems

Investigator: D. Micciancio

Ph.D. students: V. Lyubashevsky, (B. Warinschi)

Other collaborators: V. Guruswami, O. Regev, U. Feige, I. Dumer, M. Sudan, O. Goldreich, S. Safra, J.-P. Seifert

Funding: Micciancio’s NSF CAREER Award, Sloan Research Fellowship, (Hellman Fellowship)

Summary: The goal of this project is to get a better understanding of the computational complexity of coding problems, and their relation to point lattices. Our research on the complexity of coding problems closely mirrors our study of point lattices. (See the Complexity of lattice problems project page.) As part of this project, we have been investigating the following problems:

Computing the minimum distance of a linear code

The minimum Hamming distance between any two distinct codewords is a fundamental parameter associated with every code, closely related to its error correction capability; for a linear code it equals the minimum Hamming weight of a nonzero codeword. Computing the minimum distance of a linear code is the coding-theoretic counterpart of the shortest vector problem for point lattices. The NP-hardness of computing the minimum distance exactly was established by Vardy in 1997, but that result still left open the possibility that the minimum distance could be efficiently approximated arbitrarily well.
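
As a concrete illustration (a toy sketch of mine, not part of the project's results), the following Python code computes the minimum distance of a small binary linear code by brute force, enumerating all 2^k nonzero codewords spanned by a k x n generator matrix; the function name and the [7,4] Hamming code test case are just illustrative choices. The exponential cost of this naive enumeration is in line with the fact that no efficient algorithm is known.

    from itertools import product

    def min_distance(G):
        # Brute force: the minimum distance of a linear code equals the
        # minimum Hamming weight over all 2^k - 1 nonzero codewords.
        k, n = len(G), len(G[0])
        best = n + 1
        for msg in product([0, 1], repeat=k):
            if not any(msg):
                continue  # skip the zero codeword
            cw = [sum(msg[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
            best = min(best, sum(cw))  # Hamming weight = distance to 0
        return best

    # Generator matrix of the [7,4] Hamming code; its minimum distance is 3.
    G = [[1, 0, 0, 0, 0, 1, 1],
         [0, 1, 0, 0, 1, 0, 1],
         [0, 0, 1, 0, 1, 1, 0],
         [0, 0, 0, 1, 1, 1, 1]]
    print(min_distance(G))  # -> 3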

Complexity of decoding linear codes

The minimum distance decoding problem (usually called the nearest codeword problem, or NCP) is probably the most fundamental computational problem on linear codes. No efficient solution to this problem is known in general, and the problem is known to be NP-hard even to approximate within any constant factor. Typical NP-hardness results for the nearest codeword problem apply only to the most general version of the problem, where one wants to correct arbitrarily many errors in an arbitrary code. In contrast, most positive results in algorithmic coding theory apply to specific classes of codes (e.g., Reed-Solomon codes), when the number of errors is limited (e.g., less than half the minimum distance of the code). In order to get a better understanding of the difficulty of the nearest codeword problem, we investigate the following questions: Is the problem still hard if we require the number of errors to be less than the minimum distance of the code? Is it hard if we fix the code in advance and consider only the target vector as part of the input instance? Is the problem still NP-hard when restricted to classes of codes of practical interest, e.g., asymptotically good codes?
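
For intuition about the decoding problem itself, here is an equally naive Python sketch (again my own illustration, not an algorithm from this project) that finds a nearest codeword by exhaustive search over all 2^k codewords; the single-bit-error test case reuses the [7,4] Hamming code generator matrix from the sketch above.

    from itertools import product

    def nearest_codeword(G, y):
        # Exhaustive minimum distance decoding: try every codeword and keep
        # the one closest to the received word y in Hamming distance.
        k, n = len(G), len(G[0])
        best_cw, best_dist = None, n + 1
        for msg in product([0, 1], repeat=k):
            cw = [sum(msg[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
            dist = sum(a != b for a, b in zip(cw, y))
            if dist < best_dist:
                best_cw, best_dist = cw, dist
        return best_cw, best_dist

    G = [[1, 0, 0, 0, 0, 1, 1],
         [0, 1, 0, 0, 1, 0, 1],
         [0, 0, 1, 0, 1, 1, 0],
         [0, 0, 0, 1, 1, 1, 1]]
    y = [1, 0, 0, 0, 0, 1, 0]      # codeword (1,0,0,0,0,1,1) with one bit flipped
    print(nearest_codeword(G, y))  # -> ([1, 0, 0, 0, 0, 1, 1], 1)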

The hardness of the decoding problem for a fixed family of codes is captured by the Nearest Codeword Problem with Preprocessing (NCPP). The exact version of this problem was proved NP-hard by Bruck and Naor in 1991, but their proof does not extend to inapproximability results.
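
One way to picture preprocessing is classical syndrome decoding: a table mapping syndromes to minimum-weight error patterns (coset leaders) is built once, depending only on the code, after which any received word is decoded by a single table lookup. The Python sketch below (an illustration under my own assumptions, using the parity-check matrix of the [7,4] Hamming code from the sketches above) makes this concrete. The table has size exponential in n - k, and the point of NCPP is that, for arbitrary codes, even unrestricted preprocessing of this kind is not believed to make the decoding phase easy.

    from itertools import combinations

    def syndrome(H, v):
        return tuple(sum(H[i][j] * v[j] for j in range(len(v))) % 2
                     for i in range(len(H)))

    def build_table(H):
        # Preprocessing: map every syndrome to a minimum-weight error pattern
        # (coset leader).  Time and space are exponential in n - k, but the
        # table depends only on the code, not on any particular target.
        n = len(H[0])
        table = {}
        for w in range(n + 1):
            for supp in combinations(range(n), w):
                e = [1 if j in supp else 0 for j in range(n)]
                table.setdefault(syndrome(H, e), e)
            if len(table) == 2 ** len(H):
                break
        return table

    def decode(H, table, y):
        # Online phase: one syndrome computation and one table lookup.
        e = table[syndrome(H, y)]
        return [(a + b) % 2 for a, b in zip(y, e)]

    # Parity-check matrix matching the [7,4] Hamming generator used above.
    H = [[0, 1, 1, 1, 1, 0, 0],
         [1, 0, 1, 1, 0, 1, 0],
         [1, 1, 0, 1, 0, 0, 1]]
    table = build_table(H)                          # done once per code
    print(decode(H, table, [1, 0, 0, 0, 0, 1, 0]))  # -> [1, 0, 0, 0, 0, 1, 1]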

In joint work with Feige, we proved the first inapproximability result for this problem, showing that NCPP remains NP-hard even when one only asks for approximate solutions within some small constant factor. For the case of binary codes, this has since been improved to factors up to 3 by Regev. For other fields, ours is still the strongest known inapproximability result. Proving the NP-hardness of NCPP for any constant approximation factor is an important open problem in the area.

The complexity of approximating NCP when the number of errors is small in comparison to the minimum distance of the code is captured by the Relatively Near Codeword problem (RNC) introduced in (I. Dumer, D. Micciancio, M. Sudan, IEEE Trans. on IT, 2003).

The hardness of the minimum distance problem is proved in (I. Dumer, D. Micciancio, M. Sudan, IEEE Trans. on IT, 2003) by reduction from the RNC decoding problem. A natural question is how the minimum distance and decoding problems compare.

The covering radius problem

The covering radius is the smallest radius such that Hamming balls centered at all codewords cover the entire space. A classic result of McLoughlin (1984) shows that the covering radius of linear codes is Π2-hard to compute exactly, and therefore unlikely to be computable in nondeterministic polynomial time. However, until recently not much was known about the complexity of approximating the covering radius.
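
To make the definition concrete, the following exhaustive Python sketch (my own illustration, feasible only for toy parameters) computes the covering radius by enumerating every word of {0,1}^n and every codeword. The "for all targets there exists a nearby codeword" structure of the statement "the covering radius is at most r" is what pushes the problem to the second level of the polynomial hierarchy.

    from itertools import product

    def covering_radius(G):
        # The covering radius is the largest Hamming distance from any word
        # in {0,1}^n to its nearest codeword (2^n targets times 2^k codewords).
        k, n = len(G), len(G[0])
        codewords = [
            [sum(msg[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]
            for msg in product([0, 1], repeat=k)
        ]
        radius = 0
        for v in product([0, 1], repeat=n):
            dist = min(sum(a != b for a, b in zip(v, cw)) for cw in codewords)
            radius = max(radius, dist)
        return radius

    # The [7,4] Hamming code is perfect, so its covering radius is 1.
    G = [[1, 0, 0, 0, 0, 1, 1],
         [0, 1, 0, 0, 1, 0, 1],
         [0, 0, 1, 0, 1, 1, 0],
         [0, 0, 0, 1, 1, 1, 1]]
    print(covering_radius(G))  # -> 1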