# Project: Complexity of coding problems

Investigator: D. Micciancio

Ph.D. students: V. Lyubashevsky, (B. Warinschi)

Other collaborators: V. Guruswami, O. Regev, U. Feige, I. Dumer, M. Sudan, O. Goldreich, S. Safra, J.-P. Seifert

Funding: Micciancio’s NSF CAREER Award, Sloan Research Fellowship, (Hellman Fellowship)

Summary: The goal of this project is to get a better understanding of the computational complexity of coding problems, and their relation to point lattices. Our research on the complexity of coding problems closely mirrors our study of point lattices. (See Complexity of lattice problems project page.) As part of this project, we have been investigating the following problems:

### Computing the minimum distance of a Linear Code

The minimum Hamming distance between any two codewords is a fundamental parameter associated to every code, closely related to its error correction capability. Computing the minimum distance of a linear code is the coding theoretic counterpart of the shortest vector problem for point lattices. The NP-hardness of computing the minimum distance exactly was established by Vardy in 1997, but that paper still left open the possibility that the minimum distance can be efficiently approximated arbitrarily well.

• In Hardness of approximating the minimum distance of a linear code (IEEE Trans. on Information Theory, 2003), D. Micciancio with collaborators I. Dumer and M. Sudan showed that the minimum distance of a linear code is NP-hard to compute even approximately, within any constant approximation factor. The proof is based on some general ideas from an analogous result for lattices (Micciancio, SICOMP 2001), but the underlying mathematics is quite different. This paper also establishes inapproximability results for the minimum distance of asymptotically good codes.
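To make the problem concrete, here is a minimal illustration (ours, not from the paper): computing the minimum distance of a small binary linear code by brute force over all 2^k messages. The exponential enumeration is exactly the kind of cost that the NP-hardness results above suggest cannot be avoided in general, even approximately.

```python
from itertools import product

def codewords(G):
    """All codewords of the binary linear code generated by the rows of G."""
    k = len(G)
    for msg in product([0, 1], repeat=k):
        yield tuple(sum(m * g for m, g in zip(msg, col)) % 2
                    for col in zip(*G))

def min_distance(G):
    """Minimum Hamming weight over all nonzero codewords
    (equal to the minimum distance, since the code is linear)."""
    return min(sum(c) for c in codewords(G) if any(c))

# Generator matrix of the [7,4,3] binary Hamming code.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
print(min_distance(G))  # 3
```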

### Complexity of Decoding linear codes

The minimum distance decoding problem (usually called the nearest codeword problem, NCP) is probably the most fundamental computational problem on linear codes. No efficient solution to this problem is known in general, and the problem is known to be NP-hard even to approximate within any constant approximation factor. Typical NP-hardness results for the nearest codeword problem apply only to the most general version of the problem, where one wants to correct arbitrarily many errors in an arbitrary code. In contrast, most positive results in algorithmic coding theory apply to specific classes of codes (e.g., Reed-Solomon codes), when the number of errors is limited (e.g., less than half the minimum distance of the code). In order to get a better understanding of the difficulty of the nearest codeword problem, we investigate the following questions: is the problem still hard if we require the number of errors to be less than the minimum distance of the code? Is it hard if we fix the code in advance and consider only the target vector as part of the input instance? Is the problem still NP-hard when restricted to classes of codes of practical interest, e.g., asymptotically good codes?
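For intuition, the general problem admits only the naive exponential baseline. The following sketch (our illustration, not from the papers) solves the nearest codeword problem exactly for a small binary code by exhaustive search over all codewords:

```python
from itertools import product

def nearest_codeword(G, r):
    """Codeword of the binary code generated by G that is closest
    in Hamming distance to the received word r (exhaustive search)."""
    k = len(G)
    best, best_d = None, len(r) + 1
    for msg in product([0, 1], repeat=k):
        c = tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
        d = sum(ci != ri for ci, ri in zip(c, r))
        if d < best_d:
            best, best_d = c, d
    return best, best_d

# [7,4] Hamming code; corrupt a codeword in one position and decode.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
r = (1, 0, 0, 0, 1, 1, 1)          # first row of G with one flipped bit
print(nearest_codeword(G, r))      # ((1, 0, 0, 0, 0, 1, 1), 1)
```

The 2^k running time is what the hardness results above indicate cannot be substantially improved for arbitrary codes.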

The hardness of the decoding problem for a fixed family of codes is captured by the Nearest Codeword Problem with Preprocessing (NCPP). The exact version of this problem was proved NP-hard by Bruck and Naor in 1991, but their proof does not extend to inapproximability results.

• In The inapproximability of lattice and coding problems with preprocessing (Journal of Computer and System Sciences, JCSS 2004), D. Micciancio and collaborator U. Feige prove the first inapproximability result for NCPP, showing that approximating NCPP (for codes over GF(q)) is NP-hard for any q and approximation factor up to 5/3. Similar results are obtained for lattices, giving a reduction from NCPP to the closest vector problem with preprocessing (CVPP).

For the case of binary codes, our result has been improved to factors up to 3 by Regev. For other fields, ours is still the strongest known inapproximability result. Proving the NP-hardness of NCPP for any constant approximation factor is an important open problem in the area.
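A classic example of what preprocessing can buy (our illustration, not from the paper) is syndrome decoding: an arbitrarily expensive preprocessing phase, depending only on the fixed code, tabulates a minimum-weight coset leader for every syndrome, after which each received word is decoded by one table lookup. NCPP asks whether such preprocessing can ever make the query phase both fast and accurate for hard codes.

```python
from itertools import product

# Parity-check matrix of the [7,4] binary Hamming code.
H = [[0, 1, 1, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(v):
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

# Preprocessing (depends only on the code): minimum-weight coset leader
# for each syndrome, found by enumerating error patterns by weight.
leader = {}
for e in sorted(product([0, 1], repeat=7), key=sum):
    leader.setdefault(syndrome(e), e)

def decode(r):
    """Query phase: one syndrome computation and one table lookup."""
    e = leader[syndrome(r)]
    return tuple((ri - ei) % 2 for ri, ei in zip(r, e))

r = (1, 0, 0, 0, 1, 1, 1)          # codeword (1,0,0,0,0,1,1) with one flipped bit
print(decode(r))                   # (1, 0, 0, 0, 0, 1, 1)
```

For this toy code the table has only 8 entries; for hard codes the table (or any preprocessing advice) would have to be prohibitively large, which is the phenomenon the NCPP hardness results formalize.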

The complexity of approximating NCP when the number of errors is small in comparison to the minimum distance of the code is captured by the Relatively Near Codeword problem (RNC) introduced in (I. Dumer, D. Micciancio, M. Sudan, IEEE Trans. on IT, 2003).

• In Hardness of approximating the minimum distance of a linear code, (IEEE Trans. on Information Theory, 2003), D. Micciancio with collaborators I. Dumer and M. Sudan, show that the Relatively Near Codeword problem is NP-hard to approximate, even when the distance from the code (i.e., the amount of error) is arbitrarily close to the unique decoding radius (i.e., half the minimum distance of the code.) This paper also proves similar results (when the error is arbitrarily close to (2/3) the minimum distance) for asymptotically good codes. The hardness of the RNC problem is used as an intermediate step to prove the hardness of approximating the minimum distance.
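The unique decoding radius mentioned above can be checked directly on a small example (our illustration, not from the paper): for the [7,4] Hamming code, whose minimum distance is 3, any error of weight 1 < 3/2 leaves the transmitted codeword strictly closer than every other codeword, so the decoding answer is unique.

```python
from itertools import product

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
C = [tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
     for msg in product([0, 1], repeat=4)]

def dist(a, b):
    return sum(x != y for x, y in zip(a, b))

for c in C:
    for i in range(7):                       # every weight-1 error pattern
        r = tuple((x + (j == i)) % 2 for j, x in enumerate(c))
        # c is at distance 1 from r; every other codeword is at distance >= 2
        assert all(dist(r, c2) >= 2 for c2 in C if c2 != c)
print("unique decoding within radius 1 verified")
```

The hardness result above says that decoding remains NP-hard even when the error is promised to lie arbitrarily close to this radius, where the answer is information-theoretically unique.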

The hardness of the minimum distance problem is proved in (Dumer, Micciancio, Sudan, IEEE Trans. on IT, 2003) by reduction from the RNC decoding problem. A natural question is how the minimum distance and decoding problems compare.

• In Approximating shortest lattice vectors is not harder than approximating closest lattice vectors (Information Processing Letters, 1999), D. Micciancio with collaborators O. Goldreich, S. Safra and J.-P. Seifert adapt similar results for lattices (proved in the same paper) to show that approximately computing the minimum distance of a linear code is efficiently reducible to approximating the nearest codeword problem. The reduction applies to arbitrary approximation factors g, and is approximation preserving, in the sense that a g-approximation to the nearest codeword problem yields a g-approximation of the minimum distance.
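The flavor of such a reduction can be sketched in the binary case (a simplified illustration of ours, under the assumption that the generators are linearly independent). Every nonzero codeword uses some generator g_i with coefficient 1, so it can be written as g_i plus a codeword of the subcode C_i spanned by the remaining generators; hence d(C) = min_i dist(g_i, C_i), and a g-approximate nearest codeword oracle applied to each C_i yields a g-approximation of the minimum distance. Here `ncp` is a brute-force stand-in for that oracle.

```python
from itertools import product

def ncp(G, r):
    """Stand-in NCP oracle: Hamming distance from r to the binary code
    generated by the rows of G (exhaustive search)."""
    k = len(G)
    return min(sum((sum(m * g for m, g in zip(msg, col)) + ri) % 2
                   for col, ri in zip(zip(*G), r))
               for msg in product([0, 1], repeat=k))

def min_distance_via_ncp(G):
    """Minimum distance via k calls to the NCP oracle: for each generator
    g_i, ask for its distance to the code spanned by the others."""
    return min(ncp(G[:i] + G[i + 1:], G[i]) for i in range(len(G)))

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
print(min_distance_via_ncp(G))  # 3, the minimum distance of this Hamming code
```

Since the oracle is queried as a black box, any approximation guarantee it offers carries over unchanged to the computed minimum distance, which is the approximation-preserving property stated above.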