Unfortunately, the Reed-Muller code has a serious drawback: The number of bits required to encode a message increases exponentially with the number of variables. If you want a highly local code that corrects errors with just a handful of queries, you’ll need a lot of variables for long messages, and the Reed-Muller code will quickly become useless in practice.
“Exponential in this case is very bad,” Dvir said. But is it inevitable?
Correctable or Decodable?
As computer scientists tried and failed to find more efficient locally correctable codes, they began to suspect that such codes weren’t possible at all. In 2003, two researchers proved that there’s no way to beat the Reed-Muller code using only two queries. But that’s as far as anybody got.
“Once you go to three, our knowledge becomes very sketchy,” Kothari said.
The next breakthrough just complicated matters further. In two papers published in 2008 and 2009, the computer scientists Sergey Yekhanin and Klim Efremenko showed how to construct three-query codes that were more efficient than Reed-Muller codes, but these codes were not quite locally correctable. Instead, they had a subtly different property called local decodability.
To understand the difference, let’s again imagine a cloud storage provider that combines users’ data into one long message and protects it using an error-correcting code. Both locally correctable codes and locally decodable codes can correct an error in any bit of the original message with just a few queries.
But every error-correcting code also requires extra bits that weren’t in the original message — that’s why encoding a message makes it longer. The two types of codes differ in how they treat these additional bits. Locally decodable codes make no promises about the number of queries needed to correct errors in these bits. But in a locally correctable code, an error in any of the extra bits can be fixed in precisely the same way as an error in any bit of the original message.
“Everything that you store, whether it’s the original data of users or the redundancy and the check information — all of this can be locally corrected,” said Madhu Sudan, a computer scientist at Harvard University.
Though different in principle, local correctability and local decodability always seemed interchangeable in practice before 2008 — every known locally decodable code was also locally correctable. Yekhanin and Efremenko’s discovery raised the possibility of a fundamental difference between the two conditions. Or perhaps it was possible to modify Yekhanin and Efremenko’s codes to make them locally correctable. That would put the two conditions on equal footing once more, but it would also mean that researchers had been mistaken about how efficient three-query locally correctable codes could get. Either way, conventional wisdom would have to change.
Borrowing Logic
Kothari and Manohar finally resolved that tension by adapting a technique from a different area of computer science: the study of so-called constraint satisfaction problems. Trying to coordinate dinner plans with a group of friends is a constraint satisfaction problem of a sort. Everyone has choices they’ll accept and choices they’ll veto. Your job is to either find a plan that satisfies everybody, or, if there is no such plan, figure that out as soon as possible.
There’s an inherent asymmetry between those two possible outcomes. An acceptable solution may not be easy to find, but once you have it, it’s easy to convince anyone else that it will work. But even if you know that the problem really is “unsatisfiable,” there may not be an example that provides proof.
In 2021, Kothari and Manohar, together with Venkatesan Guruswami of the University of California, Berkeley, made a major breakthrough in the study of constraint satisfaction problems using a new theoretical technique for identifying those tricky unsatisfiable cases. They suspected that the new method would be a powerful tool for solving other problems too, and Guruswami’s graduate student Omar Alrabiah suggested that they look at three-query locally decodable codes.
“This was a nail with a hammer in our hand, so to speak,” Kothari said.
Yekhanin and Efremenko’s surprising results had shown that three-query locally decodable codes could be more efficient than Reed-Muller codes. But were their codes the best ones possible, or could three-query locally decodable codes get even more efficient? Kothari, Manohar, Guruswami and Alrabiah thought their new technique might be able to prove limits on how efficient such codes could get. Their plan was to build a logical formula encompassing the structure of all possible three-query locally decodable codes of a given size, and prove it unsatisfiable, thereby showing that no such code could exist.
The four researchers took a first step in that direction in 2022, setting a new limit on the maximum efficiency of three-query locally decodable codes. The result went well beyond what researchers had been able to achieve with other techniques, but it didn’t rule out all codes more efficient than Yekhanin and Efremenko’s.
Kothari and Manohar suspected they could go further. But progress stalled until Manohar jotted down a quick back-of-the-envelope calculation indicating that the technique might work even better for locally correctable codes than it had for locally decodable ones.
A few months later, after many more false starts that made them fear they’d been too optimistic, the technique finally delivered on its promise. Kothari and Manohar proved that as researchers had suspected, it’s impossible for any three-query locally correctable code to work appreciably better than Reed-Muller codes. That exponential scaling is a fundamental limitation. Their result was also a dramatic demonstration that local correctability and local decodability, though superficially similar, really do differ at a fundamental level: The latter is unequivocally easier to realize than the former.
Kothari and Manohar now hope to extend their techniques to study codes that are allowed to make more than three queries, as very little is known about them now. And progress in the theory of error correction often has implications in other seemingly unrelated fields. In particular, locally correctable codes make surprise appearances everywhere from the problem of private database searches in cryptography to proofs of theorems in combinatorics. It’s too early to say how Kothari and Manohar’s technique will impact these different fields, but researchers are feeling optimistic.
“There’s a really beautiful new idea here,” Dvir said. “I think there’s a lot of potential.”