
Could We Have Felt Evidence For SDP ≠ P?


March 15, 2014

Evaluating mathematical beliefs

Leonid Khachiyan in 1979 caused arguably the most sudden surprise to the West’s popular scientific understanding since the successful launch of Sputnik in 1957. His ellipsoid method gave the first algorithm for linear programming whose polynomial running time was verified. Thanks largely to Narendra Karmarkar it has been superseded by faster interior-point methods, while older algorithms have since been noted to run in polynomial time, but the breakthrough and inspiration came from Khachiyan. Could something like it happen again?

Today Ken and I want to ask whether the recent argument over beliefs about {\mathsf{P=NP}} can be evaluated in light of this shock.

Khachiyan’s “rocket” had actually left the hangar ten months before, in a January 1979 Doklady Akademii Nauk paper whose title translates into English as, “A polynomial algorithm for linear programming.” As recounted by Berkeley’s Eugene Lawler, it was sighted at a May 1979 meeting Lawler attended in Oberwolfach, and after Peter Gács and László Lovász supplied proofs missing in the paper, it was pronounced correct at a conference in Montreal. The discovery was picked up in October by Science News, and then by the magazine Science. An allusion by the latter to the NP-complete Traveling Salesman problem was moved to the headline of a Nov. 4 story in England’s Guardian newspaper, and reflected three days later in the New York Times’s front-page screamer, “A Soviet Discovery Rocks World of Mathematics.”

Our point is not to say that linear programming being in {\mathsf{P}} was a surprise. To those who knew the reality behind the headlines, it wasn’t. As Lawler relates, the great George Dantzig had tried to set Times reporter Malcolm Browne straight on this and points related to what LP’s can and (ostensibly) cannot solve. The simplex algorithm already solved the vast majority of {n\!\times\!n} LP cases in {O(n^3)} expected time, so there was no feeling of practical intractability. Rather our point draws on something perhaps less widely known and appreciated: that Khachiyan’s ideas extend to solve a much wider class than linear programs, including so-called semi-definite programs or SDP’s, exactly or with high approximation. Thus it can be said to show that a complexity class {\mathsf{SDP}} defined by “approximation-robust” reductions to these programs equals {\mathsf{P}}.
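
For concreteness, here is a minimal sketch of the kind of tiny linear program at issue, solved with SciPy’s linprog (our tooling choice, not anything from the post); simplex-style and interior-point-style solvers alike dispatch such instances instantly.

```python
from scipy.optimize import linprog

# Maximize x + 2y subject to x + y <= 4, x <= 3, x >= 0, y >= 0.
# linprog minimizes, so we negate the objective.
c = [-1, -2]
A_ub = [[1, 1], [1, 0]]
b_ub = [4, 3]
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)  # optimum at (0, 4) with value 8
```

The point of Khachiyan’s ellipsoid method was never that such instances were hard in practice; it was that a worst-case polynomial bound had finally been verified, and that the same ideas reach beyond linear programs to semi-definite ones.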

We raise this with regard to the main technical argument in a recent post by Scott Aaronson titled “The Scientific Case for {\mathsf{P} \neq \mathsf{NP}}.” We wonder whether a similar argument might have seemed on the side of {\mathsf{SDP \neq P}} in the years before Khachiyan. Even more speculatively, we wonder whether a kind of “Reverse Oracle Result” can be formulated to make any of this more concrete. But first let’s review Scott’s comments in the wider context of belief about {\mathsf{P}} vs. {\mathsf{NP}} and about open problems that were resolved in surprising ways.

Scott’s Comments and Our Positions

Essentially Scott gave a quite reasonable argument for {\mathsf{P \neq NP}}, in his usual elegant and convincing style. Bill Gasarch expanded it. But. But mathematics is not something we argue about like who was the best hockey player of all time, or what the right philosophy is. The simple fact is that no one has proved that {\mathsf{P \neq NP}}.

Our purpose with our recent post on a 13-GB certificate of unsatisfiability was not to start a discussion about {\mathsf{P \neq NP}}, but rather to witness that {\mathsf{NP}}-hardness is not so great a practical obstacle as we may think. The Gröbner basis algorithm is run all the time, despite the problem it solves being complete for exponential space. Notably it runs in singly-exponential time on a generic set of cases. If we can shift down a level, this is like having “smoothed polynomial-time behavior” of an algorithm for a {\mathsf{PSPACE}}-complete problem. Solving nontrivial cases of {\mathsf{NP}}-hard problems is addictive.
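
To make the “run all the time” point concrete, here is a minimal sketch using SymPy’s groebner routine (our tooling choice, not the post’s): an algorithm for an exponential-space-complete problem finishes instantly on a typical small system.

```python
from sympy import groebner, symbols

# A small symmetric system; lex order is the theoretically expensive choice,
# yet generic instances like this one are dispatched immediately.
x, y, z = symbols('x y z')
polys = [x**2 + y + z - 1, x + y**2 + z - 1, x + y + z**2 - 1]
G = groebner(polys, x, y, z, order='lex')
print(G)  # a triangular basis from which all solutions can be read off
```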

Almost the entire business model of this company is to solve {\mathsf{NP}}-hard optimization problems, using non-quantum computers. As is evident from examples on their website, they are not just gristing easy approximation for run-of-the-mill instances. To quote one of their blog entries (their emphasis):

According to academic research on NP-hard problems, it’s impossible to guarantee that optimal solutions to difficult problems will be found in a reasonable time frame. However, with almost all real-life planning puzzles, you can get excellent results very quickly.

Hence our initial reaction was: who cares about debating whether {\mathsf{P \neq NP}} is true or not, apart from making progress on this great question? A full solution would be wonderful, but just having small steps would be great; even a possible program for a solution would be welcome. So that was what we thought we should just say, nothing more, except noting our split answers to Bill G’s reprised poll three years ago.

But. But Dick couldn’t resist adding some more sections, while Ken made some effort to counter Scott’s facts, counterfactually.

Dick Speaks

I feel compelled to explain why I am open-minded on this question, perhaps more than anyone else. I have several reasons, which I feel are important reminders for all of us:

  1. In mathematics people have guessed wrong before on famous open questions.
  2. The actual theorems about why {\mathsf{P}} is weaker than {\mathsf{NP}} are extremely hard and very weak.
  3. If {\mathsf{P} = \mathsf{NP}} it could still be that {\mathsf{SAT}} is computable in time {n^{2^{1000}}}.
  4. If {\mathsf{P \neq NP}} it could still be that {\mathsf{SAT}} is computable in time {n^{\log n}}—while the evidence cited by Scott is properly for an exponential lower bound.

We’ll address the full {\mathsf{P}} versus {\mathsf{NP}} question, and not situations where, say, the algorithm generating {\mathsf{SAT}} instances is restricted to {r(n)} random bits—a case in which we’ve noted that in the limit one can solve them all in something like time {\mathsf{poly}(n)2^{r(n)}}.

Bad Guesses

I have discussed guesses in mathematics many times before on this blog. One of the biggest issues in guessing wrong is that people do not take seriously the other possibility. Researchers tend not to work on showing {\mathsf{P} = \mathsf{NP}} anymore. Research support does not go there, since we all “know” that it would be a waste of time, and there are other consequences to the field.

Here are some famous guesses that were essentially off by exponentials. For each I will list the time gap between the initial problem being raised and being solved.

  • The Borsuk Conjecture [60 years]. It was claimed that {d+1} was the answer to this geometric conjecture. Many top people worked on it for years and partial results were proved, showing {d+1} is correct in many important cases. Then Gil Kalai and Jeff Kahn proved that {d+1} was slightly off; the correct answer is exponential—at least {c^{\sqrt{d}}} for some {c>1}, and possibly with {d} in the exponent.
  • Barrington’s Theorem [26 years]. I worked on lower bounds to show that bounded width computation could not compute the majority function. This was joint work with Ashok Chandra and Merrick Furst, which introduced multi-party communication complexity. Then David Barrington came along and proved that bounded width could do all of {\mathsf{NC}^{1}}. Hmmmm.
  • Law of Small Numbers [118 years].
    It was noticed by Gauss that

    \displaystyle  \pi(x) < li(x),

    for all reasonable size {x}. Here {li(x)} is the logarithmic integral

    \displaystyle  \int_{0}^{x}\frac{dt}{\ln t},

    which is an asymptotic approximation to {\pi(x)}, the number of primes less than {x}. It was conjectured that this would always hold and was widely believed for over a century. Then John Littlewood proved that the lead changes between {\pi(x)} and {li(x)} infinitely often, although the first switch is upper bounded by an immense number. Richard Guy wrote a wonderful article on what he called “The Strong Law of Small Numbers”: cases when evident phenomena held for small numbers but eventually would fail, and there is a whole table of further examples in this vein. (A quick numerical check of {\pi(x)} versus {li(x)} follows below.)
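
As promised, a quick numerical check, using SymPy (our tooling choice, not the post’s): at every scale one can feasibly test, {\pi(x)} sits below {li(x)}, which is exactly the trap the strong law warns about.

```python
from sympy import primepi, li

# pi(x) trails li(x) at every feasible scale, e.g. pi(10**6) = 78498
# while li(10**6) is about 78627.5 -- yet Littlewood showed the lead
# changes infinitely often, far beyond anything we can compute directly.
for x in [10**3, 10**4, 10**5, 10**6]:
    print(x, int(primepi(x)), float(li(x)))
```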

By the way the “common clock” on {\mathsf{P} \neq \mathsf{NP}} is 43 years.

Weak Theorems and Galactic Algorithms

We do have the theorem that {\mathsf{DTIME}(n)} is not equal to {\mathsf{NTIME}(n)}, which we have discussed before and which is particular to the multitape Turing machine model—make the tapes planes or trees and it goes away. We cannot even deduce from it that {\mathsf{DTIME}(n^{2}) \neq \mathsf{NTIME}(n^{2})}. That’s pretty weak. Remember that {\mathsf{P} \neq \mathsf{NP}} means that {\mathsf{DTIME}(n^{1000})} does not contain {\mathsf{NTIME}(n)}. And more, of course.

The {\mathsf{P}} versus {\mathsf{NP}} statement still allows outcomes that cut across its generally-understood significance. That is:

  • {\mathsf{P = NP}} still allows that the best algorithm for {\mathsf{SAT}} requires {n^{1,000}} time.
  • {\mathsf{P \neq NP}} still allows {\mathsf{SAT}} to have an {n^{\log^{*} n}}-time algorithm.

To be sure, some evidence cited by Scott is really for an exponential lower bound on {\mathsf{SAT}}; we have discussed this before too. But what we are saying still cuts against the usual argument that “many people have worked on looking for algorithms for {\mathsf{SAT}}.” Yes many have looked for algorithms, but most were interested in “real” practical algorithms. For this kind of quest there is not much difference between {n^{20}} and {2^{n}}.

Aram Harrow communicates to us a more-concrete version of this point, which also serves as a bridge to Ken’s musings.

Aram’s Bridging Thoughts

Quoting Aram, with light editing: “One of the stronger reasons for {\mathsf{P \neq NP}} is the Bayesian one—it is easier to find algorithms than lower bounds, so our failure to find a subexponential-time algorithm for {\mathsf{3SAT}} speaks louder than our failure to find a super-linear lower bound for {\mathsf{3SAT}}. A related way of expressing this is that before {\mathsf{NP}}-completeness was understood, thousands of researchers in disparate fields were unwittingly all trying to put {\mathsf{3SAT}} into {\mathsf{P}}.

But a counter-argument is that all of those seemingly independent researchers would always come up with algorithms that relied on a few kinds of structure—a lot is covered by just two kinds:

  • Convexity/absence of local minima—cases where greedy algorithms work.
  • Tree structure—cases where dynamic programming, divide-and-conquer, and related ideas work.

This paucity of algorithmic variety can be viewed in (at least) two ways:

  1. Algorithms, like oil deposits, are a finite resource. We’ve found all the important ones, and {\mathsf{3SAT}} is always going to require exponential time.
  2. The paucity is a sociological phenomenon, due to the fact that all researchers have learned essentially similar math backgrounds.

On the latter, Terry Tao’s recent breakthrough on the Navier-Stokes equations is an example of how much the same ideas keep recirculating, and how much more quickly progress can be made by cross-applying ideas rather than coming up with radically new ones. Going from Erwin Schrödinger’s equation to Peter Shor’s quantum factoring algorithm is a 60-minute lecture, but it took over 60 years (and a change in perspective coming from the computer revolution) to discover. Our lack of algorithms reveals only our lack of creativity, and it is arrogant to posit fundamental limits to mathematics just because we can’t solve a problem. Either way, the central question isn’t so much about {\mathsf{NP}} but rather a “throwback” of a question now being asked about quantum computers:

Where does the power of deterministic algorithms come from?

A related question is the power of the Lasserre hierarchy. It has been shown to be effective for a large number of problems, but with a surprisingly small number of truly different techniques. I would love to know whether further work will increase or decrease the number of ways in which we know how to use it; that is, either by discovering new methods or by unifying apparently different methods.”
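
As a concrete instance of the “tree structure” item in Aram’s list above, here is a minimal sketch (in Python; the choice of example and tooling is ours, not Aram’s) of maximum-weight independent set: {\mathsf{NP}}-hard on general graphs, yet solved exactly on trees by a two-state dynamic program.

```python
# Maximum-weight independent set: NP-hard in general, but on a tree a
# two-state dynamic program (take this node or skip it) solves it exactly.
def max_weight_independent_set(tree, weights, root=0):
    take, skip = {}, {}   # best totals with the subtree root taken / skipped

    def dfs(u, parent):
        take[u], skip[u] = weights[u], 0
        for v in tree[u]:
            if v != parent:
                dfs(v, u)
                take[u] += skip[v]                 # a taken node forces its children out
                skip[u] += max(take[v], skip[v])   # a skipped node leaves children free

    dfs(root, None)
    return max(take[root], skip[root])

# A path on five nodes; the optimum picks nodes 0, 2, 4 for total weight 10.
tree = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(max_weight_independent_set(tree, [4, 5, 2, 1, 4]))  # 10
```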

Ken’s Sixth World

The Lasserre hierarchy builds upon LP’s and SDP’s, and this brings us back to the intro. I (still Dick) remember many people in the 1970’s trying to prove that certain linear/convex programming problems were {\mathsf{NP}}-hard, despite all our confidence in the simplex algorithm for daily use. This makes Ken wonder:

What if SDP’s really were hard?

Russell Impagliazzo famously categorized five worlds that are consistent with current knowledge of complexity. Is there room to analyze any more worlds, ones that are inconsistent now but might have been meaningfully consistent had our field taken a different path?

All of Impagliazzo’s worlds—including ones with {\mathsf{P \neq NP}} and with {\mathsf{P = NP}}—have been instantiated via oracle results. All oracle results involve pretending that some ostensibly hard problem is easy. For instance, the world with {\mathsf{P = NP}} involves pretending {\mathsf{PSPACE}}-complete problems are easy, while known ones for {\mathsf{P \neq NP}} involve granting free access to a set coded so that the free access itself activates a diagonalization. What I (Ken) wonder is whether there is a sensible formal way to do “Reverse Oracle Results,” which pretend that some easy problem {X} is hard.

One known way to get this effect is to narrow the definition of “easy” so that {X} still has easy reductions to {X} from other problems {W}. For example, linear programming problems are P-complete under logspace (and even easier) reductions, as are problems of approximation by SDP’s. But here I mean something more structural—a sense in which {X} is the only route to solving a whole class of problems {Y}. Then we can segregate this entire class and pretend it all is hard. It might suffice to give “easy” reductions from {X} to all these {Y}. In particular, a lot of {Y}‘s are solved via the interior-point paradigm for (LP’s and) SDP’s. It could also employ ideas in reverse mathematics.

Countering Factuals with Counterfactuals

Scott replied to my comment about a possible {n^{20}} algorithm for {\mathsf{SAT}} in his post by referring to his earlier comment that:

Since it would be inelegant and unnatural for the class {\mathsf{P}} to be “severed into two” in this way, I’d say the much likelier possibility is simply that {\mathsf{P \neq NP}}.

Our point is, perhaps {\mathsf{P}} is already manifestly “severed into two” along Khachiyan’s and the interior-point fault-lines. In particular, we wonder how the following consequences would have looked as conditional results had they been proved in the late 1970’s:

Theorem 1 If {\mathsf{SDP = P}}, then it is possible to compute in polynomial time close approximations to a function {\vartheta(G)} of undirected graphs {G} that is sandwiched between the {\mathsf{NP}}-complete clique number {\omega(G)} and chromatic number {\chi(G)}, which all coincide whenever neither {G} nor its complement has an odd induced cycle of length five or more.
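
The sandwich can be watched numerically. Here is a minimal sketch computing the Lovász theta function of the 5-cycle from its standard SDP formulation, using cvxpy and its bundled solver (our tooling choice, not the post’s); since {C_5} is self-complementary, the value {\sqrt{5} \approx 2.236} indeed sits between {\omega(C_5) = 2} and {\chi(C_5) = 3}.

```python
import cvxpy as cp

# Lovász theta of the 5-cycle: maximize the total sum of a PSD matrix
# with unit trace that vanishes on the edges of the graph.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]   # C5, self-complementary

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.trace(X) == 1] + [X[i, j] == 0 for i, j in edges]
theta = cp.Problem(cp.Maximize(cp.sum(X)), constraints).solve()
print(theta)  # about 2.2360..., i.e. sqrt(5), strictly between 2 and 3
```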

Theorem 2 If {\mathsf{SDP = P}}, then {\mathsf{MAX CUT}} is polynomial-time approximable within {0.87856\dots}, even though it is {\mathsf{NP}}-hard to approximate within any factor better than {0.941\dots}.
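
For flavor, here is a minimal sketch of the Goemans-Williamson recipe behind Theorem 2, again with cvxpy and NumPy as assumed tooling: solve the SDP relaxation over unit vectors, then round by a random hyperplane, which in expectation retains at least {0.87856\dots} of the relaxation’s value.

```python
import cvxpy as cp
import numpy as np

# A small illustrative graph: a 5-cycle plus one chord.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]

# SDP relaxation: X is the Gram matrix of unit vectors v_1, ..., v_n;
# each edge contributes (1 - v_i . v_j) / 2 to the relaxed cut value.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0] + [X[i, i] == 1 for i in range(n)]
objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
cp.Problem(objective, constraints).solve()

# Random-hyperplane rounding: recover vectors from X, split them by the
# sign of a random projection, and count the edges crossing the split.
w, V = np.linalg.eigh(X.value)
vectors = V * np.sqrt(np.clip(w, 0, None))     # rows are the v_i
signs = np.sign(vectors @ np.random.randn(n))
cut = sum(1 for i, j in edges if signs[i] != signs[j])
print(round(objective.value, 3), cut)
```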

In the first theorem we have also inverted time by embracing the Strong Perfect Graph Theorem, which was proved in 2002 (and finalized in 2006); but the conjecture was already strongly believed, and it makes transparent that many important families of graphs are perfect. Hence the sweeping tractability of major {\mathsf{NP}}-complete problems on these cases could have come as a surprise. On the second theorem, why should the difference between {0.878} and {0.9412} matter to such a simple problem as {\mathsf{MAX CUT}}?

Of course in the light of knowledge we understand how these two famous theorems work. On the latter the Unique Games Conjecture already helps explain how {0.87856\dots} may be special. But the present exercise is about how we reason when we don’t (yet) have the light of knowledge.

Open Problems

Can we make some formal sense of a world where Khachiyan’s breakthrough never happens?

Update 3/18/14: The comments section has some excellent replies, including ones on the argument of the last two sections by Scott Aaronson and by Timothy Gowers.

Update 3/17/14: The Lovász theta-function example replaced the post’s original quoting of Leslie Valiant’s first “Accidental Algorithm” under a mis-memory that LP’s were involved (and SDP’s in later developments). Snipping the irrelevant qualifier, it reads:

Theorem 3 Counting assignments to a certain class of planar formulas is deterministic polynomial-time computable modulo {7}, or modulo any Mersenne prime, even though it is {\mathsf{NP}}-hard modulo {3} or {5}.

Aside from the fact that this computation belongs to a class called {\mathsf{GapL}} which is commonly believed to be a proper subclass of {\mathsf{P}}, the intent was to argue: “What do Mersenne primes have to do with matchings and convex programming really? Surely counting must be equally hard modulo any odd prime—after all there’s no exception for Mersenne in modular circuit lower bounds—so {\mathsf{SDP \neq P}} must be an ‘invisible fence’ around this kind of ridiculousness.” The intro and Borsuk statements were also amended.
