clumma's blurblog

Sora 2 is here
Wednesday October 1st, 2025 at 1:39 PM

OpenAI News

Our latest video generation model is more physically accurate, realistic, and controllable than prior systems. It also features synchronized dialogue and sound effects. Create with it in the new Sora app.

clumma

1 day ago

Berkeley, CA

The QMA Singularity by Scott
Sunday September 28th, 2025 at 12:01 PM

Shtetl-Optimized

A couple days ago, Freek Witteveen of CWI and I posted a paper to the arXiv called “Limits to black-box amplification in QMA.” Let me share the abstract:

We study the limitations of black-box amplification in the quantum complexity class QMA. Amplification is known to boost any inverse-polynomial gap between completeness and soundness to exponentially small error, and a recent result (Jeffery and Witteveen, 2025) shows that completeness can in fact be amplified to be doubly exponentially close to 1. We prove that this is optimal for black-box procedures: we provide a quantum oracle relative to which no QMA verification procedure using polynomial resources can achieve completeness closer to 1 than doubly exponential, or a soundness which is super-exponentially small. This is proven by using techniques from complex approximation theory, to make the oracle separation from (Aaronson, 2008), between QMA and QMA with perfect completeness, quantitative.

You can also check out my PowerPoint slides here.

To explain the context: QMA, or Quantum Merlin Arthur, is the canonical quantum version of NP. It’s the class of all decision problems for which, if the answer is “yes,” then Merlin can send Arthur a quantum witness state that causes him to accept with probability at least 2/3 (after a polynomial-time quantum computation), while if the answer is “no,” then regardless of what witness Merlin sends, Arthur accepts with probability at most 1/3. Here, as usual in complexity theory, the constants 2/3 and 1/3 are just conventions, which can be replaced (for example) by 1-2⁻ⁿ and 2⁻ⁿ using amplification.
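To make the amplification remark concrete, here is a minimal sketch of the classical picture behind it: repeat the verification k times and take a majority vote. It assumes the k runs are independent, which is a simplification; in QMA the witness copies Merlin sends may be entangled, so the real soundness argument needs more care. The function name `majority_error` is made up for illustration.

```python
from math import comb, exp


def majority_error(p_accept: float, k: int) -> float:
    """Probability that at most half of k independent runs accept,
    when each run accepts with probability p_accept.

    With p_accept = 2/3 this is the completeness error of a majority
    vote over k runs (assumption: the runs are independent).
    """
    q = 1.0 - p_accept
    # Binomial tail: sum over outcomes with i <= k // 2 accepting runs.
    return sum(comb(k, i) * p_accept**i * q ** (k - i) for i in range(k // 2 + 1))


if __name__ == "__main__":
    for k in (1, 11, 51, 101, 201):
        err = majority_error(2 / 3, k)
        # Hoeffding's inequality gives err <= exp(-k / 18) for a 2/3 vs 1/2 gap.
        print(f"k={k:4d}  error={err:.3e}  Hoeffding bound={exp(-k / 18):.3e}")
```

The error shrinks exponentially in k, so polynomially many repetitions are enough to reach error 2⁻ⁿ. By symmetry, the same tail bounds the soundness error when each run accepts with probability at most 1/3.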

A longstanding open problem about QMA—not the biggest problem, but arguably the most annoying—has been whether the 2/3 can be replaced by 1, as it can be for classical MA for example. In other words, does QMA = QMA₁, where QMA₁ is the subclass of QMA that admits protocols with “perfect completeness”? In 2008, I used real analysis to show that there’s a quantum oracle relative to which QMA ≠ QMA₁, which means that any proof of QMA = QMA₁ would need to use “quantumly nonrelativizing techniques” (not at all an insuperable barrier, but at least we learned something about why the problem is nontrivial).

Then came a bombshell: in June, Freek Witteveen and longtime friend-of-the-blog Stacey Jeffery released a paper showing that any QMA protocol can be amplified, in a black-box manner, to have completeness error that’s doubly exponentially small, 1/exp(exp(n)). They did this via a method I never would’ve thought of, wherein a probability of acceptance is encoded via the amplitudes of a quantum state that decrease in a geometric series. QMA, it turned out, was an old friend that still had surprises up its sleeve after a quarter-century.

In August, we had Freek speak about this breakthrough by Zoom in our quantum group meeting at UT Austin. Later that day, I asked Freek whether their new protocol was the best you could hope to do with black-box techniques, or whether for example one could amplify the completeness error to be triply exponentially small, 1/exp(exp(exp(n))). About a week later, Freek and I had a full proof written down that, using black-box techniques, doubly-exponentially small completeness error is the best you can do. In other words: we showed that, when one makes my 2008 QMA ≠ QMA₁ quantum oracle separation quantitative, one gets a lower bound that precisely matches Freek and Stacey’s protocol.

All this will, I hope, interest and excite aficionados of quantum complexity classes, while others might have very little reason to care.

But here’s a reason why other people might care. This is the first paper I’ve ever put out for which a key technical step in the proof of the main result came from AI—specifically, from GPT5-Thinking. Here was the situation: we had an N×N Hermitian matrix E(θ) (where, say, N=2ⁿ), each of whose entries was a poly(n)-degree trigonometric polynomial in a real parameter θ. We needed to study the largest eigenvalue of E(θ), as θ varied from 0 to 1, to show that this λ_max(E(θ)) couldn’t start out close to 0 but then spend a long time “hanging out” ridiculously close to 1, like 1/exp(exp(exp(n))) close for example.

Given a week or two to try out ideas and search the literature, I’m pretty sure that Freek and I could’ve solved this problem ourselves. Instead, though, I simply asked GPT5-Thinking. After five minutes, it gave me something confident, plausible-looking, and (I could tell) wrong. But rather than laughing at the silly AI like a skeptic might do, I told GPT5 how I knew it was wrong. It thought some more, apologized, and tried again, and gave me something better. So it went for a few iterations, much like interacting with a grad student or colleague. Within a half hour, it had suggested looking at the function

$$ Tr[(I-E(\theta))^{-1}] = \sum_{i=1}^N \frac{1}{1-\lambda_i(\theta)}. $$

It pointed out, correctly, that this was a rational function in θ of controllable degree, that happened to encode the relevant information about how close the largest eigenvalue λ_max(E(θ)) is to 1. And this … worked, as we could easily check ourselves with no AI assistance. And I mean, maybe GPT5 had seen this or a similar construction somewhere in its training data. But there’s not the slightest doubt that, if a student had given it to me, I would’ve called it clever. Obvious with hindsight, but many such ideas are.
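As a toy illustration of why this function is convenient (not the construction from the paper), the sketch below checks the identity on a hypothetical 2×2 real symmetric E(θ) whose entries are low-degree trigonometric polynomials chosen arbitrarily so that all eigenvalues stay below 1. For a 2×2 matrix the trace of the inverse can be written as Tr(adj(I−E))/det(I−E), a ratio of polynomials in the entries, and no eigenvalue computation is needed to evaluate it.

```python
from math import cos, sin, pi, sqrt


def E(theta):
    """Hypothetical E(theta) = [[a, b], [b, d]] with trigonometric-polynomial entries."""
    a = 0.5 + 0.3 * cos(2 * pi * theta)
    d = 0.4 + 0.2 * sin(2 * pi * theta)
    b = 0.1 * cos(4 * pi * theta)
    return a, b, d


def eigenvalues(theta):
    """Closed-form eigenvalues (smallest, largest) of the symmetric 2x2 matrix."""
    a, b, d = E(theta)
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b**2)
    return mean - radius, mean + radius


def trace_resolvent(theta):
    """Tr[(I - E)^{-1}] computed as Tr(adj(I - E)) / det(I - E), without eigenvalues."""
    a, b, d = E(theta)
    numerator = (1 - a) + (1 - d)
    denominator = (1 - a) * (1 - d) - b * b
    return numerator / denominator


if __name__ == "__main__":
    for step in range(11):
        theta = step / 10
        lo, hi = eigenvalues(theta)
        direct = 1 / (1 - lo) + 1 / (1 - hi)
        print(f"theta={theta:.1f}  lambda_max={hi:.4f}  "
              f"Tr={trace_resolvent(theta):.6f}  sum={direct:.6f}")
```

When every eigenvalue is below 1, each term of the sum is positive, so the trace lies between 1/(1−λ_max) and N/(1−λ_max). That is the sense in which it tracks how close λ_max gets to 1.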

I had tried similar problems a year ago, with the then-new GPT reasoning models, but I didn’t get results that were nearly as good. Now, in September 2025, I’m here to tell you that AI has finally come for what my experience tells me is the most quintessentially human of all human intellectual activities: namely, proving oracle separations between quantum complexity classes. Right now, it almost certainly can’t write the whole research paper (at least if you want it to be correct and good), but it can help you get unstuck if you otherwise know what you’re doing, which you might call a sweet spot. Who knows how long this state of affairs will last? I guess I should be grateful that I have tenure.

clumma

4 days ago

Berkeley, CA

Gemini Robotics 1.5 brings AI agents into the physical world
Thursday September 25th, 2025 at 4:04 PM

Google DeepMind Blog

New era of physical agents will help robots perceive, plan, think, use tools and act to solve complex tasks.

clumma

7 days ago

Berkeley, CA

Measuring the performance of our models on real-world tasks
Thursday September 25th, 2025 at 4:04 PM

OpenAI News

OpenAI introduces GDPval, a new evaluation that measures model performance on real-world economically valuable tasks across 44 occupations.

clumma

7 days ago

Berkeley, CA

Introducing ChatGPT Pulse
Thursday September 25th, 2025 at 4:04 PM

OpenAI News

Today we're releasing a preview of ChatGPT Pulse to Pro users on mobile. Pulse is a new experience where ChatGPT proactively does research to deliver personalized updates based on your chats, feedback, and connected apps like your calendar.

clumma

7 days ago

Berkeley, CA

Cap'n Web: a new RPC system for browsers and web servers by jgrahamc
Thursday September 25th, 2025 at 4:04 PM

Hacker News - beefman's favorites

Article URL: https://blog.cloudflare.com/capnweb-javascript-rpc-library/

Comments URL: https://news.ycombinator.com/item?id=45332883

Points: 634

# Comments: 291

clumma

7 days ago

Berkeley, CA
