
It's fascinating that this has run into the exact same problem as the quantum research. I.e., in the quantum research, to demonstrate any valuable forward progress you must compute something that is impossible to do with a traditional computer. If you can't do it with a traditional computer, it suddenly becomes difficult to verify correctness (i.e., you can't just check that it matches the traditional computer's answer).

In the same way, ChatGPT scores 25% on this and the question is "How close were those 25% to questions in the training set?". Or to put it another way, we want to answer the question "Is ChatGPT getting better at applying its reasoning to out-of-set problems, or is it pulling more data into its training set?". Or "Is the test leaking into the training?".

Maybe the whole question is academic and it doesn't matter, we solve the entire problem by pulling all human knowledge into the training set and that's a massive benefit. But maybe it implies a limit to how far it can push human knowledge forward.
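On the leakage question above, one crude way to ask "how close was this test question to something in the training set" is word n-gram overlap. The sketch below is only an illustration under stated assumptions: the function names, the 5-gram window, and the toy strings are all hypothetical, and this is not a description of how any particular lab checks for contamination.

```python
def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(test_item, training_docs, n=5):
    """Fraction of the test item's n-grams that also occur in any training doc.

    A value near 1.0 suggests the item (or a close copy) was seen in training;
    a value near 0.0 only says there is no verbatim overlap, not that the
    problem is genuinely novel.
    """
    test_grams = ngrams(test_item, n)
    if not test_grams:
        return 0.0
    train_grams = set()
    for doc in training_docs:
        train_grams |= ngrams(doc, n)
    return len(test_grams & train_grams) / len(test_grams)


# Hypothetical toy data, purely for illustration.
training_docs = ["find the smallest positive integer n such that n squared ends in 44"]
leaked = "Find the smallest positive integer n such that n squared ends in 44"
fresh = "prove that every bounded monotone sequence of real numbers converges to a limit"

leaked_score = overlap_fraction(leaked, training_docs)
fresh_score = overlap_fraction(fresh, training_docs)
```

A paraphrased problem would slip past a check like this, which is part of why the question is hard to settle from the outside.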



>in the quantum research to demonstrate any valuable forward progress you must compute something that is impossible to do with a traditional computer

This is factually wrong. The most interesting problems motivating the quantum computing research are hard to solve, but easy to verify on classical computers. The factorization problem is the most classical example.

The problem is that existing quantum computers are not powerful enough to solve the interesting problems, so researchers have to invent semi-artificial problems to demonstrate "quantum advantage" to keep the funding flowing.

There is a plethora of opportunities for LLMs to show their worth. For example, finding interesting links between different areas of research or being a proof assistant in a math/programming formal verification system. There is a lot of ongoing work in this area, but at the moment the signal-to-noise ratio of such tools is too low for them to be practical.


No, it is factually right, at least if Scott Aaronson is to be believed:

> Having said that, the biggest caveat to the “10^25 years” result is one to which I fear Google drew insufficient attention. Namely, for the exact same reason why (as far as anyone knows) this quantum computation would take ~10^25 years for a classical computer to simulate, it would also take ~10^25 years for a classical computer to directly verify the quantum computer’s results!! (For example, by computing the “Linear Cross-Entropy” score of the outputs.) For this reason, all validation of Google’s new supremacy experiment is indirect, based on extrapolations from smaller circuits, ones for which a classical computer can feasibly check the results. To be clear, I personally see no reason to doubt those extrapolations. But for anyone who wonders why I’ve been obsessing for years about the need to design efficiently verifiable near-term quantum supremacy experiments: well, this is why! We’re now deeply into the unverifiable regime that I warned about.

https://scottaaronson.blog/?p=8525


It's a property of the "semi-artificial" problem chosen by Google. If anything, it means that we should heavily discount this claim of "quantum advantage", especially in light of the inherently probabilistic nature of quantum computations.

Note that the OP wrote "you MUST compute something that is impossible to do with a traditional computer". I demonstrated a simple counter-example to this statement: you CAN demonstrate forward progress by factorizing big numbers, but the problem is that no one can do it despite billions in investment.


Apparently they can't, right now, as you admit. Anyway this is turning into a stupid semantic argument, have a nice day.


If they can't, then is it really quantum supremacy?

They claimed it last time in 2019 with Sycamore, which could perform in 200 seconds a calculation that Google claimed would take a classical supercomputer 10,000 years.

That was debunked when a team of scientists replicated the same thing on an ordinary computer in 15 hours with a large number of GPUs. Scott Aaronson said that on a supercomputer, the same technique would have solved the problem in seconds.[1]

So if they now come up with another problem which they say cannot even be verified by a classical computer and use it to claim quantum advantage, then it is right to be suspicious of that claim.

1. https://www.science.org/content/article/ordinary-computers-c...


> If they can't, then is it really quantum supremacy?

Yes, quantum supremacy on an artificial problem is quantum supremacy (even if it's "this quantum computer can simulate itself faster than a classical computer"). Quantum supremacy on problems that are easy to verify would of course be nicer, but unfortunately not all problems happen to have an easy verification.


That applies specifically to this artificial problem Google created to be hard for classical computers, and in the end it turned out not to be so hard. IBM came up with a method to do what Google said would take 10,000 years on a classical computer in just 2 days. I would not be surprised if a similar reduction also happened to their second attempt, if anyone was motivated enough to look at it.

In general we have thousands of optimisation problems that are hard to solve but immediate to verify.


The unverifiable regime is a great way to extract funding.


> This is factually wrong.

What's factually wrong about it? OP said "you must compute something that is impossible to do with a traditional computer" which is true, regardless of the output produced. Verifying an output is very different from verifying the proper execution of a program. The difference between testing a program and seeing its code.

What is being computed is fundamentally different from classical computers, therefore the methods for verifying proper adherence to instructions become increasingly complex.


They left out the key part, which was the incorrect bit: the sentence right after, "If you can't do it with a traditional computer, it suddenly becomes difficult to verify correctness".

The point stands that for actually interesting problems verifying correctness of the results is trivial. I don't know if "adherence to instructions" translates at all to quantum computing.


> This is factually wrong. The most interesting problems motivating the quantum computing research are hard to solve, but easy to verify on classical computers.

Your parent did not talk about quantum computers. I guess he rather had predictions of novel quantum-field theories or theories of quantum gravity in the back of his mind.


Then his comment makes even less sense.


I agree that "is the test dataset leaking into the training dataset" is an issue when interpreting LLM capabilities in novel contexts, but I'm not sure I follow what you mean on the quantum computing front.

My understanding is that many problems have solutions that are easier to verify than to solve using classical computing. e.g. prime factorization
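To make the solve/verify gap concrete, here is a minimal Python sketch. The function names are hypothetical, and trial division is only a stand-in for "expensive classical search", not a serious factoring algorithm. The check is a handful of multiplications, while the search loop grows with roughly the square root of the number in the worst case.

```python
def verify_factorization(n, factors):
    """Cheap check: multiply the claimed factors and compare with n.

    Note: this confirms the product only; it does not check that each
    factor is prime.
    """
    product = 1
    for f in factors:
        if f <= 1:
            return False
        product *= f
    return product == n


def factor_by_trial_division(n):
    """Expensive search: tries divisors up to sqrt(n) in the worst case."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


# Small illustrative numbers; real cryptographic moduli are hundreds of digits.
n = 101 * 103
found = factor_by_trial_division(n)
ok = verify_factorization(n, found)
```

For a number with hundreds of digits the verification step stays fast, while the search above becomes hopeless, which is the asymmetry the comment is pointing at.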


Oh, it's a totally different issue on the quantum side that leads to the same difficulty with verification. There, the algorithms that Google, for example, is using today aren't like prime factorization; they're not easy to directly verify with traditional computers. So as far as I'm aware, they check the result for a suitably small run, and then do the performance metrics on a large run that they hope gave a correct answer but aren't able to directly verify.
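The "check a suitably small run" step can be sketched with the linear cross-entropy score mentioned in the Aaronson quote upthread, which for n qubits is 2^n times the average ideal probability of the sampled bitstrings, minus 1. Everything else below is an assumption for illustration: the "ideal" distribution is random toy data standing in for a classically simulated circuit, and the function names are hypothetical.

```python
import random


def linear_xeb(ideal_probs, samples, n_qubits):
    """Linear cross-entropy score: 2^n * mean(p_ideal(x)) - 1."""
    mean_p = sum(ideal_probs[x] for x in samples) / len(samples)
    return (2 ** n_qubits) * mean_p - 1


def toy_ideal_distribution(n_qubits, rng):
    """Toy stand-in for a simulated circuit's output distribution:
    exponentially distributed weights, normalised to sum to 1."""
    weights = [rng.expovariate(1.0) for _ in range(2 ** n_qubits)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
n = 10  # small enough that every ideal probability can be stored
probs = toy_ideal_distribution(n, rng)
outcomes = list(range(2 ** n))

# A "device" that samples the ideal distribution faithfully...
good_samples = rng.choices(outcomes, weights=probs, k=20000)
# ...versus one that outputs pure noise (uniform bitstrings).
noise_samples = rng.choices(outcomes, k=20000)

xeb_good = linear_xeb(probs, good_samples, n)
xeb_noise = linear_xeb(probs, noise_samples, n)
```

With this toy setup the faithful sampler should score well above zero and the noisy one close to zero. The catch is that computing `probs` needs all 2^n ideal probabilities, which is exactly what stops being feasible at the sizes where the supremacy claim is made.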


If constrained by existing human knowledge to come up with an answer, won’t it fundamentally be unable to push human knowledge forward?


Depends on your understanding of human knowledge I guess? People talk about the frontier of human knowledge and if your view of knowledge is like that of a unique human genius pushing forward the frontier then yes - it'd be stuck. But if you think of knowledge as more complex than that you could have areas that are kind of within our frontier of knowledge (that we could reasonably know, but don't actually know) - taking concepts that we already know in one field and applying them to some other field. Today the reason that doesn't happen is because genius A in physics doesn't know about the existence of genius B in mathematics (let alone understand their research), but if it's all imbibed by "The Model" then it's trivial to make that discovery.


I was referring specifically to the parent comment's statements around current AI systems.


Reasoning is essentially the creation of new knowledge from existing knowledge. The better the model can reason the less constrained it is to existing knowledge.

The challenge is how to figure out if a model is genuinely reasoning.


Reasoning is a very minor (but essential) part of knowledge creation.

Knowledge creation comes from collecting data from the real world, and cleaning it up somehow, and brainstorming creative models to explain it.

NN/LLM's version of model building is frustrating because it is quite good, but not highly "explainable". Human models have higher explainability, while machine models have high predictive value on test examples due to an impenetrable mountain of algebra.


There are likely lots of connections that could be made that no individual has made because no individual has all of existing human knowledge at their immediate disposal.


I don't think many expect AI to push knowledge forward? A thing that basically just regurgitates consensus historic knowledge seems badly suited to that


But apparently these new frontier models can 'reason' - so with that logic, they should be able to generate new knowledge?


O1 was able to find the math problem in a recently published paper, so yes.


Then much of human research and development is also fundamentally impossible.


Only if you think current "AI" is on the same level as human creativity and intelligence, which it clearly is not.


I think current "AI" (i.e. LLMs) is unable to push human knowledge forward, but not because it's constrained by existing human knowledge. It's more like peeking into a very large magic 8-ball: new answers every time you shake it. Some useful.


That's not a strong reason. Yes, that means ChatGPT isn't good at wholly independently pushing knowledge forward, but a good brainstormer that is right even 10% of the time is an incredible fount of knowledge.


It may be able to push human knowledge forward to an extent.

In the past, there was quite a bit of low hanging fruit such that you could have polymaths able to contribute to a wide variety of fields, such as Newton.

But in the past 100 years or so, the problem is that there is so much known that it is impossible for any single person to have deep knowledge of everything. E.g. it's rare to find a really good mathematician who also has a deep knowledge (beyond intro courses) of, say, chemistry.

Would a sufficiently powerful AI / ML model be able to come up with this synthesis across fields?


How much of this could be resolved if its training set were reduced? Conceivably, most of the training serves only to confuse the model when only aiming to solve a math equation.



