Terence Tao: GPT-o1 nearing "competent grad student" usefulness

Terence Tao (@tao@mathstodon.xyz)

https://mathstodon.xyz/@tao/113132502735585408

I have played a little bit with OpenAI's new iteration of GPT, GPT-o1, which performs an initial reasoning step before running the LLM. It is certainly a more capable tool than previous iterations, though still struggling with the most advanced research mathematical tasks. Here are some concrete experiments (with a prototype version of the model that I was granted access to). In https://chatgpt.com/share/2ecd7b73-3607-46b3-b855-b29003333b87 I repeated an experiment from https://mathstodon.xyz/@tao/109948249160170335 in which I asked GPT to answer a vaguely worded mathematical query which could be solved by identifying a suitable theorem (Cramer's theorem) from the literature. Previously, GPT was able to mention some relevant concepts but the details were hallucinated nonsense. This time around, Cramer's theorem was identified and a perfectly satisfactory answer was given. (1/3)

The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, graduate student. However, this was an improvement over previous models, whose capability was closer to an actually incompetent graduate student. It may only take one or two further iterations of improved capability (and integration with other tools, such as computer algebra packages and proof assistants) until the level of "competent graduate student" is reached, at which point I could see this tool being of significant use in research level tasks.

NegentropicBoy

O1 is (apparently) different according to some videos I watched, as it pulls apart the question and does some reasoning steps.

aodhsishaj

I'd love to see one of those videos

jsomae

like, a video of Tao giving a demonstration?

aodhsishaj

@NegentropicBoy wrote:

O1 is (apparently) different according to some videos I watched, as it pulls apart the question ...

Yes

technocrit

"does some reasoning steps."

The people who believe in "AI" say the wackiest things.

jsomae

LLMs are basically just good pattern matchers. But just like how A* search can find a better path than a human can by breaking the problem down into simple steps, so too can an LLM make progress on an unsolved problem if it's used properly and combined with a formal reasoning engine.

I'm going to be real with you: the big insight behind almost all new mathematical ideas is based on the math that came before. Nothing is truly original the way AI detractors seem to believe.

By "does some reasoning steps," OpenAI presumably are just invoking the LLM iteratively so that it can review its own output before providing a final answer. It's not a new idea.
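To make the "review its own output" idea concrete, here is a minimal sketch of a draft-then-review loop. This is only a guess at the general pattern, not OpenAI's actual method, which isn't described anywhere in this thread. The names `answer_with_review`, `toy_draft`, and `toy_review` are made up for illustration, and the two toy functions are plain stand-ins for what would be model calls.

```python
from typing import Callable, Optional


def answer_with_review(
    question: str,
    draft: Callable[[str, Optional[str]], str],
    review: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Draft an answer, then let a review pass request revisions.

    `draft` and `review` are hypothetical callables; in a real system
    both would presumably be calls to the same language model.
    """
    feedback: Optional[str] = None
    answer = ""
    for _ in range(max_rounds):
        # Produce a draft, taking any earlier feedback into account.
        answer = draft(question, feedback)
        # The review returns None when it has no objections.
        feedback = review(question, answer)
        if feedback is None:
            break
    return answer


# Toy stand-ins (assumptions, not real model calls): the first draft
# is wrong, and the draft is corrected once feedback is supplied.
def toy_draft(question: str, feedback: Optional[str]) -> str:
    return "4" if feedback else "5"


def toy_review(question: str, answer: str) -> Optional[str]:
    return None if answer == "4" else "Check the arithmetic."


result = answer_with_review("What is 2 + 2?", toy_draft, toy_review)
print(result)  # prints 4
```

The `max_rounds` cap matters: a reviewer that never approves would otherwise loop forever, so the loop simply returns the last draft once the budget runs out.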

tee9000

It's what ChatGPT calls it.