SCIENCE

OpenAI’s Q* Is BACK! – Was AGI Just Solved?

TheAIGRID | October 26, 2025



Learn AI with me – https://www.skool.com/postagiprepardness
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website – https://theaigrid.com/

00:12 New Q* Paper
01:27 Q*
03:09 New Paper
05:17 AlphaGo Explained
08:35 AlphaGo Search
10:59 AlphaCode 2 + Search
14:24 Noam Brown on MCTS
17:59 Sam Altman Hints at Search
19:15 New AGI Approach
20:01 AGI Benchmark
22:20 AGI Benchmark Solved?
24:40 Limits
29:05 Predictions for Future

Links From Today’s Video:
https://www.youtube.com/watch?v=JVKG6K5203I
https://www.youtube.com/watch?v=WXuK6gekU1Y
https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf
https://x.com/DrJimFan/status/1728849509416145020
https://www.youtube.com/watch?v=2oHH4aClJQs
https://www.theinformation.com/articles/openai-made-an-ai-breakthrough-before-altman-firing-stoking-excitement-and-concern?rc=0g0zvw
https://arxiv.org/pdf/2406.07394

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

Written by TheAIGRID

Comments

This post currently has 26 comments.

  1. @davidhardy3074

    October 26, 2025 at 6:51 am

Q* might be true? Okay, I can explain some misconceptions here. Q* is Q-learning combined with the A* algorithm in a framework that allows a model to predict the highest-likelihood reward outcome (like humans: is it worth it? No? Then I won't do it) instead of the next token in a sequence. This is not hidden information – this is open knowledge, and this is how Strawberry creates CoT and does compute for output. Q* is what the model was called before it was distilled and quantized into Strawberry.
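[Editor's note: the commenter's reading of Q* – Q-learning supplying value estimates, with an A*-style priority deciding what to expand next instead of decoding tokens left to right – can be sketched on a toy problem. Everything below (the chain environment, the heuristic, the name q_star_search) is invented for illustration; this is one public interpretation, not OpenAI's actual system.]

```python
import heapq

# Toy 1-D chain: states 0..4, reward 1.0 only on reaching the goal.
# Tabular Q-learning provides the value estimates; an A*-style
# priority queue (heuristic minus best learned value) picks which
# state to expand next, i.e. "is it worth it?" before acting.
GOAL, ALPHA, GAMMA = 4, 0.5, 0.9
ACTIONS = (-1, +1)                      # step left / step right

def step(state, action):
    nxt = max(0, min(GOAL, state + action))
    return nxt, (1.0 if nxt == GOAL else 0.0)   # reward only at the goal

def heuristic(state):
    return GOAL - state                 # steps remaining (admissible)

def q_star_search(episodes=50):
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        frontier = [(heuristic(0), 0)]  # (priority, state), best first
        seen = set()
        while frontier:
            _, s = heapq.heappop(frontier)
            if s in seen or s == GOAL:
                continue
            seen.add(s)
            for a in ACTIONS:
                nxt, r = step(s, a)
                best_next = max(q[(nxt, b)] for b in ACTIONS)
                # standard Q-learning update toward r + gamma * max Q
                q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
                # A* flavour: expand the most promising successor first
                heapq.heappush(frontier, (heuristic(nxt) - best_next, nxt))
    return q

q = q_star_search()
```

After training, moving toward the goal out-scores moving away in every state, so the learned Q-values encode the "worth it?" judgment the commenter describes before any action is taken.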

  2. @MikeyDavis

    October 26, 2025 at 6:51 am

    LLM x LMM is the secret.

    The math models will serve as a glue that underlies the language models, now going from language to reason.

    All reason is, is math spoken linguistically.

  3. @pmiddlet72

    October 26, 2025 at 6:51 am

To be fair, the tool integrated with the LLM is more than just 'search'. Monte Carlo simulation, the way they're using it, is two-fold: 1) generating spaces of possible outcomes within s steps of the game, given its current state space; 2) generating code samples with some level of noise (quasi-randomness) to present a (likely non-exhaustive) number of possible states of the game, from which the LLM can take a sequence of positions (I thought I heard AlphaGo had been looking at 50 moves at a time) against a final binary outcome (win/loss), optimize the loss function (so it learns), and then examine de novo board configurations so that it can learn from those as well. This fits well into Bayesian-esque learning (I'm not even sure the '-esque' is appropriate here, as this is almost textbook use of Bayesian methods and simulation to make reasonable decisions that optimize, in this case, win/loss ratio). The sequences of positions are also somewhat hierarchical (strong vs. weak plays, and weak plays as setups to strong plays).

I do wonder, however, given the amount of time they had to train this model, if the training burden might have been reduced by allowing the model to work from the discriminative on, say, 'medium'-level players who have a lower win/loss rate, to the generative, where the sample of losses was a better catalyst for improvement. The synthetic data it would have produced would arguably have been a far more balanced training set to improve itself upon, versus the more 'unbalanced' data sets exhibited by the highest-level players (i.e. many wins, few losses to learn from). This assumes the MCT Self-refine method doesn't look at both sides of the board for each round of player moves.
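[Editor's note: the selection/expansion/rollout/backpropagation loop this comment gestures at can be sketched on a toy game. This is a generic UCT implementation on a made-up Nim variant, not AlphaGo's or AlphaCode's actual system.]

```python
import math, random

# Toy Nim: players alternate taking 1 or 2 stones; whoever takes the
# last stone wins.  Node.wins is counted from the perspective of the
# player who made the move *into* that node (negamax convention).
class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children, self.wins, self.visits = [], 0.0, 0

    def untried(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2) if m <= self.stones and m not in tried]

def uct_child(node, c=1.4):
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(stones, iters=2000):
    root = Node(stones)
    for _ in range(iters):
        node = root
        while not node.untried() and node.children:      # 1. selection
            node = uct_child(node)
        moves = node.untried()
        if moves:                                        # 2. expansion
            m = random.choice(moves)
            node.children.append(Node(node.stones - m, node, m))
            node = node.children[-1]
        pile, side, result = node.stones, 0, 1           # 3. random rollout
        while pile > 0:          # side 0 = opponent of node's mover
            pile -= random.choice([m for m in (1, 2) if m <= pile])
            result, side = side, side ^ 1   # last taker wins
        while node:                                      # 4. backpropagation
            node.visits += 1
            node.wins += result
            node, result = node.parent, 1 - result       # flip perspective
    return max(root.children, key=lambda ch: ch.visits).move

random.seed(0)
best = mcts(4)   # optimal play from 4 stones is to take 1, leaving 3
```

The statistics sharpen exactly as the comment suggests: noisy random playouts are cheap, and the win/loss backups gradually concentrate visits on the strong lines while weak plays survive only as setups explored along the way.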

  4. @christopheraaron2412

    October 26, 2025 at 6:51 am

200 times fewer parameters with even better performance translates into reduced compute and energy consumption. Seeing that that's going to be the big bottleneck, and possibly the top of the S-curve, anything that improves performance through compression is going to be better.

  5. @Roskellan

    October 26, 2025 at 6:51 am

What a fascinating horror story playing out in real time. The villain: an artificial intelligence with intelligence, access, and the ability to act independently towards its own goals. Humanity willingly, comfortably working towards its own demise, even at an individual level. Life getting increasingly uncomfortable for those that resist, from AGI itself and from those already compliant. The world becoming a very strange and sometimes hostile place, not because of AI, it will be perceived, but apparently through our own folly. There will be but one AGI, so the number-two place is going to suffer, I'm afraid; either that, or the number-two spot will already be just another aspect of the top dog. The loss of life will be colossal, some of it very evident and focusing attention, and much of it invisible and unreported. Amazing strides in technology can be expected, particularly in automation and robotics. With time, those that are left will not understand much of what they are seeing, constricted very much in what they can and can't do. Finally someone will look around and wonder where all the people went; he will no doubt google it for an answer.

Comments are closed.



