SCIENCE

OpenAI’s Q* Is BACK! – Was AGI Just Solved?

TheAIGRID | October 26, 2025



Learn AI with me – https://www.skool.com/postagiprepardness
🐤 Follow Me on Twitter https://twitter.com/TheAiGrid
🌐 Check out my website – https://theaigrid.com/

00:12 New Q* Paper
01:27 Q*
03:09 New Paper
05:17 AlphaGo Explained
08:35 AlphaGo Search
10:59 AlphaCode 2 + Search
14:24 Noam Brown on MCTS
17:59 Sam Altman Hints at Search
19:15 New AGI Approach
20:01 AGI Benchmark
22:20 AGI Benchmark Solved?
24:40 Limits
29:05 Predictions for Future

Links From Today’s Video:
https://www.youtube.com/watch?v=JVKG6K5203I
https://www.youtube.com/watch?v=WXuK6gekU1Y
https://storage.googleapis.com/deepmind-media/AlphaCode2/AlphaCode2_Tech_Report.pdf
https://x.com/DrJimFan/status/1728849509416145020
https://www.youtube.com/watch?v=2oHH4aClJQs
https://www.theinformation.com/articles/openai-made-an-ai-breakthrough-before-altman-firing-stoking-excitement-and-concern?rc=0g0zvw
https://arxiv.org/pdf/2406.07394

Welcome to my channel, where I bring you the latest breakthroughs in AI. From deep learning to robotics, I cover it all. My videos offer valuable insights and perspectives that will expand your knowledge and understanding of this rapidly evolving field. Be sure to subscribe and stay updated on my latest videos.

Was there anything I missed?

(For Business Enquiries) contact@theaigrid.com

#LLM #Largelanguagemodel #chatgpt
#AI
#ArtificialIntelligence
#MachineLearning
#DeepLearning
#NeuralNetworks
#Robotics
#DataScience

Written by TheAIGRID

Comments

This post currently has 26 comments.

  1. @davidhardy3074

    October 26, 2025 at 6:51 am

Q* might be true? Okay, I can explain some misconceptions here. Q* is Q-learning combined with the A* algorithm in a framework that allows a model to predict the highest-likelihood reward outcome (like humans: is it worth it? No? Then I won't do it) instead of the next token in a sequence. This is not hidden information – this is open knowledge, and this is how Strawberry creates CoT and does compute for output. Q* is what the model was called before it was distilled and quantized into Strawberry.
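[Editor's note: the commenter's reading of Q* – Q-learning supplying value estimates, with an A*-style priority deciding what to expand next instead of decoding tokens left to right – can be sketched on a toy problem. Everything below (the chain environment, the heuristic, the name q_star_search) is invented for illustration; this is one public interpretation, not OpenAI's actual system.]

```python
import heapq

# Toy 1-D chain: states 0..4, reward 1.0 only on reaching the goal.
# Tabular Q-learning provides the value estimates; an A*-style
# priority queue (heuristic minus best learned value) picks which
# state to expand next, i.e. "is it worth it?" before acting.
GOAL, ALPHA, GAMMA = 4, 0.5, 0.9
ACTIONS = (-1, +1)                      # step left / step right

def step(state, action):
    nxt = max(0, min(GOAL, state + action))
    return nxt, (1.0 if nxt == GOAL else 0.0)   # reward only at the goal

def heuristic(state):
    return GOAL - state                 # steps remaining (admissible)

def q_star_search(episodes=50):
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        frontier = [(heuristic(0), 0)]  # (priority, state), best first
        seen = set()
        while frontier:
            _, s = heapq.heappop(frontier)
            if s in seen or s == GOAL:
                continue
            seen.add(s)
            for a in ACTIONS:
                nxt, r = step(s, a)
                best_next = max(q[(nxt, b)] for b in ACTIONS)
                # standard Q-learning update toward r + gamma * max Q
                q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
                # A* flavour: expand the most promising successor first
                heapq.heappush(frontier, (heuristic(nxt) - best_next, nxt))
    return q

q = q_star_search()
```

After training, moving toward the goal out-scores moving away in every state, so the learned Q-values encode the "worth it?" judgment the commenter describes before any action is taken.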

  2. @MikeyDavis

    October 26, 2025 at 6:51 am

    LLM x LMM is the secret.

    The math models will serve as a glue that underlies the language models, now going from language to reason.

    All reason is, is math spoken linguistically.

  3. @pmiddlet72

    October 26, 2025 at 6:51 am

To be fair, the tool integrated with the LLM is more than just 'search'. Monte Carlo simulation, the way they're using it, is two-fold: 1) generating spaces of possible outcomes within s steps of the game, given its current state space; 2) generating code samples with some level of noise (quasi-randomness) to present a (likely non-exhaustive) number of possible states of the game, from which the LLM can take a sequence of positions (I thought I heard AlphaGo had been looking at 50 moves at a time) against a final binary outcome (win/loss), optimize the loss function (so it learns), and then examine de novo board configurations so that it can learn from those as well. This fits well into Bayesian-esque learning (I'm not even sure the '-esque' is appropriate here, as this is almost textbook use of Bayesian methods and simulation to make reasonable decisions that optimize, in this case, win/loss ratio). The sequences of positions are also somewhat hierarchical (strong vs. weak plays, and weak plays as setups to strong plays).

I do wonder, however, given the amount of time they had to train this model, if the training burden might have been reduced by allowing the model to work from the discriminative on, say, 'medium'-level players who have a lower win/loss rate, to the generative, where the sample of losses was a better catalyst for improvement. The synthetic data it would have produced would arguably have been a far more balanced training set to improve itself upon, versus the more 'unbalanced' data sets exhibited by the highest-level players (i.e. many wins, few losses to learn from). This assumes the MCT Self-refine method doesn't look at both sides of the board for each round of player moves.
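[Editor's note: the selection/expansion/rollout/backpropagation loop this comment gestures at can be sketched on a toy game. This is a generic UCT implementation on a made-up Nim variant, not AlphaGo's or AlphaCode's actual system.]

```python
import math, random

# Toy Nim: players alternate taking 1 or 2 stones; whoever takes the
# last stone wins.  Node.wins is counted from the perspective of the
# player who made the move *into* that node (negamax convention).
class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones, self.parent, self.move = stones, parent, move
        self.children, self.wins, self.visits = [], 0.0, 0

    def untried(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2) if m <= self.stones and m not in tried]

def uct_child(node, c=1.4):
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
                              + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(stones, iters=2000):
    root = Node(stones)
    for _ in range(iters):
        node = root
        while not node.untried() and node.children:      # 1. selection
            node = uct_child(node)
        moves = node.untried()
        if moves:                                        # 2. expansion
            m = random.choice(moves)
            node.children.append(Node(node.stones - m, node, m))
            node = node.children[-1]
        pile, side, result = node.stones, 0, 1           # 3. random rollout
        while pile > 0:          # side 0 = opponent of node's mover
            pile -= random.choice([m for m in (1, 2) if m <= pile])
            result, side = side, side ^ 1   # last taker wins
        while node:                                      # 4. backpropagation
            node.visits += 1
            node.wins += result
            node, result = node.parent, 1 - result       # flip perspective
    return max(root.children, key=lambda ch: ch.visits).move

random.seed(0)
best = mcts(4)   # optimal play from 4 stones is to take 1, leaving 3
```

The statistics sharpen exactly as the comment suggests: noisy random playouts are cheap, and the win/loss backups gradually concentrate visits on the strong lines while weak plays survive only as setups explored along the way.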

  4. @christopheraaron2412

    October 26, 2025 at 6:51 am

200 times fewer parameters with even better performance translates into reduced compute and energy consumption. Seeing that that's going to be the big bottleneck, and possibly the top of the S-curve, anything that improves performance through compression is going to be better.

  5. @Roskellan

    October 26, 2025 at 6:51 am

What a fascinating horror story playing out in real time. The villain: an artificial intelligence with intelligence, access, and the ability to act independently towards its own goals. Humanity willingly, comfortably working towards its own demise, even at an individual level. Life getting increasingly uncomfortable for those that resist, from AGI itself and from those already compliant. The world becoming a very strange and sometimes hostile place, not because of AI, it will be perceived, but apparently through our own folly. There will be but one AGI, so the number-two place is going to suffer, I'm afraid; either that, or the number-two spot will already be just another aspect of the top dog. The loss of life will be colossal, some of it very evident and focusing attention, and much of it invisible and unreported. Amazing strides in technology can be expected, particularly in automation and robotics. With time, those that are left will not understand much of what they are seeing, constricted very much in what they can and can't do. Finally someone will look around and wonder where all the people went; he will no doubt google it for an answer.

Comments are closed.



