It’s not just vision. It’s familiarity with the physical world. We think with our hands and LLMs simply haven’t had that data. Most of the problems require thinking in terms of physical object manipulation… that’s not necessarily reasoning.
I think perceiving solid building blocks and then using those for program synthesis is really powerful.
At 1:08, Francois misses the point IMO. He explains how transformers/LLMs don't need vision. But what I find interesting is just how much humans *do* need visual and mouse-type input to do ARC. It's not at all clear that humans could solve ARC if we didn't have vision, and that must tell us something about the intelligence humans are using, despite Francois focusing on the opposite issue. IMO the ARC Prize team should focus more on which aspects of human intelligence are needed, not just on what possible ways AI could solve ARC.
Agreed, I think Francois et al. are taking their vision for granted. I propose turning these tests into something physical (the way Braille encodes text) and getting a reasonably intelligent born-blind candidate to solve them. Colours could be replaced with different textures. There is a YouTube video out there in which a born-blind YouTuber tries to illustrate objects using pen and paper and utterly fails at the task. Looking at the ARC-AGI tests, it seems to me that all of them involve some kind of visual transformation that we sighted people take for granted.
AGI will be achieved via a rather simple neural-symbolic system, not a janky off-the-shelf LLM. The fact that they keep throwing stand-alone LLMs at this blows my mind... 😂
What bothers me about program search is that I don't think it is related to reasoning at all. Reasoning, in humans, happens over concepts that are already formed through continuous, similarity-based methods, so reasoning happens at a "higher level", while program search seems too "low level".
Agree
Yeah, it’s more of a traditional optimization technique in a new medium… not reasoning
There is a link between program search and inductive reasoning: you can formalize predicting the most probable next item in a sequence as taking the next output of the shortest program that reproduces the examples seen so far.
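A toy sketch of that idea (hypothetical mini-DSL, nothing to do with any real ARC solver): enumerate candidate programs shortest-first as a crude stand-in for low description length, and predict with the first one consistent with the observed prefix.

```python
# Toy "shortest consistent program" induction over a hypothetical mini-DSL:
# shorter programs are tried first, standing in for lower Kolmogorov complexity.
OPS = ["n", "n+1", "n*2", "n*n", "2**n", "n*3", "n+2"]  # candidate one-liners

def run(prog, n):
    return eval(prog, {"n": n})  # safe enough for a toy DSL we fully control

def induce(prefix):
    # Return the shortest program that reproduces every observed item.
    for prog in sorted(OPS, key=len):
        if all(run(prog, i) == v for i, v in enumerate(prefix)):
            return prog
    return None

prefix = [0, 2, 4, 6]
prog = induce(prefix)                 # -> "n*2"
print(prog, run(prog, len(prefix)))  # predicts 8 as the next item
```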
@jhanschoo Yeah, that's probably true in theory, but it doesn't tell you much about reasoning, or at least human reasoning. Neural nets are also a kind of program search; it's just that the way they represent programs makes them prone to falling into local optima.
Chain of thought is a parlor trick. Squeezing every last tiny bit of gains out of an LLM...
Yes, abstraction is the key to the solution I am building.
Any chance you can post the generated code dataset before the competition ends?
Intuition for program search is mostly about observing and using similarity, then? I'm using tiny SLMs trained on the DSL/solvers. The first program the model generates is run, the failed output grid is fed back, and the SLM updates the code and tries again; you can watch it evolve its approach. It's not all memory; it's a cut-and-try approach, like a human programmer's debugging/reasoning trace. I think we are ignoring a clear fact: there is no good neural visual circuit for the data array, since the model can't read x,y coords once the grid gets up to, say, 10x10. AND a 7B model is too big for the available GPU types 🙂
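Roughly, the loop looks like this (a sketch only; `generate_program`, the SLM call, and the feedback format are hypothetical stand-ins, not the commenter's actual setup):

```python
# Sketch of the generate-run-refine loop described above. Grids are plain
# lists of lists; generate_program is a hypothetical SLM wrapper.
def compile_solver(code):
    # Assumes the SLM emits a Python `def solve(grid): ...` over the DSL.
    ns = {}
    exec(code, ns)
    return ns["solve"]

def solve_task(train_pairs, test_input, generate_program, max_iters=5):
    feedback = ""
    for _ in range(max_iters):
        code = generate_program(train_pairs, feedback)  # SLM proposes a solver
        solver = compile_solver(code)
        failures = [(x, y, solver(x)) for x, y in train_pairs if solver(x) != y]
        if not failures:
            return solver(test_input)  # all train pairs pass
        x, want, got = failures[0]     # show the model what it got wrong
        feedback = f"input={x}\nexpected={want}\ngot={got}"
    return None
```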
How is this channel still so under-subscribed! This is the coolest thing in AI right now. I tried using morphological image analysis, and it did pretty well on the easy stuff in the training set, but when I looked into it more, there are so many logical components that it just wasn't going to work.
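For reference, a minimal sketch of the kind of morphological pass meant here (my guess at the approach, not the commenter's actual code): label connected components of same-colored cells, which already covers a lot of the object-based easy tasks.

```python
# Extract same-colored connected components ("objects") from an ARC-style grid.
import numpy as np
from scipy import ndimage

grid = np.array([
    [0, 3, 3, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 5, 5],
])

objects = []
for color in np.unique(grid):
    if color == 0:  # treat 0 as background
        continue
    labeled, n = ndimage.label(grid == color)  # 4-connectivity by default
    for i in range(1, n + 1):
        objects.append((int(color), np.argwhere(labeled == i)))

for color, cells in objects:
    print(f"color {color}: {len(cells)} cells")
```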
Thanks.
Thanking you from Wisconsin. I'm trying NOT to use any brute force techniques... seems antithetical to the competition.
Yeah, I have to keep reminding people in the chat room of this. Everyone wants to either use LLMs or brute force, but the point is to come up with genuinely new ideas.
@InfiniteQuest86 What do you mean by brute force in practice?
@b.k.officiel873 I would say anything that is guess-and-check rather than trying to understand the examples. So the guy who generates 20,000 Python programs per question, I would label as brute force. But that doesn't apply to training; you can do anything you want in training. If it took 20,000 Python programs per example during training to learn how to reason about the questions, that's fine, because the final solution isn't brute force. You can also custom-program solutions to every example: if you see a square with a missing corner, fill it in, etc. That is brute force too. It will automatically solve all the training data but fail on the test set.
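To make "guess and check" concrete, here is a toy illustration (hypothetical mini-DSL, nothing from any actual submission): sample random transformation pipelines and keep any that happen to fit the training pairs, with no attempt to understand why they fit.

```python
# Guess-and-check over random pipelines of grid transforms: anything that
# fits the training pairs is accepted, whether or not it generalizes.
import random
import numpy as np

PRIMS = {
    "flip_h": np.fliplr,
    "flip_v": np.flipud,
    "rot90": np.rot90,
    "transpose": np.transpose,
}

def random_program(max_len=3):
    return [random.choice(list(PRIMS)) for _ in range(random.randint(1, max_len))]

def run(prog, grid):
    for op in prog:
        grid = PRIMS[op](grid)
    return grid

def guess_and_check(train_pairs, tries=20_000):
    for _ in range(tries):
        prog = random_program()
        if all(np.array_equal(run(prog, x), y) for x, y in train_pairs):
            return prog  # fits the examples; nothing guarantees it generalizes
    return None

pair = (np.array([[1, 2], [3, 4]]), np.array([[2, 1], [4, 3]]))  # horizontal flip
print(guess_and_check([pair]))
```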
Thank you for sharing
I'm in, I just need a team.
The ARC benchmark is not really necessary. I just discovered it today. Simply ask an AI model to draw a chess board, then try to play chess with the AI model on the board it drew, one move at a time. They make a real mess.