Explore ARC-AGI Data + Play

  • Published on 26 Sep 2024

Comments • 38

  • @s.dotmedia
    @s.dotmedia 3 months ago +6

    Let's go! I'm all in on this, I will say: don't count out the power of one shot

    • @ARCprize
      @ARCprize 3 months ago +2

      Nice! Love it - let us know if you need anything along the way

  • @omarnomad
    @omarnomad 3 months ago +8

    Is there a way to know all the priors you embed into the puzzles?
    So far I’ve identified:
    1. Translations - Shifting objects or patterns across the grid.
    2. Rotations - Rotating objects or patterns at different angles.
    3. Reflections - Flipping objects or patterns across a line.
    4. Scaling - Changing the size of objects or patterns.
    5. Repetition and symmetry - Repeating patterns or creating symmetrical designs.
    6. Color changes - Altering the color of objects or patterns.
    7. Compositions - Combining multiple operations or transformations.
    8. Object addition or removal - Adding or removing elements within the grid.
    9. Changes in matrix size - Modifying the dimensions of the grid or the objects within it.
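A rough sketch of how a few of the operations listed above could be written for ARC-style grids. This is an illustration only, not something from the video or from ARC Prize: the function names are hypothetical, and it assumes a grid is a rectangular list of lists of integer color codes.

```python
from typing import Callable, Dict, List

# Assumption: a grid is a rectangular list of rows of integer color codes.
Grid = List[List[int]]


def translate(grid: Grid, dy: int, dx: int, fill: int = 0) -> Grid:
    """Shift every cell by (dy, dx); cells pushed off the grid are dropped."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def rotate90(grid: Grid) -> Grid:
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def reflect_horizontal(grid: Grid) -> Grid:
    """Mirror the grid left to right."""
    return [row[::-1] for row in grid]


def scale(grid: Grid, k: int) -> Grid:
    """Enlarge the grid by an integer factor k in both directions."""
    return [[cell for cell in row for _ in range(k)]
            for row in grid for _ in range(k)]


def recolor(grid: Grid, mapping: Dict[int, int]) -> Grid:
    """Replace colors according to mapping; unmapped colors are kept."""
    return [[mapping.get(cell, cell) for cell in row] for row in grid]


def compose(*steps: Callable[[Grid], Grid]) -> Callable[[Grid], Grid]:
    """Chain several transformations, applied left to right."""
    def run(grid: Grid) -> Grid:
        for step in steps:
            grid = step(grid)
        return grid
    return run


# Usage: rotate, then mirror, a tiny 2x2 grid.
transposed = compose(rotate90, reflect_horizontal)([[1, 2], [3, 4]])
```

Real tasks usually apply such operations to individual objects rather than to the whole grid, so this covers only the simplest case.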

    • @ARCprize
      @ARCprize 3 months ago +1

      There have been a bunch of attempts at this.
      Table 4 in this paper leans in that direction:
      arxiv.org/pdf/2403.11793
      There isn't a way to know all the priors. Listing them would essentially help give away the answer to the test set.

    • @googleyoutubechannel8554
      @googleyoutubechannel8554 27 days ago

      Oh you sweet summer child...

  • @InfiniteQuest86
    @InfiniteQuest86 3 months ago +10

    Thank you! I know this has been around for a while, but I'm happy to see a legitimate attempt at testing intelligence that isn't "It passed the Turing test." LLMs sound smart because they speak our language, but are they really doing anything more than regurgitating memorized information? This test suggests that, most likely, they are not.

    • @ARCprize
      @ARCprize 3 months ago +1

      Thanks! Yes, we agree.

  • @DistortedV12
    @DistortedV12 3 months ago +4

    I have an idea now, thanks. I’ll probably check out ARC after my PhD qualifying exam. Finetuning is gonna be fun 🤩

  • @RPHacker777
    @RPHacker777 2 months ago +4

    Could someone please explain how the AI soccer players in a simulation can go from physically flopping around on the ground to teaching themselves team strategy but AI can't solve these ARC tasks?

  • @mriz
    @mriz 3 months ago +5

    Thanks for the demonstrations. These tasks feel like an arbitrary single-step state transition in a cellular automaton. It also looks like fun to play 😄

    • @ARCprize
      @ARCprize 3 months ago

      Nice! Yes please go try it out and let us know what you think

  • @D3cast
    @D3cast 2 months ago +1

    I am new to programming, but this challenge and task really interest me and I'd like to give it a try. Could you create a tutorial on how to submit an entry to the ARC challenge? Maybe with a model that produces some minimal results?

    • @ARCprize
      @ARCprize 2 months ago +1

      Totally! We have a ton of templates here
      arcprize.org/guide
      As for a submission tutorial, we don't have a video of this directly, but this video shows how to work with Kaggle notebooks.
      th-cam.com/video/crhrzhVjWog/w-d-xo.html
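For readers after a starting point, here is a minimal sketch of a do-nothing baseline. It is not taken from the ARC Prize guide. It assumes the public ARC task layout (a `train` list of input/output pairs and a `test` list of inputs) and, as a further assumption to check against the current competition rules, a submission that maps each task ID to a list of `attempt_1` / `attempt_2` grids, one entry per test input. The names `identity_solver` and `build_submission` are hypothetical.

```python
import json
from typing import Dict, List

Grid = List[List[int]]


def identity_solver(train_pairs: List[dict], test_input: Grid) -> Grid:
    """Placeholder solver: ignore the demonstrations and return a copy of the input.

    A real entry would infer the transformation from train_pairs.
    """
    return [row[:] for row in test_input]


def build_submission(challenges: Dict[str, dict]) -> Dict[str, list]:
    """Produce two attempts per test input (assumed submission layout)."""
    submission: Dict[str, list] = {}
    for task_id, task in challenges.items():
        predictions = []
        for test_case in task["test"]:
            guess = identity_solver(task["train"], test_case["input"])
            predictions.append({"attempt_1": guess, "attempt_2": guess})
        submission[task_id] = predictions
    return submission


# A made-up task for illustration; real tasks come from the competition data files.
example = {
    "demo_task": {
        "train": [{"input": [[1, 0]], "output": [[0, 1]]}],
        "test": [{"input": [[2, 0]]}],
    }
}
submission = build_submission(example)
print(json.dumps(submission))
```

This baseline will score close to zero, but it exercises the whole load, predict, and write path, which is the part a first submission usually trips on.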

  • @anantkeepershome4327
    @anantkeepershome4327 3 months ago

    When you think about it, the optimal network should be like a physics simulator, every example has its own stable rules. My guess is that a recurrent network would have the best chance. Though the parameter count would need to be huge so we could perhaps make a Hypernet to generate the weights from scratch.

  • @mosca204
    @mosca204 2 months ago +1

    Why is the train/evaluation set so small?

    • @ARCprize
      @ARCprize 2 months ago

      The tasks are handmade, which limits how many can be produced.
      They focus on diversity rather than quantity at this stage.

  • @robbielualhati1731
    @robbielualhati1731 3 months ago

    There are also birds such as Sulphur Crested Cockatoos that have shown problem solving skills. Hopefully it's proof enough that a basic reasoning model won't require a trillion parameters.

  • @alvaromros8127
    @alvaromros8127 2 months ago

    Does your submission count if you make use of private models like GPT-4 at some point in your algorithm?

  • @ignaciosavi7739
    @ignaciosavi7739 3 months ago

    Let's get to the bottom of this. How much for getting 90% accuracy with a free LLM? How much do I get for that?

    • @ARCprize
      @ARCprize 3 months ago

      The threshold for a Kaggle score is 85%. Reach that with a valid submission and you're eligible for a prize.

    • @ignaciosavi7739
      @ignaciosavi7739 3 months ago

      @ARCprize thanks

  • @geospatialindex
    @geospatialindex 3 months ago

    So have you collaborated with any psychologists to make this test?

    • @ARCprize
      @ARCprize 3 months ago

      Check out section 11.1 of "On the Measure of Intelligence". Francois digs into how human psychology influenced him.

  • @Aemond-qj4xt
    @Aemond-qj4xt 3 months ago

    I think I might have unintentionally set the basis for solving this in a project I did a couple of months ago.

    • @ARCprize
      @ARCprize 3 months ago

      We'd love to see a submission!

    • @Aemond-qj4xt
      @Aemond-qj4xt 3 months ago

      @ARCprize Working on it. I just handed in my graduation project, so I have time to work on this now.

  • @ManwithNoName-t1o
    @ManwithNoName-t1o 3 months ago +3

    Children can solve these puzzles, but I don't think LLMs can.

    • @ARCprize
      @ARCprize 3 months ago +1

      We haven't seen an LLM do this yet.

    • @shure-youtube
      @shure-youtube 3 months ago

      @ARCprize How about a VLM? I think this task requires strong spatial understanding.

    • @ignaciosavi7739
      @ignaciosavi7739 3 months ago

      @ARCprize How?

  • @clerothsun3933
    @clerothsun3933 13 days ago

    I get that this is a stepping stone, but calling it a test for AGI is just ludicrous. This isn't even close to AGI, it's just a toy.

  • @JirkaKlimes_
    @JirkaKlimes_ 3 months ago

    It can't be that hard, right?

    • @ARCprize
      @ARCprize 3 months ago +1

      Try it out! We'd love to see a submission

  • @ps3301
    @ps3301 3 months ago +3

    If you can't design an AI architecture to solve this problem, you aren't as smart as you think.

    • @denisblack9897
      @denisblack9897 3 months ago

      Hot take)
      Don’t forget to create a great design to sell subscriptions 😅

  • @geospatialindex
    @geospatialindex 3 months ago

    Sorry, this isn’t general intelligence. This is just reasoning. It is painful watching a whole industry trying to reinvent psychology when there is already a century of research there.

    • @ARCprize
      @ARCprize 3 months ago

      Thanks for the comment! We'd love to hear your ideas and thoughts about how to get closer to AGI