Examples for https://arcprize.org/play?task=0692e18c

Playing with ARC-AGI tests

Exercises Jan 19, 2026

Workroom PlayTime045

Exercise

15 mins playing + 5 mins conversation

Kickoff

Demo – let’s pick an AGI-1 task from Explore All Tasks and do it together.

Do one yourself

Try (AGI-1 task) ARC-AGI Task #03560426 alone and keep a note of your hypotheses.
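
If you like to write hypotheses down as code, here is a minimal sketch of checking a guess against a task's demonstration pairs. It assumes the public ARC task layout (a "train" list of input/output pairs, with grids as lists of rows of integers 0-9). The toy task and the names mirror_left_right, transpose and check_hypothesis are made up for illustration; they are not taken from any real ARC task or tool.

```python
from typing import Callable, List

Grid = List[List[int]]

# Hypothetical toy task in the ARC JSON layout. This is NOT a real ARC task.
toy_task = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[0, 2, 0], [3, 0, 0]], "output": [[0, 2, 0], [0, 0, 3]]},
    ],
    "test": [{"input": [[4, 0, 0]]}],
}


def mirror_left_right(grid: Grid) -> Grid:
    """Hypothesis A: each row is reversed."""
    return [list(reversed(row)) for row in grid]


def transpose(grid: Grid) -> Grid:
    """Hypothesis B: rows and columns are swapped."""
    return [list(col) for col in zip(*grid)]


def check_hypothesis(rule: Callable[[Grid], Grid], task: dict) -> List[bool]:
    """Return one True/False per demonstration pair: did the rule reproduce the output?"""
    return [rule(pair["input"]) == pair["output"] for pair in task["train"]]


if __name__ == "__main__":
    for rule in (mirror_left_right, transpose):
        print(rule.__name__, check_hypothesis(rule, toy_task))
```

A hypothesis that fails on any demonstration pair is worth noting too: which pair broke it is often the clue to the next guess.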

Work together

Try (AGI-1 / 2 task) ARC-AGI Task #12422b43 together – listen for hypotheses.

Then let’s try to solve (AGI-1 / 2 task) ARC-AGI Task #136b0064 together.

If we've got time, we'll go back to solving solo / in pairs.

Debrief

5 mins – though we will be talking about most of this as we go. The following topics are suggestions if we feel quiet.

  • How did the tests evaluate reasoning?
  • Compare several tests. What is different in how they evaluate?
  • Does this 'feel' like a challenge to reasoning, or simply a sweet spot for tests?
  • Looking at the leaderboard, AGI-2 is a greater challenge to current models than AGI-1, and was developed as AGI-1's challenges were surmounted. Could this reflect a change in reasoning capability?

Extend

Only humans managed ARC-AGI Task #25094a63 and ARC-AGI Task #212895b5. Why might that be?

JL note: JL's machines have a further extension to 90/120 mins.

James Lyndsay

Getting better at software testing. Singing in Bulgarian. Staying in. Going out. Listening. Talking. Writing. Making.