What time is given on the clock?
I came across this pretty unbelievable (yet insightful) post on X today. It noted how poorly LLMs tell time on analog clocks. If LLMs really are this bad at simply telling the time, then our predictions of fast-approaching AGI may be a little overblown, at least when it comes to how well LLMs make sense of their environments or of whatever novel data we throw at them.
This benchmark does a very good job of demonstrating how poorly the current generation of models generalize to data they haven't been extensively trained on. There's room for lots more innovation yet! https://t.co/npDFCItJEq
— Maxime Chevalier (@Love2Code) September 8, 2025
So I decided to try this out for myself with a random clock image that clearly shows 10:12 and 16 seconds (10:12:16).
Claude Sonnet 4 gave me 10:10.

ChatGPT 5 gave me 10:11:46.

Grok Fast gave me 7:10.

Grok Auto gave me 12:00.

Grok Expert gave me 10:10:30.

Manus AI gave me 9:13.
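
One way to put rough numbers on those misses is to measure how far each answer sits from 10:12:16 around a 12-hour dial. The snippet below is only a sketch: the helper names `to_seconds` and `dial_error` are made up for illustration, answers given without seconds are assumed to mean :00, and the distance is taken the short way around the dial so that 12:00 is not penalized for wrapping.

```python
# Seconds in one full trip around a 12-hour dial.
DIAL = 12 * 3600

def to_seconds(t):
    """Convert 'H:MM' or 'H:MM:SS' to seconds past 12:00 on a 12-hour dial.
    Assumption: a missing seconds field means :00."""
    parts = [int(p) for p in t.split(":")]
    h, m = parts[0], parts[1]
    s = parts[2] if len(parts) == 3 else 0
    return ((h % 12) * 3600 + m * 60 + s) % DIAL

def dial_error(guess, truth):
    """Shortest distance in seconds between two readings on the dial."""
    d = abs(to_seconds(guess) - to_seconds(truth))
    return min(d, DIAL - d)

# The answers reported in this post.
answers = {
    "Claude Sonnet 4": "10:10",
    "ChatGPT 5": "10:11:46",
    "Grok Fast": "7:10",
    "Grok Auto": "12:00",
    "Grok Expert": "10:10:30",
    "Manus AI": "9:13",
}

errors = {name: dial_error(t, "10:12:16") for name, t in answers.items()}

# Print from closest to farthest.
for name, err in sorted(errors.items(), key=lambda kv: kv[1]):
    print(f"{name}: off by {err} seconds")
```

Under those assumptions, the closest answer is ChatGPT 5's at 30 seconds off, and the farthest is Grok Fast's at 10936 seconds, a little over three hours.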

We have a long way to go!