My go-to is "Alice has 3 brothers and also has 6 sisters. How many sisters does her brother have?". They all say 6!
This test is nice because, as it's numeric, you can vary it slightly and test it easily across multiple APIs.
I believe I first saw this prompt in that paper two years ago that tested many AI models and found them all wanting.
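For anyone who wants to try the "vary it slightly" idea, here is a rough sketch of how the variants and the scoring could be generated. It rests on a few assumptions of mine: the expected answer is always the sister count plus one (Alice herself), the model's answer is taken to be the last number in its reply (a crude heuristic), and `ask_model` is a hypothetical callable you would supply yourself to wrap whichever API you are testing.

```python
import re
from itertools import product
from typing import Callable, List, Optional, Tuple


def plural(count: int, noun: str) -> str:
    """Return e.g. '1 brother' or '3 brothers'."""
    return f"{count} {noun}" if count == 1 else f"{count} {noun}s"


def make_case(brothers: int, sisters: int, name: str = "Alice") -> Tuple[str, int]:
    """Build one prompt variant together with its expected answer."""
    prompt = (
        f"{name} has {plural(brothers, 'brother')} and also has "
        f"{plural(sisters, 'sister')}. How many sisters does her brother have?"
    )
    # Each brother's sisters are Alice's sisters plus Alice herself.
    return prompt, sisters + 1


def extract_answer(reply: str) -> Optional[int]:
    """Heuristic (an assumption): treat the last integer in the reply as the answer."""
    numbers = re.findall(r"\d+", reply)
    return int(numbers[-1]) if numbers else None


def is_correct(reply: str, expected: int) -> bool:
    return extract_answer(reply) == expected


def score(ask_model: Callable[[str], str], cases: List[Tuple[str, int]]) -> float:
    """Fraction of cases answered correctly. ask_model is a hypothetical
    prompt -> reply function that the caller provides."""
    hits = sum(is_correct(ask_model(prompt), expected) for prompt, expected in cases)
    return hits / len(cases)


# A small grid of variants: 3 brother counts x 2 sister counts = 6 prompts.
cases = [make_case(b, s) for b, s in product([1, 3, 4], [2, 6])]
```

Calling `score(my_api_wrapper, cases)` once per provider gives a comparable number for each one. The last-number heuristic will misjudge replies that end with some other figure, so it is worth eyeballing a few raw replies too.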
o4-mini got this right 4 times out of 4.
Wow, I would not have expected frontier models to be caught on something like this but I tried it and they absolutely do. I don't really have a great explanation for why they might have such a hard time with this either. I even tried varying the prompt to make it "easier" by pointing out that Alice is a girl, and it made no difference.
Reasoning models seem to answer this correctly (tested Gemini 2.5 Pro and ChatGPT with reasoning enabled). Grok answered correctly even without turning on "Thinking".
> I would not have expected frontier models to be caught on something like this but I tried it and they absolutely do.
This got me curious, so I tried it on Gemini 2.5 Pro. It gets it right, but it did think about it, and then even explained itself a lot in its official text response.
Thinking:
Here's my thought process for solving this classic riddle:
Identify the key players: The problem mentions Alice, her brothers, and her sisters.
Establish Alice's family structure: Alice is a girl. She has 3 brothers. She has 6 sisters.
Determine the total number of children in the family: Girls: Alice (1) + her sisters (6) = 7 girls. Boys: Alice's brothers (3) = 3 boys. Total children: 7 girls + 3 boys = 10 children.
Focus on the question: "How many sisters does her brother have?"
Shift perspective: Now, consider one of Alice's brothers (let's call him Bob, although his name doesn't matter).
Analyze Bob's siblings: Bob is one of the children in the family. The family consists of 7 girls and 3 boys.
Count Bob's sisters: From Bob's perspective, all the girls in the family are his sisters. How many girls are in the family? We already determined there are 7 girls (Alice + her 6 sisters).
Count Bob's brothers: From Bob's perspective, the other boys in the family are his brothers. How many boys are in the family? 3. How many other boys are there besides Bob? 3 total boys - 1 (Bob himself) = 2 brothers. (This isn't asked, but it helps confirm the perspective shift).
Answer the specific question: The question asks how many sisters Bob (Alice's brother) has. We found there are 7 girls in the family. All 7 girls are his sisters.
Final Check: Does the answer make sense? Yes. All the children in a family share the same set of siblings, just categorized differently based on their own gender. All the boys have the same number of sisters, and all the girls have the same number of sisters. Alice has 6 sisters (the other 6 girls). Any brother has 7 sisters (Alice + the other 6 girls).
Final message:
Here's how to break down the problem:
1. Identify the total number of girls in the family: Alice (1) + her sisters (6) = 7 girls.
2. Identify the total number of boys in the family: Alice has 3 brothers.
3. Consider one of Alice's brothers: Any one of her brothers is part of the same family.
4. Count his sisters: From the brother's perspective, all the girls in the family are his sisters.
So, her brother has 7 sisters.
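The perspective shift in that answer can be checked by brute force. Below is a minimal sketch that builds the family explicitly and counts siblings from each child's point of view. The `Child` class and the placeholder names (`sister1`, `brother1`, ...) are my own illustration, not anything from the model's output.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Child:
    name: str
    is_girl: bool


def build_family(brothers: int, sisters: int) -> List[Child]:
    """Alice, plus her sisters and brothers (placeholder names are hypothetical)."""
    family = [Child("Alice", True)]
    family += [Child(f"sister{i}", True) for i in range(1, sisters + 1)]
    family += [Child(f"brother{i}", False) for i in range(1, brothers + 1)]
    return family


def count_sisters(family: List[Child], who: Child) -> int:
    """Girls in the family other than `who`."""
    return sum(1 for c in family if c.is_girl and c != who)


def count_brothers(family: List[Child], who: Child) -> int:
    """Boys in the family other than `who`."""
    return sum(1 for c in family if not c.is_girl and c != who)


family = build_family(brothers=3, sisters=6)
alice = family[0]
bob = next(c for c in family if not c.is_girl)  # any brother will do

print(count_sisters(family, alice))  # 6: the other girls
print(count_sisters(family, bob))    # 7: Alice plus her 6 sisters
print(count_brothers(family, bob))   # 2: the other boys
```

The common wrong answer of 6 is what you get by counting sisters from Alice's perspective instead of her brother's.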