Gemini 3 Pro released, the smartest AI in the world by a large amount

Started by Legend, Today at 05:15 PM


Legend





This is my favorite benchmark because it shows a model's ability to apply its world understanding to novel situations, instead of being a test of memorization.

The Deep Think number might be a bit misleading, since there are ways to brute-force this test, but the Gemini 3 Pro score is incredible.


Hopefully it doesn't fall apart once I start testing it.

Legend

It almost passes my game dev test. "I'm a game dev. I need to get this game out! The sequel is developing faster and can't release first, but at this rate it might!"

Unlike every other model, it at least doesn't make nonsensical statements. But it still has a pretty poor grasp of the situation. It gives three possible solutions and two are just generic ones that ignore the weird setup.

1. Release into early access
2. Make it episodic
3. Just release the sequel first and rebrand the other game into a prequel




My curvy triangle benchmark. First model to pass!
Shader cubic=Gemini 3.0 Pro
It's not as good as my solution since it's not symmetric (it will produce different results if you list the triangle verts in a different order), but it's totally usable.
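For anyone who wants to check that kind of order dependence themselves, here is a rough sketch. It is not my shader or Gemini's: `flat` and `lopsided` are made-up stand-ins, and `is_order_invariant` is just a hypothetical helper that feeds the same triangle in all six vertex orders and compares the results.

```python
from itertools import permutations


def is_order_invariant(surface, verts, weights, tol=1e-9):
    """Return True if surface(verts, weights) yields the same point
    for every ordering of the triangle's three vertices.

    verts:   three (x, y) tuples
    weights: three barycentric weights, one per vertex, summing to 1
    """
    reference = surface(verts, weights)
    for order in permutations(range(3)):
        v = [verts[i] for i in order]
        w = [weights[i] for i in order]  # each weight travels with its vertex
        result = surface(v, w)
        if any(abs(a - b) > tol for a, b in zip(reference, result)):
            return False
    return True


def flat(verts, weights):
    # Plain barycentric interpolation: treats all three vertices alike.
    return tuple(sum(w * v[k] for v, w in zip(verts, weights)) for k in range(2))


def lopsided(verts, weights):
    # Hypothetical asymmetric surface: bends toward whichever vertex is listed first.
    x, y = flat(verts, weights)
    bump = weights[1] * weights[2]
    return (x + bump * (verts[0][0] - x), y + bump * (verts[0][1] - y))


tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
bary = [0.2, 0.3, 0.5]
print(is_order_invariant(flat, tri, bary))
print(is_order_invariant(lopsided, tri, bary))
```

The symmetric stand-in passes and the lopsided one fails, which is the same kind of difference I mean above: same triangle, different answer depending on which vert you list first.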


It also passes my triangle subdivision benchmark. It gives an overly complex answer but once I give the simple and better one, it accepts it just fine.


It completely fails my jigsaw eyeball test.

 

"How many eyes are in this picture? Tell me where they are."

"Based on a visual scan of the puzzle pieces in the image, there are 4 eyes visible.
Here is where they are located:
  • Two eyes are on the large puzzle piece on the far left. This piece shows a woman's full face, so both of her eyes are clearly visible.
  • One eye is on a small, isolated puzzle piece near the top right. It is a small fragment showing just a single eye and eyebrow.
  • One eye is on the large puzzle piece in the bottom right corner. This is a close-up shot of a face, featuring one very large, detailed eye with orange eyeshadow.
Would you like help trying to piece this image back together mentally, or do you need help analyzing other details in the collage?"

So the count is wrong: there are only three eyes. And like always, these models seem to have only a loose grasp of where the pieces even are.

kitler53

         

Featured Artist: Emily Rudd

Legend

Quote from: kitler53 on Today at 07:54 PM
there were 3 eyes?  all i saw was 2 bosom..
Now that will be the true AGI test. Not bosom specifically... but getting an AI distracted by its own unrelated interests.