POETIQ’S AI BREAKTHROUGH SMASHES REASONING BARRIERS

A tiny six-person startup called Poetiq has stunned the AI world by topping the tough ARC-AGI-2 benchmark with 54% accuracy, beating Google's Gemini 3 Deep Think at less than half the cost: just $30 per task versus $77. Six months earlier, top models scraped by at under 5%. Poetiq's "meta-system" has now leaped past 50%, though it still falls short of average human scores of around 60%.

Poetiq didn't build massive new models from scratch. Instead, its engineering layers an orchestrator on top of existing models such as Gemini 3 or GPT-5.1 and adapts to a new model in hours without retraining. The orchestrator picks model combinations, writes code when needed, and refines solutions step by step.

This self-improving system uses large language models to audit their own work: generating ideas, testing them, and iterating until a solution holds up, all while cutting cost and wasted computation. Open-sourced on GitHub, it works across model families, showing that small teams can rival giants through smart design.
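To make the generate, test, iterate loop concrete, here is a minimal Python sketch. It is an illustration only and is not taken from Poetiq's repository. Every name in it (`stub_proposer`, `passes`, `refine`) is hypothetical, and the "model" is a stand-in that walks through a fixed list of grid transformations rather than calling a real LLM.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]
Example = Tuple[Grid, Grid]          # (input grid, expected output grid)
Candidate = Callable[[Grid], Grid]   # a proposed solution program


def flip_horizontal(grid: Grid) -> Grid:
    return [row[::-1] for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    return [row[:] for row in grid[::-1]]


def transpose(grid: Grid) -> Grid:
    return [list(col) for col in zip(*grid)]


def stub_proposer(rejected: List[str]) -> Optional[Tuple[str, Candidate]]:
    """Hypothetical stand-in for an LLM call.

    Returns the first candidate that has not been rejected yet,
    or None when it has run out of ideas.
    """
    pool = {
        "flip_horizontal": flip_horizontal,
        "flip_vertical": flip_vertical,
        "transpose": transpose,
    }
    for name, fn in pool.items():
        if name not in rejected:
            return name, fn
    return None


def passes(candidate: Candidate, examples: List[Example]) -> bool:
    """Test step: the candidate must reproduce every training example."""
    return all(candidate(inp) == out for inp, out in examples)


def refine(
    examples: List[Example],
    propose: Callable[[List[str]], Optional[Tuple[str, Candidate]]],
    max_rounds: int = 5,
) -> Tuple[Optional[str], int]:
    """Generate, test, and iterate; returns (winning name or None, rounds used)."""
    rejected: List[str] = []
    for _ in range(max_rounds):
        proposal = propose(rejected)      # generate an idea, given past failures
        if proposal is None:
            break
        name, fn = proposal
        if passes(fn, examples):          # test it against known examples
            return name, len(rejected) + 1
        rejected.append(name)             # feed the failure back and try again
    return None, len(rejected)


if __name__ == "__main__":
    training = [([[1, 2], [3, 4]], [[1, 3], [2, 4]])]
    print(refine(training, stub_proposer))
```

The `max_rounds` cap is one simple way to bound spending per task: the loop stops as soon as a candidate passes or the budget runs out. A real orchestrator would replace `stub_proposer` with model calls and richer feedback.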

The real utility lies in making powerful AI accessible. Businesses and researchers can boost a model's reasoning on complex puzzles such as pattern detection without huge budgets or delays. Poetiq redraws the map: engineering smarts now drive AI leaps as much as raw scale.

CLEVER ORCHESTRATION > ENDLESS COMPUTE!
