Gemini 3 vs. Claude Opus 4.5 vs. GPT-5.1 Codex: Which AI model is the best designer?
How I AI · 3 December 2025

I put three cutting-edge AI models to the test in a head-to-head design competition. Using the exact same prompt, I challenged Google’s Gemini 3, Anthropic’s Opus 4.5, and OpenAI’s GPT-5.1 Codex to redesign my blog page, then evaluated each on visual design quality, user experience improvements, and SEO. One model produced a beautiful, polished, production-ready redesign. One was fine. And one completely whiffed. If you’re trying to figure out where each model fits in your workflow (design, planning, back-end, or something else), this episode will save you a lot of trial and error.


What you’ll learn:

  1. How each AI model approaches the same design challenge differently
  2. Why planning capabilities dramatically impact design quality
  3. The specific visual and functional improvements each model made
  4. Which model excels at front-end design versus back-end functionality
  5. How to strategically choose the right AI model for different parts of your workflow
  6. The importance of model-switching based on specific use cases

Blog design: https://www.chatprd.ai/blog

Brought to you by:

Lovable—Build apps by simply chatting with AI

Where to find Claire Vo:

ChatPRD: https://www.chatprd.ai/

Website: https://clairevo.com/

LinkedIn: https://www.linkedin.com/in/clairevo/

X: https://x.com/clairevo

In this episode, we cover:

(00:00) Introduction to the AI design challenge

(01:25) The question: Which model is the better designer?

(03:08) The prompt used for all three models

(04:10) Gemini 3 Pro’s approach and results

(06:00) Opus 4.5’s approach and results

(10:54) Codex 5.1’s approach and disappointing results

(14:51) Comparing the three designs side by side

(16:03) Analyzing the change logs and SEO improvements from each model

(22:43) Final verdict

(23:00) Conclusion and next steps

Tools referenced:

• Gemini 3 Pro: https://deepmind.google/models/gemini/pro/

• Anthropic Opus 4.5: https://www.anthropic.com/news/claude-opus-4-5

• OpenAI Codex 5.1: https://platform.openai.com/docs/models/gpt-5.1-codex

• Cursor: https://cursor.com/

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email jordan@penname.co.
