Using the Smartest AI to Rate Other AI


In this episode, I walk through a Fabric Pattern that assesses how well a given model performs on a task relative to humans. The system uses your smartest AI model to evaluate other AIs by scoring their output across a range of tasks and comparing it to human intelligence levels.

I talk about:

1. Using One AI to Evaluate Another
The core idea is simple: use your most capable model (like Claude 3 Opus or GPT-4) to judge the outputs of another model (like GPT-3.5 or Haiku) against a task and input. This gives you a way to benchmark quality without manual review.

2. A Human-Centric Grading System
Models are scored on a human scale—from “uneducated” and “high school” up to “PhD” and “world-class human.” Stronger models consistently rate higher, while weaker ones rank lower—just as expected.

3. Custom Prompts That Push for Deeper Evaluation
The rating prompt includes instructions to emulate a 16,000+ dimensional scoring system, using expert-level heuristics and attention to nuance. The system also asks the evaluator to describe what would have been required to score higher, making this a meta-feedback loop for improving future performance.
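
A minimal sketch of the loop described in the three points above, in Python. This is not the actual Fabric pattern: the prompt wording, the two-line reply format, and the names HUMAN_LEVELS, build_judge_prompt, parse_rating, rate_output, and fake_judge are all assumptions made for illustration. Only the four scale levels named above are used, and the judge is any function that takes a prompt and returns text, so no specific model API is assumed.

```python
from typing import Callable, Tuple

# Only the four levels named in the episode notes; a real scale may have more.
HUMAN_LEVELS = ["uneducated", "high school", "PhD", "world-class human"]


def build_judge_prompt(task: str, task_input: str, output: str) -> str:
    """Build the text sent to the stronger (judge) model. Wording is hypothetical."""
    levels = ", ".join(HUMAN_LEVELS)
    return (
        "You are an expert evaluator of AI model output.\n\n"
        f"TASK:\n{task}\n\n"
        f"INPUT:\n{task_input}\n\n"
        f"OUTPUT TO RATE:\n{output}\n\n"
        f"Rate the output on this human scale: {levels}.\n"
        "Reply with exactly two lines:\n"
        "LEVEL: <one level from the scale>\n"
        "TO SCORE HIGHER: <what the output would have needed>\n"
    )


def parse_rating(reply: str) -> Tuple[str, str]:
    """Extract the level and the 'what would score higher' feedback from a reply."""
    level, feedback = "", ""
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        key = key.strip().upper()
        if key == "LEVEL":
            level = value.strip()
        elif key == "TO SCORE HIGHER":
            feedback = value.strip()
    # Match case-insensitively against the known scale.
    for known in HUMAN_LEVELS:
        if known.lower() == level.lower():
            return known, feedback
    raise ValueError(f"Judge returned an unknown level: {level!r}")


def rate_output(
    judge: Callable[[str], str], task: str, task_input: str, output: str
) -> Tuple[str, str]:
    """Ask the judge model to rate another model's output on the human scale."""
    return parse_rating(judge(build_judge_prompt(task, task_input, output)))


def fake_judge(prompt: str) -> str:
    """Stand-in for a call to your strongest model; returns a canned reply."""
    return "LEVEL: high school\nTO SCORE HIGHER: Cite sources and cover edge cases."
```

To try it, call rate_output(fake_judge, task, task_input, output) and swap fake_judge for a function that sends the prompt to whichever model you trust most. Keeping the judge as a plain callable means the same code works regardless of which provider or model sits behind it.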

Note: This episode was recorded a few months ago, so the AI models mentioned may not be the latest—but the framework and methodology still work perfectly with current models.

Subscribe to the newsletter at:
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://x.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!

Become a Member: https://danielmiessler.com/upgrade

See omnystudio.com/listener for privacy information.

