Using the Smartest AI to Rate Other AI


In this episode, I walk through a Fabric Pattern that assesses how well a given model does on a task relative to humans. The pattern uses your smartest AI model to evaluate other AIs, scoring their output across a range of tasks and comparing it to human intelligence levels.

I talk about:

1. Using One AI to Evaluate Another
The core idea is simple: use your most capable model (like Claude 3 Opus or GPT-4) to judge the outputs of another model (like GPT-3.5 or Haiku) against a task and input. This gives you a way to benchmark quality without manual review.

2. A Human-Centric Grading System
Models are scored on a human scale, from “uneducated” and “high school” up to “PhD” and “world-class human.” Stronger models consistently rate higher than weaker ones, as expected.

3. Custom Prompts That Push for Deeper Evaluation
The rating prompt includes instructions to emulate a 16,000+ dimensional scoring system, using expert-level heuristics and attention to nuance. The system also asks the evaluator to describe what would have been required to score higher, making this a meta-feedback loop for improving future performance.
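
To make the three points above concrete, here is a minimal Python sketch of the judge loop. It is not the actual Fabric pattern: the prompt wording, the two-line reply format, and the names `build_judge_prompt`, `parse_rating`, `rate`, and `fake_judge` are all assumptions made for illustration. The scale lists only the four levels named in these notes, and the real pattern may define more. The judge is passed in as a plain function so you can plug in whichever strong model you use.

```python
from typing import Callable, Tuple

# Only the four levels named in the episode notes, lowest to highest.
# Assumption: the real pattern may define additional levels in between.
HUMAN_LEVELS = ["uneducated", "high school", "PhD", "world-class human"]


def build_judge_prompt(task: str, task_input: str, output: str) -> str:
    """Assemble the text sent to the stronger (judge) model. Wording is hypothetical."""
    levels = ", ".join(HUMAN_LEVELS)
    return (
        "You are evaluating another AI's work.\n\n"
        f"TASK:\n{task}\n\n"
        f"INPUT:\n{task_input}\n\n"
        f"OUTPUT TO RATE:\n{output}\n\n"
        f"Rate the output on this human scale: {levels}.\n"
        "Reply with exactly two lines:\n"
        "LEVEL: <one level from the scale>\n"
        "TO SCORE HIGHER: <what the output would have needed>\n"
    )


def parse_rating(reply: str) -> Tuple[int, str, str]:
    """Return (rank, level, advice) parsed from the judge's two-line reply."""
    level, advice = "", ""
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        key = key.strip().upper()
        if key == "LEVEL":
            level = value.strip()
        elif key == "TO SCORE HIGHER":
            advice = value.strip()
    lowered = [name.lower() for name in HUMAN_LEVELS]
    if level.lower() not in lowered:
        raise ValueError(f"judge returned an unknown level: {level!r}")
    rank = lowered.index(level.lower())
    return rank, HUMAN_LEVELS[rank], advice


def rate(judge: Callable[[str], str], task: str, task_input: str, output: str) -> Tuple[int, str, str]:
    """Ask the judge model to grade one output produced by a weaker model."""
    return parse_rating(judge(build_judge_prompt(task, task_input, output)))


def fake_judge(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return "LEVEL: high school\nTO SCORE HIGHER: Cite sources and cover edge cases."


if __name__ == "__main__":
    print(rate(fake_judge, "Summarize the article", "article text", "a short summary"))
```

Swapping `fake_judge` for a function that calls your strongest model turns this into a working rater. The "TO SCORE HIGHER" line is the part that feeds the meta-feedback loop described in point 3.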

Note: This episode was recorded a few months ago, so the AI models mentioned may not be the latest—but the framework and methodology still work perfectly with current models.

Subscribe to the newsletter at:
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://x.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!

