Unsupervised Learning
19 Apr 2025

Using the Smartest AI to Rate Other AI

In this episode, I walk through a Fabric Pattern that assesses how well a given model performs on a task relative to humans. It uses your smartest AI model to evaluate other AIs by scoring their output across a range of tasks and comparing it to human intelligence levels.

I talk about:

1. Using One AI to Evaluate Another
The core idea is simple: use your most capable model (like Claude 3 Opus or GPT-4) to judge the outputs of another model (like GPT-3.5 or Haiku) against a task and input. This gives you a way to benchmark quality without manual review.

2. A Human-Centric Grading System
Models are scored on a human scale, from “uneducated” and “high school” up to “PhD” and “world-class human.” Stronger models consistently score higher and weaker ones lower, just as expected.

3. Custom Prompts That Push for Deeper Evaluation
The rating prompt includes instructions to emulate a 16,000+ dimensional scoring system, using expert-level heuristics and attention to nuance. The system also asks the evaluator to describe what would have been required to score higher, making this a meta-feedback loop for improving future performance.
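
To make the three points above concrete, here is a minimal sketch of the judge loop in Python. It is not the Fabric Pattern itself: the prompt wording, the two-line reply format, and the names `build_judge_prompt`, `parse_verdict`, `rate`, and `fake_judge` are all assumptions made for illustration. The scale lists only the four levels named in these notes, and the real pattern may define more. The evaluator is passed in as a plain callable, so any client for a strong model can be swapped in for the canned stand-in.

```python
from typing import Callable, Optional

# Only the levels named in the episode notes, lowest to highest.
# Assumption: the real pattern may define additional levels in between.
HUMAN_LEVELS = ["uneducated", "high school", "PhD", "world-class human"]


def build_judge_prompt(task: str, task_input: str, candidate_output: str) -> str:
    """Assemble the text sent to the evaluator model (hypothetical wording)."""
    levels = ", ".join(HUMAN_LEVELS)
    return (
        "You are grading another AI's work against a human scale.\n\n"
        f"TASK:\n{task}\n\n"
        f"INPUT:\n{task_input}\n\n"
        f"OUTPUT TO GRADE:\n{candidate_output}\n\n"
        f"First line: 'LEVEL: <one of: {levels}>'.\n"
        "Second line: 'TO SCORE HIGHER: <what the output would have needed>'."
    )


def parse_verdict(reply: str) -> dict:
    """Extract the level, its rank on the scale, and the improvement note."""
    verdict: dict[str, Optional[object]] = {
        "level": None,
        "rank": None,
        "to_score_higher": None,
    }
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().upper(), value.strip()
        if key == "LEVEL":
            for rank, name in enumerate(HUMAN_LEVELS):
                if value.lower() == name.lower():
                    verdict["level"], verdict["rank"] = name, rank
        elif key == "TO SCORE HIGHER":
            verdict["to_score_higher"] = value
    return verdict


def rate(
    judge: Callable[[str], str],
    task: str,
    task_input: str,
    candidate_output: str,
) -> dict:
    """Ask the judge model to grade one output and parse its reply."""
    prompt = build_judge_prompt(task, task_input, candidate_output)
    return parse_verdict(judge(prompt))


def fake_judge(prompt: str) -> str:
    """Stand-in for a call to your strongest model; returns a canned reply."""
    return "LEVEL: high school\nTO SCORE HIGHER: Cite sources and cover edge cases."


result = rate(
    fake_judge,
    task="Summarize the article in three sentences.",
    task_input="(article text goes here)",
    candidate_output="It is about AI.",
)
print(result)
```

Keeping the judge behind a callable means the same harness can grade several weaker models in a loop, and the `rank` field gives a simple number to compare them on.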

Note: This episode was recorded a few months ago, so the AI models mentioned may not be the latest—but the framework and methodology still work perfectly with current models.

Subscribe to the newsletter at:
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://x.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!


Episodes (541)

AI The Creative Workflow & The Dangers of Groupthink

A new study shows that while generative AI like ChatGPT makes individual stories more creative and engaging, it also makes them more similar to each other. | by Ben Dickson | MORE Subscribe to the new...

31 Aug 2024 · 5 min

UL NO. 447: Sam Curry on Bug Bounty Careers, Slack Data Exfil, The Work Lie

Stopping Chinese AI/Robot imports, Substrate for political platforms, sun vs. smoking, and more... Subscribe to the newsletter at: https://danielmiessler.com/subscribe Join the UL community at:https:/...

31 Aug 2024 · 32 min

Don’t Judge Yourself Based On What Companies Think of Your Skills

I watched a number of videos last night about people losing their jobs, starting a YouTube channel, and just generally struggling. People are hurting because they’re feeling the ground shifting under ...

29 Aug 2024 · 4 min

Microsoft Fires DEI Team & The Correct Approach To Diversity

Microsoft Lays Off DEI Team — Microsoft laid off its diversity, equity, and inclusion team, saying DEI is "no longer business critical." MORE Subscribe to the newsletter at: https://danielmiessler.com...

27 Aug 2024 · 2 min

UL NO. 446: AI Ecosystem Components, MS 0-Days, Iranian Campaign Hacks…

Political deepfakes are here, Grok2 is insane, weakness vs. evil, and more… Check out ThreatLocker to secure your data: threatlocker.com/ul Subscribe to the newsletter at: https://danielmiessler.com/...

22 Aug 2024 · 42 min

Introducing Substrate—An Open-source Framework for Human Understanding, Meaning, and Progress

This episode introduces Substrate—An Open-source Framework for Human Understanding, Meaning, and Progress. Substrate is a crowdsourced project designed to enhance understanding, communication, and ac...

9 Aug 2024 · 41 min

UL NO. 444: Pizza Meter Intelligence, China Bypasses Bans, Securing AWS Secrets…

What to expect at Blackhat/DEFCON, Identifying Explosives, OpenAI's new models, Llama 4 Timeline, and more… ➡ Check out Vanta and get $1000 off:vanta.com/unsupervised Subscribe to the newsletter at: ...

9 Aug 2024 · 24 min

Scaling Misinformation With AI

Daniel Miessler discusses how AI can grow the number of elite propagandists and hackers employed by foreign intelligence agencies. Discussed in this video: AI-Enhanced Software and Disinformation (00:...

7 Aug 2024 · 5 min
