#147 – Spencer Greenberg on stopping valueless papers from getting into top journals


Can you trust the things you read in published scientific research? Not really. About 40% of experiments in top social science journals don't get the same result if the experiments are repeated.

Two key reasons are 'p-hacking' and 'publication bias'. P-hacking is when researchers run a lot of slightly different statistical tests until they find a way to make findings appear statistically significant when they're actually not, a problem first discussed over 50 years ago. And because journals are more likely to publish positive than negative results, you might be reading about the one time an experiment worked, while the 10 times it was run and got a 'null result' never saw the light of day. The resulting phenomenon of publication bias is one we've understood for 60 years.
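To make the p-hacking arithmetic concrete, here is a minimal sketch in Python. It is not from the episode. It assumes the tests are independent and that there is no real effect, so each p-value is uniformly distributed between 0 and 1. Real p-hacking usually involves correlated tests on the same data, so treat the numbers as illustrative only. The function names and the 0.05 threshold are assumptions chosen for the example.

```python
import random


def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    comes out 'significant' when there is no real effect."""
    return 1 - (1 - alpha) ** num_tests


def simulate_false_positive_rate(
    num_tests: int, alpha: float = 0.05, trials: int = 20_000, seed: int = 0
) -> float:
    """Estimate the same probability by simulation.

    Assumption: with no real effect, each p-value is a uniform draw on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A hypothetical researcher tries `num_tests` analyses and
        # reports a finding if any one of them crosses the threshold.
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        exact = chance_of_false_positive(k)
        simulated = simulate_false_positive_rate(k)
        print(f"{k:>2} tests: exact {exact:.3f}, simulated {simulated:.3f}")
```

Under these assumptions, a single test gives a false positive 5% of the time, but trying 10 variations pushes the chance of at least one "significant" result to roughly 40%.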

Today's repeat guest, social scientist and entrepreneur Spencer Greenberg, has followed these issues closely for years.

Links to learn more, summary and full transcript.

He recently checked whether p-values, an indicator of how likely a result was to occur by pure chance, could tell us how likely an outcome would be to recur if an experiment were repeated. From his sample of 325 replications of psychology studies, the answer seemed to be yes. According to Spencer, "when the original study's p-value was less than 0.01 about 72% replicated — not bad. On the other hand, when the p-value is greater than 0.01, only about 48% replicated. A pretty big difference."

To do his bit to help get these numbers up, Spencer has launched an effort to repeat almost every social science experiment published in the journals Nature and Science, and see whether the replications find the same results.

But while progress is being made on some fronts, Spencer thinks there are other serious problems with published research that aren't yet fully appreciated. One of these Spencer calls 'importance hacking': passing off obvious or unimportant results as surprising and meaningful.

Spencer suspects that importance hacking of this kind causes a similar amount of damage to the issues mentioned above, like p-hacking and publication bias, but is much less discussed. His replication project tries to identify importance hacking by comparing how a paper’s findings are described in the abstract to what the experiment actually showed. But the cat-and-mouse game between academics and journal reviewers is fierce, and it's far from easy to stop people exaggerating the importance of their work.

In this wide-ranging conversation, Rob and Spencer discuss the above as well as:
• When you should and shouldn't use intuition to make decisions.
• How to properly model why some people succeed more than others.
• The difference between “Soldier Altruists” and “Scout Altruists.”
• A paper that tested dozens of methods for forming the habit of going to the gym, why Spencer thinks it was presented in a very misleading way, and what it really found.
• Whether a 15-minute intervention could make people more likely to sustain a new habit two months later.
• The most common way for groups with good intentions to turn bad and cause harm.
• And Spencer's approach to a fulfilling life and doing good, which he calls “Valuism.”

Here are two flashcard decks that might make it easier to fully integrate the most important ideas they talk about:
• The first covers 18 core concepts from the episode.
• The second includes 16 definitions of unusual terms.

Chapters:

  • Rob’s intro (00:00:00)
  • The interview begins (00:02:16)
  • Social science reform (00:08:46)
  • Importance hacking (00:18:23)
  • How often papers replicate with different p-values (00:43:31)
  • The Transparent Replications project (00:48:17)
  • How do we predict high levels of success? (00:55:26)
  • Soldier Altruists vs. Scout Altruists (01:08:18)
  • The Clearer Thinking podcast (01:16:27)
  • Creating habits more reliably (01:18:16)
  • Behaviour change is incredibly hard (01:32:27)
  • The FIRE Framework (01:46:21)
  • How ideology eats itself (01:54:56)
  • Valuism (02:08:31)
  • “I dropped the whip” (02:35:06)
  • Rob’s outro (02:36:40)

Producer: Keiran Harris
Audio mastering: Ben Cordell and Milo McGuire
Transcriptions: Katy Moore
