Flesch Kincaid Readability Tests
Data Skeptic, 19 April 2021

Given a document in English, how can you estimate how easy it will be for someone to read? Does it require college-level reading comprehension, or is it something a much younger student could read and understand?

While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the two (essentially identical) Flesch Kincaid readability tests: the Flesch reading ease score or the Flesch Kincaid grade level. Both are simple calculations based on average sentence length and average syllables per word, and both give only a rough estimate of reading ease.
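
As a rough illustration of how simple these calculations are, here is a minimal Python sketch of both tests using the standard published coefficients. The helper names (`count_syllables`, `text_stats`) and the vowel-group syllable heuristic are assumptions made for this example, not the method used in the episode; a dictionary-based syllable counter would be more accurate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent unless the word ends in 'le'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores indicate text that is easier to read."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level needed to read the text."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

Both functions use the same two ratios, words per sentence and syllables per word, and differ only in their weights, which is why the two tests tend to rank documents in a similar order.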

In this episode, Kyle shares his thoughts on this tool and when it could be appropriate to use as part of your feature engineering pipeline towards a machine learning objective.

For empirical validation of these metrics, the episode's plot compares English-language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes yields an intuitively pleasing histogram summarizing the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.
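
A comparison of this kind could be summarized along the following lines. This is only a sketch under stated assumptions: it takes reading ease scores that have already been computed, the function names `bin_scores` and `render_histogram` are hypothetical, and the bin width of 10 is an arbitrary choice rather than the one used in the episode.

```python
from collections import Counter


def bin_scores(scores, bin_width=10):
    """Count scores per bin, keyed by the lower edge of each bin."""
    return Counter(int(score // bin_width) * bin_width for score in scores)


def render_histogram(scores, bin_width=10):
    """Return one text line per non-empty bin, in ascending order."""
    counts = bin_scores(scores, bin_width)
    return [
        f"{edge:>4} to {edge + bin_width:>4} | {'#' * counts[edge]}"
        for edge in sorted(counts)
    ]
```

Calling `render_histogram` once on the scores from each Wikipedia and printing the two results one after the other gives a quick text-only view of how far apart the two distributions sit.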

Episodes (601)

Streetlight Outage and Crime Rate Analysis with Zach Seeskin

This episode features a discussion with statistics PhD student Zach Seeskin about a project he was involved in as part of the Eric and Wendy Schmidt Data Science for Social Good Summer Fellowship. Th...

18 July 2014, 33 min

[MINI] Experimental Design

This episode loosely explores the topic of experimental design, including hypothesis testing, the importance of statistical tests, and both an everyday example and a business example.

11 July 2014, 15 min

The Right (big data) Tool for the Job with Jay Shankar

In this week's episode, we discuss applied solutions to big data problems with big data engineer Jay Shankar. The episode explores approaches and design philosophy to solving real world big data busin...

7 July 2014, 49 min

[MINI] Bayesian Updating

In this minisode, we discuss Bayesian Updating: the process by which one can calculate how likely a hypothesis is to be true, given one's prior belief and all new evidence.

27 June 2014, 11 min

Personalized Medicine with Niki Athanasiadou

In the second full-length episode of the podcast, we discuss the current state of personalized medicine and the advancements in genetics that have made it possible.

20 June 2014, 57 min

[MINI] p-values

In this mini, we discuss p-values and their use in hypothesis testing, in the context of a hypothetical experiment on plant flowering, and end with a reference to the Particle Fever documentary and h...

13 June 2014, 16 min

Advertising Attribution with Nathan Janos

A conversation with Convertro's Nathan Janos about methodologies used to help advertisers understand the effect each of their marketing efforts (print, SEM, display, skywriting, etc.) contributes to t...

6 June 2014, 1 h 16 min

[MINI] type i / type ii errors

In this first mini-episode of the Data Skeptic Podcast, we define and discuss type I and type II errors (a.k.a. false positives and false negatives).

30 May 2014, 11 min
