Flesch Kincaid Readability Tests
Data Skeptic19 Apr 2021

Flesch Kincaid Readability Tests

Given a document in English, how can you estimate the ease with which someone will find they can read it? Does it require a college-level of reading comprehension or is it something a much younger student could read and understand?

While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the (essentially identical) two Flesch Kincaid Readability Tests. These are simple calculations which provide you with a rough estimate of the reading ease.

In this episode, Kyle shares his thoughts on this tool and when it could be appropriate to use as part of your feature engineering pipeline towards a machine learning objective.

For empirical validation of these metrics, the plot below compares English language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes in this episode yields the intuitively pleasing histogram below. It summarizes the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.

Denne episoden er hentet fra en åpen RSS-feed og er ikke publisert av Podme. Den kan derfor inneholde annonser.

Episoder(601)

Shadow Profiles on Social Networks

Shadow Profiles on Social Networks

Emre Sarigol joins me this week to discuss his paper Online Privacy as a Collective Phenomenon. This paper studies data collected from social networks and how the sharing behaviors of individuals can ...

13 Feb 201538min

[MINI] The Chi-Squared Test

[MINI] The Chi-Squared Test

The Chi-Squared test is a methodology for hypothesis testing. When one has categorical data, in the form of frequency counts or observations (e.g. Vegetarian, Pescetarian, and Omnivore), split into tw...

6 Feb 201517min

Mapping Reddit Topics with Randy Olson

Mapping Reddit Topics with Randy Olson

My quest this week is noteworthy a.i. researcher Randy Olson who joins me to share his work creating the Reddit World Map - a visualization that illuminates clusters in the reddit community based on u...

30 Jan 201529min

[MINI] Partially Observable State Spaces

[MINI] Partially Observable State Spaces

When dealing with dynamic systems that are potentially undergoing constant change, its helpful to describe what "state" they are in.  In many applications the manner in which the state changes from on...

23 Jan 201512min

Easily Fooling Deep Neural Networks

Easily Fooling Deep Neural Networks

My guest this week is Anh Nguyen, a PhD student at the University of Wyoming working in the Evolving AI lab. The episode discusses the paper Deep Neural Networks are Easily Fooled [pdf] by Anh Nguyen,...

16 Jan 201528min

[MINI] Data Provenance

[MINI] Data Provenance

This episode introduces a high level discussion on the topic of Data Provenance, with more MINI episodes to follow to get into specific topics. Thanks to listener Sara L who wrote in to point out the ...

9 Jan 201510min

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

I had the change to speak with well known Sharon Hill (@idoubtit) for the first episode of 2015. We discuss a number of interesting topics including the contributions Doubtful News makes to getting sc...

3 Jan 201531min

[MINI] Belief in Santa

[MINI] Belief in Santa

In this quick holiday episode, we touch on how one would approach modeling the statistical distribution over the probability of belief in Santa Claus given age.

26 Des 20149min

Populært innen Vitenskap

fastlegen
tingenes-tilstand
jss
forskningno
rekommandert
rss-zahid-ali-hjelper-deg
rss-paradigmepodden
sinnsyn
vett-og-vitenskap-med-gaute-einevoll
rss-overskuddsliv
nordnorsk-historie
kvinnehelsepodden
tidlose-historier
villmarksliv
liberal-halvtime
rss-inn-til-kjernen-med-sunniva-rose
fjellsportpodden
grunnstoffene
nevropodden
rss-rekommandert