Flesch Kincaid Readability Tests
Data Skeptic, 19 Apr 2021

Given a document in English, how can you estimate how easy it is to read? Does it require college-level reading comprehension, or is it something a much younger student could read and understand?

While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the two (essentially identical) Flesch Kincaid Readability Tests. These are simple calculations that provide a rough estimate of reading ease.
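As a rough illustration of how simple these calculations are, here is a minimal Python sketch of the two commonly published formulas, the Flesch reading ease score and the Flesch Kincaid grade level. The constants are the standard ones; everything else is an assumption made for illustration. In particular, the function names, the regex-based sentence and word splitting, and the vowel-group syllable counter are hypothetical helpers, not the method used in the episode.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Drop a trailing silent "e" (e.g. "make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using naive regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores mean easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level; higher means harder text."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59
```

Both formulas depend on the same two ratios, words per sentence and syllables per word, which is why the two tests tend to rank documents in nearly the same order. Scores from this sketch will differ somewhat from other tools, because syllable counting and sentence splitting are only approximated here.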

In this episode, Kyle shares his thoughts on this tool and when it may be appropriate to use it as part of a feature engineering pipeline for a machine learning objective.

For empirical validation of these metrics, the episode's accompanying plot compares English-language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes in this episode yields an intuitively pleasing histogram. It summarizes the distribution of Flesch reading ease scores for 1000 pages examined from both Wikipedias.
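A comparison like this can be sketched by binning the scores from each collection and looking at the two distributions side by side. The Python below is a minimal sketch under stated assumptions: the `histogram` and `render` helpers and the bin width of 10 are hypothetical choices, and it expects reading ease scores that have already been computed. It does not reproduce the episode's analysis or its data.

```python
from collections import Counter


def histogram(scores, bin_width=10.0):
    """Count scores per bin; each bin is keyed by its lower edge."""
    counts = Counter()
    for score in scores:
        lower = (score // bin_width) * bin_width
        counts[lower] += 1
    return dict(sorted(counts.items()))


def render(label, scores, bin_width=10.0):
    """Return a plain-text histogram, one row of '#' marks per bin."""
    lines = [label]
    for lower, n in histogram(scores, bin_width).items():
        lines.append(f"{lower:6.0f} to {lower + bin_width:<6.0f} {'#' * n}")
    return "\n".join(lines)
```

To use it, call `render("English Wikipedia", english_scores)` and `render("Simple English Wikipedia", simple_scores)`, where the two lists are placeholders for scores you compute yourself. If the metric behaves as intended, the Simple English distribution should sit at higher reading ease values.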

Episodes (601)

[MINI] Natural Language Processing

This episode gives an overview of some of the fundamental concepts of natural language processing, including stemming, n-grams, part-of-speech tagging, and the bag-of-words approach.

17 Apr 2015, 13 min

Computer-based Personality Judgments

Guest Youyou Wu discusses the work she and her collaborators did to measure the accuracy of computer-based personality judgments. Using Facebook "like" data, they found that machine learning approaches...

10 Apr 2015, 31 min

[MINI] Markov Chain Monte Carlo

This episode explores how going wine testing could teach us about using Markov chain Monte Carlo (MCMC).

3 Apr 2015, 15 min

[MINI] Markov Chains

This episode introduces the idea of a Markov Chain. A Markov Chain has a set of states describing a particular system, and a probability of moving from one state to another along every valid connected...

20 Mar 2015, 11 min

Oceanography and Data Science

Nicole Goebel joins us this week to share her experiences in oceanography, studying phytoplankton and other aspects of the ocean, and how data plays a role in that science. We also discuss Thinkful ...

13 Mar 2015, 33 min

[MINI] Ordinary Least Squares Regression

This episode explores Ordinary Least Squares (OLS), a method for finding a good fit that describes a given dataset.

6 Mar 2015, 18 min

NYC Speed Camera Analysis with Tim Schmeier

New York State approved the use of automated speed cameras within a specific range of schools. Tim Schmeier did an analysis of publicly available data related to these cameras as part of a project a...

27 Feb 2015, 16 min

[MINI] k-means clustering

The k-means clustering algorithm computes a deterministic label for a given "k" number of clusters from an n-dimensional dataset. This mini-episode explores how Yoshi, our lilac c...

20 Feb 2015, 14 min
