Computing Toolbox
Data Skeptic, 19 Aug 2024

This season it's become clear that computing skills are vital for working in the natural sciences. In this episode, we were fortunate to speak with Madlen Wilmes, co-author of the book "Computing Skills for Biologists: A Toolbox". We discussed the book and why it's a great resource for students and teachers. Madlen also shared her experience and advice on transitioning from academia to an industry career, and on how data analytic skills transfer to jobs that young professionals might not always consider. Join us and learn more about the book and careers using transferable skills.

Episodes (588)

[MINI] Receiver Operating Characteristic (ROC) Curve

An ROC curve is a plot that shows the trade-off between the true positive rate and the false positive rate of a binary classifier at different thresholds. The area under the curve (AUC) is useful in determining how discriminating a model is. Together, ROC and AUC are very useful diagnostics for understanding the power of one's model and how to tune it.
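
As a rough illustration of the idea, the Python sketch below sweeps a threshold over a classifier's scores to trace ROC points and then estimates the AUC with the trapezoid rule. The function names `roc_points` and `auc` and the toy labels and scores are assumptions made up for this example; they do not come from the episode. The sketch also assumes the data contains at least one positive and one negative label.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels: list of 0/1 ground-truth values (hypothetical toy data).
    scores: list of classifier scores, higher meaning "more positive".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by score, highest first, so lowering the threshold admits one
    # group of equally scored examples at a time.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC points using the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
print(curve)
print(round(area, 4))
```

An AUC near 1.0 means positives are almost always scored above negatives, while a value near 0.5 is what random scoring would give.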

15 July 2016, 11 min

Multiple Comparisons and Conversion Optimization

I'm joined by Chris Stucchio this week to discuss how deliberate or uninformed statistical practitioners can derive spurious and arbitrary results via multiple comparisons. We discuss p-hacking and a variety of other important lessons and tips for proper analysis. You can enjoy Chris's writing on his blog at chrisstucchio.com and you may also like his recent talk "Multiple Comparisons: Make Your Boss Happy with False Positives, Guaranteed".
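
To make the multiple comparisons problem concrete, here is a minimal Python sketch of how the chance of at least one false positive grows with the number of tests. It assumes the tests are independent and that every null hypothesis is true. The function names are hypothetical, and the Bonferroni correction shown is a standard remedy chosen for illustration, not necessarily one discussed in the episode.

```python
def family_wise_error_rate(alpha, num_tests):
    """Probability of at least one false positive across independent tests,
    each run at significance level alpha, when every null is true."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha (Bonferroni correction)."""
    return alpha / num_tests


alpha = 0.05
num_tests = 20

uncorrected = family_wise_error_rate(alpha, num_tests)
per_test = bonferroni_threshold(alpha, num_tests)
corrected = family_wise_error_rate(per_test, num_tests)

print(f"Uncorrected chance of a false positive: {uncorrected:.3f}")
print(f"Bonferroni per-test threshold: {per_test:.4f}")
print(f"Corrected chance of a false positive: {corrected:.3f}")
```

Under these assumptions, running twenty tests at the usual 0.05 level gives a false positive somewhere more often than not, which is why an uncorrected search across many variants can "find" an effect that is not there.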

8 July 2016, 30 min

[MINI] Leakage

If you'd like to make a good prediction, your best bet is to invent a time machine, visit the future, observe the value, and return to the past. For those without access to time travel technology, we need to avoid including information about the future in our training data when building machine learning models. Similarly, any other feature whose value would not actually be available in practice at the time you'd want to use the model to make a prediction is a feature that can introduce leakage into your model.
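
One common way to guard against both kinds of leakage is sketched below in Python: split the data by time so the model never trains on the future, and keep only features that would exist at prediction time. Everything here is an assumption for illustration, including the helper names, the field names (`observed_on`, `visits_last_month`, `cancellation_reason`, `churned`), and the toy rows.

```python
from datetime import date


def time_based_split(rows, cutoff):
    """Train on rows observed before the cutoff, evaluate on the rest,
    so no information from the future reaches the training set."""
    train = [r for r in rows if r["observed_on"] < cutoff]
    test = [r for r in rows if r["observed_on"] >= cutoff]
    return train, test


def keep_available_features(row, available_at_prediction):
    """Drop any feature that would not be known when the model is used."""
    return {k: v for k, v in row.items() if k in available_at_prediction}


# Hypothetical churn data. "cancellation_reason" is only filled in after a
# customer has already churned, so it leaks the label.
rows = [
    {"observed_on": date(2016, 4, 1), "visits_last_month": 9,
     "cancellation_reason": None, "churned": 0},
    {"observed_on": date(2016, 5, 1), "visits_last_month": 1,
     "cancellation_reason": "price", "churned": 1},
    {"observed_on": date(2016, 6, 15), "visits_last_month": 4,
     "cancellation_reason": None, "churned": 0},
]

train, test = time_based_split(rows, cutoff=date(2016, 6, 1))
allowed = {"visits_last_month", "churned"}
clean_train = [keep_available_features(r, allowed) for r in train]

print(len(train), len(test))
print(clean_train)
```

Which features count as "available" depends entirely on when and how the model would be used, so the allowed set has to be decided per application.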

1 July 2016, 12 min

Predictive Policing

Kristian Lum (@KLdivergence) joins me this week to discuss her work at @hrdag on predictive policing. We also discuss Multiple Systems Estimation, a technique for inferring statistical information about a population from separate sources of observation. If you enjoy this discussion, check out the panel Tyranny of the Algorithm? Predictive Analytics & Human Rights which was mentioned in the episode.

24 June 2016, 36 min

[MINI] The CAP Theorem

A distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance. Most system architects need to think carefully about how they should appropriately balance the needs of their application across these competing objectives. Linh Da and Kyle discuss the CAP Theorem using the analogy of a phone tree for alerting people about a school snow day.

17 June 2016, 10 min

Detecting Terrorists with Facial Recognition?

A startup is claiming that they can detect terrorists purely through facial recognition. In this solo episode, Kyle explores the plausibility of these claims.

10 June 2016, 33 min

[MINI] Goodhart's Law

Goodhart's law states that "When a measure becomes a target, it ceases to be a good measure". In this mini-episode we discuss how this affects SEO, call centers, and Scrum.

3 June 2016, 10 min

Data Science at eHarmony

I'm joined this week by Jon Morra, director of data science at eHarmony, to discuss a variety of ways in which machine learning and data science are being applied to help connect people for successful long-term relationships. Interesting open source projects mentioned in the interview include Face-parts, a web service for detecting faces and extracting a robust set of fiducial markers (features) from the image, and Aloha, a Scala-based machine learning library. You can learn more about these and other interesting projects at the eHarmony GitHub page. In the wrap-up, Jon mentioned the LA Machine Learning meetup, which he runs. This is a great resource for LA residents, separate from and complementary to the datascience.la groups, so consider signing up for all of the above. I hope to see you there in the future.

27 May 2016, 42 min
