Learning Machine Learning... with Kaggle

Written by William Dang

Published 2 years ago

Edited 4 months ago

Somehow, in my one and a half years as a student studying a Bachelor of Data Science and Decisions, I’ve managed to learn close to nothing about what Machine Learning actually is (unless you count the one or two DataSoc workshops I’ve attended). So, because I’ve had a term off uni (and am in desperate need of upskilling), I decided to check out Kaggle’s Introduction to Machine Learning course, and see what it’s all about.

Kaggle

Most of you have probably heard of Kaggle by now - but if not, it’s a free website where you can find datasets, and use them to do data science projects, or even data science competitions. You can even create and store your own Jupyter notebooks on the site.

But it’s probably better to start with Kaggle’s tutorial courses if you’re looking to pick up the skills you need to get started on these projects, or to upskill further. Kaggle offers short courses that take 3-4 hours to complete. You even get a little certificate at the end that you can upload to LinkedIn!

Some of the various courses available

The certificate I got for finishing the Introduction to Machine Learning course

A look at one of the courses: Introduction to Machine Learning

The courses are split into two components: tutorials and exercises. For each chapter of the course, you read through the relevant theory section in the tutorial, then complete the corresponding exercise. The exercise is usually a coding task in a Jupyter notebook, with all the starter code done for you.

Excerpt from the tutorial about Decision Trees

Excerpt from the tutorial about Random Forests

The coding exercises are where I spent the bulk of my time. In these coding exercises, they very much hold your hand, even offering you hints or the solution if you’re feeling stuck.

The practical coding exercises

A lot of the questions involve 1, 2 or at most 3 lines of code (perhaps that’s just the nature of Python). And a lot of the answers involve just changing variable names from the tutorial code, which is not super interesting to be honest.

Overall thoughts

For a 3 hour crash course in machine learning, I wasn’t expecting to learn all that much, to be honest. There’s so much theory behind machine learning that it’d be pretty difficult to squish it all in, and at times it definitely feels like the course skims over it. Instead, the main emphasis of the course is on actually coding: in this case, using sklearn to build decision tree and random forest models. I think this is probably for the best, as you wouldn’t be using the theory too much in your job or project anyway.
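To give a flavour of what building decision tree and random forest models with sklearn looks like, here is a minimal sketch. It is not taken from the course itself: it assumes scikit-learn is installed, it uses synthetic data from make_regression in place of a real Kaggle dataset, and the helper name validation_mae is made up for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for a real dataset (an assumption).
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)

# Hold out part of the data so each model is scored on rows it has not seen.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)


def validation_mae(model):
    """Fit the model on the training split and return its validation MAE."""
    model.fit(train_X, train_y)
    predictions = model.predict(val_X)
    return mean_absolute_error(val_y, predictions)


# A single decision tree versus a forest of 100 trees.
tree_mae = validation_mae(DecisionTreeRegressor(random_state=0))
forest_mae = validation_mae(RandomForestRegressor(n_estimators=100, random_state=0))

print(f"Decision tree validation MAE: {tree_mae:.2f}")
print(f"Random forest validation MAE: {forest_mae:.2f}")
```

Most of the work is in the fit, predict and score pattern, which is the same for both models. That goes some way to explaining why so many exercise answers only need a line or two.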

So overall, doing one of these courses isn’t the worst use of 3 hours of your time. You won’t learn as much as you would in, say, a university course, but you do pick up a fair amount, plus a certificate for LinkedIn to boot.