Machine Learning & Data Analytics

Exploration 01: Netflix Movies

Netflix Photo by Mollie Sivaram on Unsplash

Overview

In this data exploration, you will:

Throughout this exploration, when you're asked to use a new function or library, we'll usually provide a link to that function's documentation, or a tutorial related to it.

Introduction

We'll be using Google Colab for most of our data exploration and modeling .

This assignment will teach you how Google Colab works and give you some practice using it. You'll also get some practice using the Python language, as well as the pandas library.

Learning Python and Friends

The assignments in this course require a fair amount of Python coding as well as the use of a few popular Python-based data science tools. We don't spend much time in this class explicitly learning Python or these tools, for a few reasons:

However, we have put together a guide containing some good references and tutorials, as well as a guide with tips for reading technical documentation.

Introduction to Google Colab

Google Colab is a cloud-based data science development environment. It includes support for the most widely used libraries and tools in data science, and provides access to optimized hardware for data analysis.

Google Colab uses "Jupyter Notebooks" for data exploration and analysis. If you've never heard of those before, Jupyter Notebooks are interactive documents that contain a mix of text and python code "cells".

Local Development vs Google Colab

While it's possible to use Jupyter notebooks locally, Google Colab removes all of the painful setup and configuration required to make this work, by providing free access to cloud-based compute instances that have all the standard data science libraries pre-installed and configured for use.

You can opt to do your development locally, but keep in mind that if you run into problems with your local configuration, you'll be largely on your own to sort that out.

The following video will walk you through how Google Colab works, some things to watch out for when using it, and how to submit your assignments.

Assignment

In this data exploration, you'll use the pandas library to answer some questions about Netflix movies and shows.

Click on the Open In Colab button below to open a Google Colab notebook with the template for this assignment. Once you've completed the assignment, don't forget to take the corresponding quiz in Canvas.

Open In Colab

Teacher's Solution

Once you have absolutely exhausted all of your best efforts in solving the data exploration problems, and you are stuck on where to go next, you can view the teacher's solution here.