Machine Learning & Data Analytics

Module 06 — Self Driving Cars, Project

Overview

After a few more meetings, your team has been assigned to address the following requests from the stakeholders:

Thomas, COO of HackPressIO

So the main thing we need at this point is a proof-of-concept model that shows we could, with enough work, generate full texts in the style and voice of a particular author.

Monika, Senior Developer

I agree. The original team used Jane Austen as their training corpus, but you could use any author's work you find at Project Gutenberg. Just be sure to clean up the data appropriately.
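As a rough illustration of the cleanup Monika mentions, the sketch below strips the license header and footer from a Project Gutenberg plain-text file and normalizes whitespace. It assumes the file uses the common "*** START OF THE/THIS PROJECT GUTENBERG EBOOK ..." and "*** END OF ..." marker lines; the wording varies between files, so check your own downloads. The function names are hypothetical, not part of the starter code.

```python
import re

# Marker lines commonly found in Project Gutenberg plain-text files.
# Assumption: the exact wording differs between files, so adjust as needed.
_START = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
_END = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Keep only the text between the START and END marker lines.

    If a marker is missing, fall back to the start or end of the file.
    """
    start = _START.search(raw)
    begin = start.end() if start else 0
    end = _END.search(raw, begin)
    finish = end.start() if end else len(raw)
    return raw[begin:finish]


def clean_text(raw: str) -> str:
    """Hypothetical cleanup helper: drop boilerplate, tidy whitespace."""
    text = raw.replace("\r\n", "\n")           # normalize line endings
    text = strip_gutenberg_boilerplate(text)
    text = re.sub(r"[ \t]+", " ", text)        # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)     # keep at most one blank line
    return text.strip()
```

Depending on the author, you may also want to remove chapter headings, tables of contents, or transcriber notes before training; those steps are specific to each book and are not covered here.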

Thomas, COO of HackPressIO

I think that in order for me to feel good about whatever pipeline is being developed, I'd want to see works in the style of more than one author, at least two or three.

Johnny, the Data Science Intern

Which means a separate network trained on each author's works...

It might be a good idea to define one network architecture that is used for all authors, and then save the trained model for each author, so you can load a given author's profile into the network whenever you want. But we'll leave the specifics up to you.
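One possible way to follow Johnny's suggestion is sketched below: a single function builds the shared architecture, and each author's trained weights are written to their own file. This is only a sketch under assumptions. The layer sizes, `VOCAB_SIZE`, `SEQ_LEN`, the `profiles` directory, and the helper names are all hypothetical, and it assumes every author's text is encoded with the same character vocabulary so the weights fit the same network.

```python
import os

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 64  # hypothetical: character vocabulary shared by all authors
SEQ_LEN = 40     # hypothetical: characters of context per training example


def build_model() -> keras.Model:
    """Return the one architecture reused for every author."""
    model = keras.Sequential([
        keras.Input(shape=(SEQ_LEN,)),
        layers.Embedding(VOCAB_SIZE, 32),
        layers.LSTM(64),
        layers.Dense(VOCAB_SIZE, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


def weights_path(author: str, directory: str = "profiles") -> str:
    """Build the file path for one author's weights."""
    return os.path.join(directory, f"{author}.weights.h5")


def save_profile(model: keras.Model, author: str, directory: str = "profiles") -> None:
    """Write the trained weights for one author to disk."""
    os.makedirs(directory, exist_ok=True)
    model.save_weights(weights_path(author, directory))


def load_profile(author: str, directory: str = "profiles") -> keras.Model:
    """Rebuild the shared architecture and load one author's weights into it."""
    model = build_model()
    model.load_weights(weights_path(author, directory))
    return model
```

In use, you would call `build_model()`, train it on one author's sequences, call `save_profile(model, "austen")`, and later call `load_profile("austen")` to generate text in that voice. Saving weights only keeps the files small, but the architecture code must stay identical between saving and loading.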

More Tips from Johnny

keras vs tf.keras

Don't forget the warning from the last module: Keras used to be a standalone library, but since the release of TensorFlow 2.0 in September 2019 it has been part of Google's TensorFlow library, where it is available as tf.keras.

Keep that in mind if you're looking at any tutorial that was written prior to that date. Most of the API and functions will be the same, but your import statements will likely be different.

For more information, see this article on the change.
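To make the import difference concrete, here is a small sketch assuming a TensorFlow 2.x installation. The commented lines show the standalone style you may see in older tutorials; the tiny model exists only to show that the rest of the code looks the same, and its layer sizes are arbitrary.

```python
# Standalone Keras imports, as seen in tutorials written before TensorFlow 2.0:
#   from keras.models import Sequential
#   from keras.layers import LSTM, Dense

# Equivalent imports when Keras is used through TensorFlow 2.x:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

# After the imports, the model-building code is typically unchanged.
model = Sequential([
    Input(shape=(10, 8)),  # 10 time steps, 8 features per step (arbitrary)
    LSTM(16),
    Dense(1),
])
```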

Starter Code

This Colab notebook contains the starter code left by the previous team.

There may be better approaches than what that notebook is doing, but it will at least get you started.

Saving Models

There are multiple ways to save a Keras model.
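Two of the common options are sketched below, assuming a recent TensorFlow 2.x release: saving the whole model to a single file, or saving only the weights. The toy model, file names, and temporary directory are placeholders for illustration and are not taken from the starter notebook.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# A toy model standing in for your trained network.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(4, activation="relu"),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

out_dir = tempfile.mkdtemp()  # placeholder location for the saved files

# Option 1: save the whole model (architecture, weights, compile settings).
full_path = os.path.join(out_dir, "demo_model.keras")
model.save(full_path)
restored = keras.models.load_model(full_path)

# Option 2: save the weights only. The architecture must be rebuilt in
# code before calling load_weights() on the new model.
weights_file = os.path.join(out_dir, "demo.weights.h5")
model.save_weights(weights_file)
```

Whole-model files are convenient because nothing else is needed to reload them. Weights-only files fit the "one architecture, many author profiles" idea, since the same network definition can load any author's weights.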

Johnny, the Data Science Intern, catches you after work:

Hey, I know you're probably busy, so I put a bunch of comments and explanations in the code left behind by the previous team, so make sure you read through those. Also, make sure you review the RNN tutorials from the reading assignment.


  1. COO photo by Jonas Kakaroto on Unsplash 

  2. Senior Developer photo by Mimi Thian on Unsplash 

  3. Data Science Intern photo by Fábio Lucas on Unsplash