CSCI 315 Assignment #4

Due 11:59PM Monday 30 October, via Sakai


The goal of this assignment is to become familiar with TensorFlow, Google's state-of-the-art package for Deep Learning. We will also step up our game, moving from Mozer's simplified 14x14-pixel MNIST digit set to the full-sized 28x28 set. Scaling up to real-world tools and datasets also means that you will have to switch from working on your laptop to working on a computer with a GPU and lots of memory. The computers in the Advanced Lab (Parmly 413) have been set up to provide these capabilities.

The good news is, the code is mostly already written for you and accessible online from the author's github repository. The bad news is, TensorFlow is evolving so rapidly that the code may already be out of date or undocumented by the time you're ready to start the assignment. Hence, a large part of these TensorFlow assignments will be learning the very practical skill of getting almost-working code to work.

Getting Started

To get started, clone the entire code repository for the book. Although there are several ways of doing this, such as downloading a zipfile, the most common, reliable way is to open a terminal (Linux or Mac OS X) or PowerShell window (Windows) and issue (copy/paste) the following command:
  git clone https://github.com/darksigma/Fundamentals-of-Deep-Learning-Book
Now you've got the whole repository in a convenient folder called Fundamentals-of-Deep-Learning-Book. Inside that folder you'll see fdl_examples/chapter3.

Part 1: Logistic Regression in TensorFlow

Using IDLE3, open the logistic-regression script in fdl_examples/chapter3. Hitting F5 to run the code, you'll immediately run into some errors having to do with the directory layout, so your first task will be to copy/paste/modify the directories to get this script to work. Hint: locate the data and datatools folders, move them into the same folder as the scripts, and modify the script accordingly. Once you've got the script working, you should see it complete with a success rate of around 0.92. You may also see some red deprecation warnings at the start of the run; if so, google the warnings and fix them for full credit on this part.
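If you'd rather script the folder shuffling than do it by hand, the hint above can be sketched with the standard library. The helper name fix_layout and the exact repo layout are my assumptions; adjust the paths to match what you actually find after cloning.

```python
import os
import shutil

def fix_layout(repo_dir, script_dir):
    """Move the 'data' and 'datatools' folders (if found under repo_dir)
    into script_dir, so the chapter-3 scripts can find them directly.
    Returns the resulting contents of script_dir for a quick sanity check."""
    for name in ("data", "datatools"):
        src = os.path.join(repo_dir, name)
        dst = os.path.join(script_dir, name)
        if os.path.isdir(src) and not os.path.exists(dst):
            shutil.move(src, dst)
    return sorted(os.listdir(script_dir))
```

After moving the folders, the import at the top of the script still needs to be edited by hand to match the new layout.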

Before moving on from logistic regression, change the number of epochs (iterations) at the top from 60 to 100, and make a note of your result, which you'll need in the final part below.

Part 2: Back-propagation in TensorFlow

Now let's try the other script, the multilayer perceptron. As with Part 1, there will be an import at the top that you'll have to modify slightly to work with the directory structure you've created, plus the same small modification to get rid of the warnings.

Depending on your hardware setup, this multilayer perceptron model can take a looooooong time to complete its 1,000 training epochs (iterations) – and the results aren't even that good! So, after noting your final test success for 1,000 epochs, reduce the number of epochs to 100, so you can get your result quicker.

Part 3: Visualization with TensorBoard

One of the cool features of TensorFlow that people rave about is TensorBoard, its browser-based visualization tool. (If you have 25 spare minutes, this video is a great introduction.)

Look over the textbook page to see how you can get TensorBoard to visualize your logistic-regression or multilayer-perceptron results in a web browser (either one is okay). Once you've got the TensorBoard display in your browser, try clicking the different tabs to see the kinds of visualizations it provides. As evidence of your success, create two screenshots: one using the SCALARS tab to show a plot of the cost and validation error, and another using the GRAPHS tab to show the graph of your neural network. (If you don't know how to do a screenshot on your computer, google it, and if you still have trouble, ask me!) Later you will include these screenshots in a brief writeup.
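Launching TensorBoard from a terminal looks roughly like the following. The log-directory name here is only an assumption; use whatever directory your script passes to its summary writer, and note that the port flag is optional (6006 is the default).

```shell
# Point TensorBoard at the directory your script writes its summaries to
# ("logistic_logs" is a placeholder -- check the logdir in your own script),
# then open the printed localhost URL in your browser.
tensorboard --logdir=logistic_logs --port=6006
```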

Part 4: Creating a different network

To get a better feel for building neural nets in TensorFlow, let's modify the multilayer perceptron network and see how it does on the same data set.

If you haven't already changed the number of epochs in the multilayer-perceptron script to 100, do that now. Then copy it to a new script. Run that script once just to make sure it's doing what you expect. Then modify it to use only hidden layer 1 (the one with 256 units). When you're done with these small modifications, there should be no more code referring to hidden layer 2. Then run your modified network a few times to see how it does.
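One way to sanity-check your modification is to count parameters: removing hidden layer 2 should delete exactly that layer's weights and biases, while layer 1's output (256 units) now feeds the output layer directly. The sketch below assumes hidden layer 2 also has 256 units, as in common versions of this example; substitute the sizes from your own script.

```python
def param_count(layer_sizes):
    """Total weights + biases for a fully connected net whose layer widths
    are given in order from input to output."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix plus bias vector
    return total

# Original multilayer perceptron: 784 inputs, two hidden layers, 10 outputs
original = param_count([784, 256, 256, 10])
# Modified network: hidden layer 2 removed, layer 1 connects to the output layer
modified = param_count([784, 256, 10])
print(original, modified, original - modified)
```

The difference between the two counts should equal hidden layer 2's own parameters (its 256x256 weight matrix plus 256 biases); if it doesn't, some code referring to layer 2 is probably still lurking in your script.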

Part 5: Brief writeup

Using MS Word or your favorite editor, put together a little writeup summarizing your results and save it as a PDF, assignment4.pdf. Your writeup should include your two TensorBoard images from Part 3, plus a brief discussion comparing the final testing success for the four training runs you performed. For the two multilayer perceptron networks, I recommend running each five times, to get a sense of their average behavior. (The 1000-epoch test just takes too long, and for logistic regression the 92% success rate seems pretty consistent.) Specifically:
  1. Did running many more (1,000 vs 100) epochs yield better or worse results for the original multilayer perceptron?
  2. Did the multilayer perceptron do better or worse than logistic regression when you ran them both for 100 epochs?
  3. Did decreasing the number of hidden layers reduce the success of the multilayer perceptron?
  4. What general lesson might you deduce from your answers to these three questions?

What to submit to Sakai

Zip up everything into a single file that I can simply download, unzip, and use to find and run your working versions of all three scripts, with no errors or warnings. What you need to include will depend on how you solved the problem, so be sure to download your own submission, unzip it, and run everything a final time to be certain! Include your little writeup with the TensorBoard visualizations in the zipfile as well. Note that in addition to looking at your writeup I will test all three of your scripts; as usual, if you don't test your submission by downloading, unzipping, and running it yourself, you risk getting a zero on this assignment for a silly mistake!