CSCI 315 Assignment #5

Due 11:59PM Wednesday 08 November, via Sakai

Goal

The goal of this assignment is to get some practice working with convolutional neural networks in TensorFlow. As in the previous assignment, we will begin by modifying existing code from the author, instead of attempting to write the whole program from scratch.

Getting Started

The author's repository that you downloaded last time does not have example code for his Chapter 5 on convolutional networks in the expected location (fdl_examples). But if you dig around in his archive directory, you'll see a few scripts, like convnet_mnist.py, that look promising. So, to get started, copy convnet_mnist.py to a new, empty directory (the one you'll zip up and submit to Sakai at the end). Then open your copy of convnet_mnist.py in IDLE3, and you're ready to begin.

Part 1: Get it working without errors or warnings

As expected from last time, hitting F5 in the convnet_mnist.py script will produce some errors and warnings. Some of these are caused by missing folders or helper scripts, but others come from the use of obsolete Python syntax (probably why this script is in the archive). So, as before, do what you have to do to fix the errors so you can get to the point where the network enters its training loop and starts reporting validation errors.

Like last time, training is going to take a ridiculously long time with the 1,000 training epochs specified at the top of the code. So, once you've successfully entered the training loop, hit CTRL-C to stop the program, and reduce the number of epochs to a very small value (like two or three) that will allow the program to get to the testing stage quickly. On my beefy new workstation (32 GB RAM, NVIDIA GeForce GTX 1080 Ti GPU), the program finished up just fine; however, when I tried it on a machine in P413, I got a diarrheic explosion of errors at the end! Scrolling up into the massive error report, I eventually figured out a slight modification to the code that would allow me to get all the way through the testing phase. (Hint: it's at the very end, after the training loop has completed.) Modify the code so that it completes without error, and then modify the final line of code (the test-accuracy report) to reflect what you did.

Having fixed that final problem, restore the number of training epochs to a value that gives good test accuracy. (I found that, thanks to the awesomeness of convolutional nets, even 10 epochs got the final test accuracy to an impressive 99%!) Finally, as in the previous assignment, you should make sure you can get all the way through training and testing, and then make the changes necessary to get rid of the deprecation warnings (same ones as before, of course).

Part 2: Display a confusion matrix

Remember the confusion matrix you generated for Assignment #3? Go back and review / grab that code, because we're going to finish this assignment by doing the same confusion matrix for our convnet_mnist.py script.

Now, TensorFlow does offer a built-in function for producing a confusion matrix. I have, however, been unable to find a single complete example of anyone using it successfully with an actual classifier. If you can get tf.confusion_matrix working with this assignment, that'd be worth some serious extra credit; however, after a few frustrating hours trying to get it working on my own, I gave up and returned to the approach we took in Assignment #3: start with an empty 10x10 matrix, then loop over all predictions and targets, comparing them one-by-one to accumulate values in the matrix.
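
The loop-based approach just described can be sketched in plain Python as follows. This is a minimal sketch, not the Assignment #3 code: the function name confusion_matrix, the convention that rows are targets and columns are predictions, and the sample predictions are all assumptions made for illustration.

```python
def confusion_matrix(targets, predictions, num_classes=10):
    """Count how often each target class was assigned each predicted class.

    Rows index the true class and columns index the predicted class
    (an assumed convention; swap the indices if you prefer the transpose).
    """
    # Start with an empty num_classes x num_classes matrix of zeros.
    matrix = [[0] * num_classes for _ in range(num_classes)]
    # Compare predictions and targets one by one, accumulating counts.
    for target, prediction in zip(targets, predictions):
        matrix[target][prediction] += 1
    return matrix


# Hypothetical data: ten digit labels, with made-up predictions that
# contain one deliberate mistake (a 4 misread as a 9).
targets = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
predictions = [7, 2, 1, 0, 4, 1, 9, 9, 5, 9]

cm = confusion_matrix(targets, predictions)
for row in cm:
    print(" ".join("%3d" % count for count in row))
```

Correct classifications land on the diagonal, so a good classifier produces a matrix with large diagonal entries and small values everywhere else.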

So, obviously, you're going to have to add some code at the end to generate predictions and targets. Here, again, I got very frustrated, because unlike ordinary Python, TensorFlow does not allow you to generate results like this directly. Instead, every time you want a concrete result, you have to invoke sess.run(). At some point I realized the existing evaluate and eval_op functions could be copy-paste-modified to use in a final sess.run() invocation to give me the predictions and targets. (When I finally figured it out, I was embarrassed at how little additional code was actually required.) Again, you'll want to keep that new TensorFlow code separate from the subsequent code that generates the actual confusion matrix, or you'll end up frustrated and confused! As a final note, observe that the targets are not in numerical order; for example, the first ten targets are [7, 2, 1, 0, 4, 1, 4, 9, 5, 9].
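
Once sess.run() has handed back concrete values, turning them into digit labels is ordinary Python. The sketch below assumes that the script loads MNIST with one-hot labels and that the network emits one score per digit; both are assumptions about convnet_mnist.py, so check them against your copy. The names to_labels, scores, and one_hot_targets are hypothetical, and the hand-written lists stand in for real sess.run() output.

```python
def to_labels(rows):
    """Return the index of the largest entry in each row (an argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# Hypothetical stand-ins for what sess.run() might return for three images:
# one row of ten class scores per image, and one one-hot row per target.
scores = [
    [0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0],
    [0.1, 0.0, 0.8, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.7, 0.0, 0.0],
]
one_hot_targets = [
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]

predictions = to_labels(scores)          # predicted digit for each image
targets = to_labels(one_hot_targets)     # true digit for each image
print(predictions, targets)
```

If the values come back as NumPy arrays, numpy.argmax(array, axis=1) does the same job in one call.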

What to turn in to Sakai

As before, zip up everything I'll need (convnet_mnist.py and other material) to test your solution. As always, the right way to make sure your submission is working is to download, unzip, and run it in a fresh location! Although I encourage you to run TensorBoard to visualize the training history and dataflow graph, there's no need to turn in the visualization or a PDF report or anything else. To test your code, I'm just going to open your convnet_mnist.py, hit F5, and wait to see the final accuracy and the confusion matrix.