Computer Science 252
Assignment 1: NumPy, Matplotlib, and the Dot Product
Due Friday 16 September
I've broken down the assignment into parts to make it easier, but you'll submit everything in a single Python script
when you're done. In my own solution I put a print() statement indicating each part, and I encourage you
to do the same.
- Gain experience using NumPy and matplotlib, the Python packages that have largely
replaced Matlab in the neural network community.
- Use NumPy to implement the dot product operation that is at the heart of most neural-net algorithms.
- Understand why we use NumPy to do this, instead of using the explicit for loops you learned in CSCI 111.
If you're new to NumPy and matplotlib, remember: GIYF.
There's likely a good example on stackoverflow.com of how to do what you want to do, such as overlaying
multiple plots. I will typically put a comment in my code with a link to the page that helped me in cases
like this, and I encourage you to do the same. If you have any concerns about Honor System issues relating
to this kind of coding, feel free to ask me.
Part 1: Roll your own
To get started, create a Python script called dotprod.py. This script should perform the following
actions. Each action should be preceded by a print statement to let the user know what's going on.
- Create two lists of 1000000 (one million) elements, made by calling random.random()
in a loop and appending the result to the list. Since you're doing this more than once, you should
write a function for it.
- Write a function to compute the dot product of the two lists. Print out the result.
- Does the result make sense? Add a print statement saying what you expected the result
to be, and why.
- Use time.time() to compute the time taken by your dot-product function. (Call it once
immediately before, storing the current time in a variable. Then call it again immediately after,
subtracting the stored time value from the new current time).
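The steps above can be sketched roughly as follows. This is only one possible arrangement, not a reference solution: the function names make_random_list, dot_product, and timed_call are made up for illustration, and the demo uses a smaller list than the one million elements the assignment asks for so that it runs quickly.

```python
import random
import time


def make_random_list(n):
    """Return a list of n floats from random.random(), i.e. uniform on [0, 1)."""
    values = []
    for _ in range(n):
        values.append(random.random())
    return values


def dot_product(a, b):
    """Dot product of two equal-length lists, using an explicit for loop."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def timed_call(func, *args):
    """Hypothetical helper: return (result, elapsed seconds) for func(*args)."""
    start = time.time()              # current time immediately before the call
    result = func(*args)
    elapsed = time.time() - start    # new current time minus the stored time
    return result, elapsed


# Assumption: 100,000 elements for a quick demo; the assignment uses 1,000,000.
n = 100_000
print("Part 1: roll-your-own dot product")
a = make_random_list(n)
b = make_random_list(n)
result, elapsed = timed_call(dot_product, a, b)
print("Dot product = %f (took %f sec)" % (result, elapsed))
```

Wrapping the timing in a helper is a design choice, not a requirement; calling time.time() directly before and after the dot product, as described above, works just as well.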
Part 2: My name is NumPy, pronounced with an umPy ...
Now we're going to repeat what we did in Part 1, but this time with NumPy. So in your dotprod.py script
add some code to do the following:
- Convert each of your two lists into a NumPy array, using numpy.asarray()
- Use numpy.dot() to compute the dot product of these two arrays. Report the
result (which should be the same as in Part 1, to several decimal places), and the time taken as you did above
(which should be much shorter). The convention in the NumPy
community is to abbreviate the package name by doing import numpy as np at the top of your script,
then calling np.dot() later.
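As a rough illustration of the NumPy version, here is a minimal sketch. The variable names and the reduced list size are assumptions made for the example, and the lists are built with a list comprehension just to keep it short (Part 1 asks you to use a loop and append).

```python
import random
import time

import numpy as np

# Assumption: 100,000 elements for a quick demo; the assignment uses 1,000,000.
n = 100_000
list_a = [random.random() for _ in range(n)]
list_b = [random.random() for _ in range(n)]

# Convert each Python list into a NumPy array.
arr_a = np.asarray(list_a)
arr_b = np.asarray(list_b)

# Time np.dot() the same way as the hand-written version.
start = time.time()
np_result = np.dot(arr_a, arr_b)
np_elapsed = time.time() - start

# Plain-Python dot product on the same data, for comparison.
start = time.time()
loop_result = 0.0
for x, y in zip(list_a, list_b):
    loop_result += x * y
loop_elapsed = time.time() - start

print("Part 2: NumPy dot product")
print("np.dot result = %f (took %f sec)" % (np_result, np_elapsed))
print("loop result   = %f (took %f sec)" % (loop_result, loop_elapsed))
```

The two results may differ in the last few digits, because floating-point addition is sensitive to the order in which terms are summed; that is why the comparison is "to several decimal places" rather than exact.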
Part 3: Speed trials and matplotlib
This part will be the most complicated, but will make use of things you already did in the first two parts.
You're going to compare the time taken by the two approaches (roll-your-own loop against NumPy) for different
sizes of very large vectors.
- Start with three empty numpy arrays. One array will hold time values for your roll-your-own dot product,
another will hold time values for np.dot(), and a third will hold the size of the dot product
being computed (see next step).
- Create a for loop to step from some large number of values to some even larger number, in large
increments. For my solution I started at 1000000 (one million) values and stepped through 10000000 (ten
million), inclusive, in increments of 1000000 (one million).
- For each such number, repeat what you did in Parts 1 and 2 above; i.e., time your own version of dot product and then
np.dot(). But now, append each timing result to the appropriate array (initially empty). Google
"numpy append" for the syntax, which is different from that of standard Python lists.
- Once you've got your two timing arrays, use matplotlib.pyplot.plot() to create a plot of each,
followed by matplotlib.pyplot.show() to display your plot. The convention in the matplotlib
community is to abbreviate this by doing import matplotlib.pyplot as plt at the top of your script,
then calling plt.plot(), plt.show(), etc.
- Annotate your plot with axis labels and a legend, using plt.xlabel(), plt.ylabel(),
and plt.legend(). You should get something like the figure below.
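The speed trial described above might be organized along these lines. Treat it as a sketch under stated assumptions rather than the expected solution: the function names loop_dot, collect_timings, and plot_timings are hypothetical, as are the axis-label and legend strings.

```python
import random
import time

import matplotlib.pyplot as plt
import numpy as np


def loop_dot(a, b):
    """Roll-your-own dot product with an explicit for loop."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def collect_timings(sizes):
    """Time loop_dot and np.dot for each size; return three NumPy arrays."""
    size_values = np.array([])   # the three arrays start out empty
    loop_times = np.array([])
    numpy_times = np.array([])

    for n in sizes:
        a = [random.random() for _ in range(n)]
        b = [random.random() for _ in range(n)]

        start = time.time()
        loop_dot(a, b)
        loop_elapsed = time.time() - start

        arr_a, arr_b = np.asarray(a), np.asarray(b)
        start = time.time()
        np.dot(arr_a, arr_b)
        numpy_elapsed = time.time() - start

        # Unlike list.append(), np.append() returns a NEW array instead of
        # modifying its argument, so the result must be assigned back.
        size_values = np.append(size_values, n)
        loop_times = np.append(loop_times, loop_elapsed)
        numpy_times = np.append(numpy_times, numpy_elapsed)

    return size_values, loop_times, numpy_times


def plot_timings(size_values, loop_times, numpy_times):
    """Overlay both timing curves on one plot, with labels and a legend."""
    plt.plot(size_values, loop_times, label="for loop")
    plt.plot(size_values, numpy_times, label="np.dot()")
    plt.xlabel("Number of elements")
    plt.ylabel("Time (sec)")
    plt.legend()
    plt.show()
```

To cover one million through ten million elements inclusive, in steps of one million, the call would be plot_timings(*collect_timings(range(1000000, 10000001, 1000000))). Note the stop value of 10000001: Python's range() excludes its stop value, so stopping at 10000000 would leave out the last size.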
What to turn in to Sakai
For this assignment, submit just your dotprod.py file. I am going to test this program as follows,
from the command-line in a terminal window (I'm using % to
indicate the command-line prompt):
% python3 dotprod.py
If your program contains a syntax error or runtime error, you will get a zero on this assignment.
So, if you're smart, you'll set aside an extra few minutes to download your Sakai submission and test it as above.