Computer Science 252
Assignment 3: The Hopfield Net
Due Wednesday 13 October
The good news is, despite the apparent complexity of the formulas we discussed,
the actual algorithm for training and testing a Hopfield net is a lot simpler
than for an SOM. Together with the NumPy skills you've accumulated, this
means that the current assignment should be less challenging/frustrating than the previous one.
In fact, if I don't mention a new NumPy function, you can assume that you should use one you
already know, perhaps in a slightly different way.
Goals:
- Understand the Hopfield network by coding it from scratch in Python.
- Understand the limitations of the network through empirical testing.
As usual, you'll have a Python class (Hopfield) with some methods (__init__,
learn, test), followed by a main section that uses it (I like the
if __name__ == '__main__': idiom). So, let's get started. Again as usual, we'll focus on our
test cases before even starting to implement the class.
Part 1: Generate some training data
In your main section, write a line of code that uses NumPy to generate a random array of ones and zeros.
Each row will represent an input vector. To keep things simple, make it five rows and 30 columns,
so that you have a relatively small number of patterns (5) that are not too long to print out.
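A sketch of one way to do this (the variable name data is my own choice, not a requirement):

```python
import numpy as np

# Five random binary patterns, one per row, 30 elements each
data = np.random.randint(0, 2, (5, 30))

print(data.shape)   # (5, 30)
print(data)
```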
Part 2: Display a confusion matrix
In machine learning, a confusion matrix
is a table that shows how well your classification algorithm has worked on every possible input. For example,
if we're trying to classify images of digits, such a matrix
would have ten rows and ten columns, and
would show how often the algorithm classified one digit as another. A perfect classification
would have all positive entries on the diagonal, and all zeros elsewhere.
For this assignment we're not doing classification, but we can still use the confusion matrix as a measure
of success. Specifically, we will use the already-familiar vector cosine
to see how well our Hopfield net recovers each pattern. So you should now write a function show_confusion
that accepts two data arrays like the one you created in Part 1, and shows a matrix of the vector cosines of their
respective rows. (For example, the third row, fourth column will show the
vector cosine of the third vector in the first array with the fourth vector in the second array). To
avoid big ugly floating-point printout, use the formatted print skills you learned in CSCI 111 to
constrain the output to two decimal places. (If you don't remember formatted printing, GIYF!)
How to test your confusion matrix function? Well, if you give it your random data array from Part 1, it should
show something like this (think about why):
Part 2: Vector-cosine confusion matrix of an array with itself ----------------------
0.40 0.38 1.00
0.49 0.63 0.45 1.00
0.67 0.63 0.45 0.59 1.00
Since the vectors are all non-negative, the largest possible cosine value is still 1, but the cosine between
two random vectors is around 0.5 instead of 0.
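Here's a rough sketch of how show_confusion might look. The vector_cosine helper is an assumption on my part (you may already have one from earlier assignments), and the exact formatting is up to you:

```python
import numpy as np

def vector_cosine(a, b):
    # Cosine of the angle between vectors a and b
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def show_confusion(data1, data2):
    # Print the matrix of vector cosines between respective rows of data1
    # and data2, formatted to two decimal places
    for a in data1:
        print(' '.join('%.2f' % vector_cosine(a, b) for b in data2))
```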
Part 3: Noise it up!
To test the ability of our Hopfield net to recover degraded (noisy) patterns, we'll need a function to add
some noise (random bit flips) to our data array. So write a function noisy_copy that accepts
an array like the one from Part 1, as well as a probability between 0 and 1, and flips each bit in
the array with that probability. (E.g., if the probability is 0.5, then there's a 50 percent chance that a bit
will be flipped). If you're good with NumPy, you can do this without an explicit loop; if not, feel free
to write a loop. Either way, you'll want to use numpy.copy to avoid clobbering the values in the
original array, and you'll want to test your noisy_copy function on small arrays at first. (A good
test would be no change with a probability of 0, and fully changed with a probability of 1.)
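Here's one way noisy_copy might be sketched without an explicit loop (the boolean-mask trick is my own suggestion; a plain loop is fine too):

```python
import numpy as np

def noisy_copy(data, prob):
    # Return a copy of data with each bit flipped with the given probability
    result = np.copy(data)                       # don't clobber the original
    flips = np.random.random(data.shape) < prob  # True where a bit should flip
    result[flips] = 1 - result[flips]
    return result
```

Note the two sanity checks suggested above: a probability of 0 should return an identical copy, and a probability of 1 should flip every bit.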
Once you've got your noise function working, test it again by using your confusion-matrix function to
show the confusion matrix for your original data array and various noisy copies of itself. As more noise
is added, the values on the confusion-matrix diagonal should drop from 1 down to 0, with the non-diagonal
values pretty much unchanged. For your final output on this part, use a noise value of 0.25. Here's my output:
Part 3: Confusion matrix with 25 percent noise ------------------------------------
0.45 0.39 0.69
0.59 0.65 0.63 0.79
0.65 0.53 0.42 0.42 0.79
Part 4: Code up your Hopfield net
Now that we've got a nice little test suite, it's time to code up our Hopfield net. At the top of your
script, create a class Hopfield with three methods:
- A constructor that accepts the number of units n (which will also be the size of your input vectors),
and builds an n×n NumPy array T of zeros, which are the initial network weights.
You can use the numpy.zeros function to do this.
- A learn method that accepts an array of input patterns like the array from Part 1. This
method should loop over the rows of the array, modifying the weights T using the training
formula on slide 7 of the lecture notes.
Be a Pythonista: do for a in data: to loop over the rows, rather than using range.
For the products, rather than looping over the elements of each input pattern vector, you should use the
numpy.outer function to compute the matrix of pairwise products, and then add this
matrix to T using the formula on the slide. After you're done looping over
the patterns, zero out the elements on the diagonal of T, to account for the lack of
self-connections between units. (If you're clever, you can use numpy.eye to do this in one line!)
- A test method that accepts a single pattern (vector) and a number of iterations (defaulting
to a small value like five), and iteratively runs line 3 of the third slide of the lecture notes.
We're cheating here by using a for loop, instead of a while loop based on the
energy computation, but that's okay: we care more about restoring the pattern than about the energy.
Once you've coded up your Hopfield net, train it on the 5×30 training data you created above. Then test
it by giving it a pattern from this set, to see how well it “recognizes” the pattern, using the
vector cosine as a success criterion. Then make a noisy copy of your data, and see how well it restores one of
the noisy patterns (compare the recovered noisy pattern with the original clean one). Your output should look something like this:
Part 4: Recovering small patterns with a Hopfield net -----------------------------
Recover pattern, no noise:
Input: [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Output: [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Vector cosine = 1.00
Recover pattern, 25% noise:
Input: [1 0 1 0 0 1 1 1 0 1 0 0 0 0 1 0 1 0 0 1 1 1 0 1 0 1 1 1 0 1]
Output: [1 1 1 0 0 1 0 1 1 0 1 0 1 0 1 0 1 1 1 0 0 1 0 1 0 1 0 1 0 1]
Original: [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Vector cosine = 0.71
Note the mediocre results I got on the noisy pattern. Sometimes it was recovered perfectly (cosine = 1.00), but
often it was barely better than chance (cosine = 0.5). Sometimes it even failed to recover the non-noisy pattern!
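For reference, here's a minimal sketch of what the class might look like. It assumes the usual Hebbian-style rule with 0/1 patterns mapped to ±1 and a simple threshold update; check it against the actual formulas on the slides before relying on it:

```python
import numpy as np

class Hopfield:

    def __init__(self, n):
        # n units, with an n x n weight matrix T, initially all zeros
        self.T = np.zeros((n, n))

    def learn(self, data):
        # Hebbian-style training: sum of outer products of bipolarized patterns
        # (assumes the slide formula maps 0/1 inputs to -1/+1; adjust as needed)
        for a in data:
            b = 2 * a - 1
            self.T += np.outer(b, b)
        self.T *= 1 - np.eye(len(self.T))  # no self-connections: zero the diagonal

    def test(self, a, niter=5):
        # Iteratively update the whole pattern, thresholding back to 0/1
        for _ in range(niter):
            a = (self.T @ (2 * a - 1) > 0).astype(int)
        return a
```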
Part 5: Improving the capacity
Personally, I was pretty disappointed with the results from Part 4. Even with a measly five patterns, the
Hopfield net didn't seem to live up to its reputation as a “cleanup memory” for noisy data.
As you may have suspected, a vector length of 30 is nice for debugging, but it's way too small to work as
an input to a Hopfield net. If you think about it, the reason is pretty clear: as the length of the vector
grows as O(N), the number of weights grows as O(N²).
So for a 30-input network, there are 900 weights, and the ratio of weights to inputs is 30:1. If we increase
the pattern size to 1000, however, there are a million weights, so the ratio of weights to inputs goes way up,
becoming 1000:1. Hence the bigger network is bringing a lot more resources to bear on representing the data,
and should be able to store more patterns and recover them more robustly.
To see this, repeat the steps above, but with 10 patterns of length 1000. As the confusion matrix will show, the
vector cosines are about the same, but the network is much better at restoring noisy patterns.
Indeed, as the following output shows, I was always able to recover all ten patterns perfectly with 25% noise:
Part 5: Recovering big patterns ----------------------------------------------------
Confusion matrix for 1000-element vectors with 25 percent noise:
0.49 0.48 0.75
0.51 0.54 0.53 0.76
0.52 0.52 0.51 0.51 0.75
0.50 0.52 0.50 0.53 0.49 0.73
0.54 0.55 0.51 0.52 0.55 0.54 0.76
0.52 0.48 0.52 0.51 0.49 0.49 0.49 0.75
0.54 0.50 0.50 0.49 0.53 0.53 0.51 0.51 0.75
0.50 0.49 0.53 0.50 0.51 0.51 0.54 0.52 0.53 0.75
Recovering patterns with 25 percent noise:
Vector cosine on pattern 0 = 1.00
Vector cosine on pattern 1 = 1.00
Vector cosine on pattern 2 = 1.00
Vector cosine on pattern 3 = 1.00
Vector cosine on pattern 4 = 1.00
Vector cosine on pattern 5 = 1.00
Vector cosine on pattern 6 = 1.00
Vector cosine on pattern 7 = 1.00
Vector cosine on pattern 8 = 1.00
Vector cosine on pattern 9 = 1.00
If you find Hopfield nets exciting, I feel sorry for you – excuse me, I mean, if you find Hopfield nets
exciting, maybe you would like to try applying one to recovering images in the presence of noise. For example,
a 32×32-pixel image can be “flattened” into a vector of length 1024 (around the same size as
our successful network in Part 5). You can then train a Hopfield net on several such images, test on a noisy version
of an image, reshape the test result to 32×32, and display the original, noisy, and restored versions
for comparison. This can be done pretty easily with ASCII images (using a space for 0 and asterisk for 1),
or with matplotlib if you're up for it.
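For the ASCII route, the flatten/reshape round trip might look like this (show_image is a hypothetical helper name, not something you're required to write):

```python
import numpy as np

def show_image(vector, size=32):
    # Display a flattened size x size binary image as ASCII art
    for row in vector.reshape(size, size):
        print(''.join('*' if pixel else ' ' for pixel in row))

# A flattened 32x32 image is just a length-1024 vector, so it can go
# straight into a 1024-unit Hopfield net and be reshaped for display
image = np.random.randint(0, 2, (32, 32))
flat = image.flatten()   # shape (1024,)
```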
What to turn in to Sakai
As usual, the only file you need to turn in is your final script (hopfield.py). As usual, I will test it by running:
% python3 hopfield.py
Your output should go all the way from Part 2 through the end (Part 5, or extra credit).
Use my output as a formatting guide.