# Computer Science 121 Final Project, Part II

## Learning from Examples

In this part of the lab, we will try three different learning algorithms on Murphy's sprinkler network, and then we will modify our learning code to work with the alarm example.

To begin, read the sections on "Maximum likelihood parameter estimation from complete data" and "Maximum likelihood parameter estimation with missing values" and, as in the warmup exercise, try to get the sample code to run in Matlab.

The file learn_sprinkler.m contains code from the BNT tutorial implementing these two types of learning, as well as the more complicated K2 algorithm for structure learning (used when you need to learn not only the conditional probability tables but also the internal structure of the network). Experiment with different sample sizes to see how many samples it takes before all three learning algorithms work reasonably well, where "reasonably well" means that the structure is learned (Learning Structure reports a value of 1 at the last iteration) and at least some of the learned probabilities are close to the originals. (As usual, an efficient way to do this is to start with a small number, like 5, and then double it until you get good results.) Put the answer in a writeup file (PDF) to be turned in with the lab. Your answer should say how many iterations it took to learn the structure, and should provide a comparison of a few sample learned-versus-original probabilities.
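
The doubling strategy and the idea behind maximum likelihood estimation from complete data can be sketched without BNT. The fragment below is only an illustration, not the code in learn_sprinkler.m: it assumes a two-node network (Cloudy -> Rain) with made-up probabilities, estimates P(Rain | Cloudy) by counting, and doubles the sample size each round. The variable names and the stopping size of 1280 are assumptions chosen for the example.

```matlab
% Illustrative sketch: count-based maximum likelihood estimation on a
% two-node fragment (Cloudy -> Rain). All probabilities are assumed values.
pC = 0.5;                  % P(Cloudy = true)
pR_given_C = [0.2 0.8];    % P(Rain = true | Cloudy = false), P(Rain = true | Cloudy = true)

n = 5;                     % starting sample size
while n <= 1280
    % Draw n complete samples from the true model.
    cloudy = rand(1, n) < pC;
    rain   = rand(1, n) < pR_given_C(cloudy + 1);

    % The ML estimate is a ratio of counts, one per parent value.
    est = zeros(1, 2);
    for c = 0:1
        idx = (cloudy == c);
        % max(..., 1) avoids 0/0 when a parent value never occurs
        % (the estimate is then reported as 0).
        est(c + 1) = sum(rain(idx)) / max(sum(idx), 1);
    end

    fprintf('n = %4d  estimate = [%.3f %.3f]  max error = %.3f\n', ...
            n, est, max(abs(est - pR_given_C)));
    n = 2 * n;             % double the sample size and try again
end
```

Because the samples are random, the error will not shrink monotonically, but it should generally get smaller as n grows. The same loop structure can wrap the sampling and learning calls in learn_sprinkler.m.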

Next, combine the relevant code from your alarm.m solution with the code from learn_sprinkler.m to make a program, learn_alarm.m, that learns the alarm network. Now how many samples does it take to learn the network? If the answer is "too many" (the program takes too long), then what specifically is different about the alarm example that makes it so difficult to learn? Can you modify the alarm example slightly to make it more learnable? Put the answers to these questions in your writeup.

So, at the end of this lab, you should turn in learn_alarm.m and your PDF writeup.

## Extra Credit: Dynamic Bayes Nets

### Backstory

Practical joker Melvin Soy has stolen one of VMI's Howitzer guns and is driving recklessly around southwestern Virginia with it in tow, threatening to fire off the Howitzer and wake people up at night. In order to avoid detection, Melvin steers a quasi-random course among the cities of Lexington, Staunton, and Covington. His goal is to fire the Howitzer before he can be stopped. He can only do this from Staunton, where no one recognizes him. In order to avoid detection, he never stays in the same town for more than one time step. If he is in Lexington or Covington, he goes to the nearer of the other two cities 80% of the time. Since Staunton is about the same distance from the other two cities, he goes from there to either of them with equal probability.

Firing a Howitzer requires moving it from its Down (resting) state to its Pre-firing state, and then to its Firing state [1]. Each of these moves takes a single time step. After firing, the Howitzer always moves to the Down state. If Melvin is in Staunton and his Howitzer was in the Pre-firing state on the previous step, he will always fire the Howitzer. Otherwise, the Howitzer will switch between its Down and Pre-firing states with 50/50 probability.

The main goal of this part of the assignment is for you to understand how to translate a verbal description of a process like this one into a DBN DAG and conditional probability tables (CPTs). So at a minimum you should turn in a picture of the DAG (generated as you did in last week's lab), as well as the CPTs. You will need four variables: Location (town) and State at the present time step and at the next time step. Your DAG should reflect the fact that the current location influences the next location, the current Howitzer state influences the next state, and the next location influences the next state. There are no other dependencies in the model.
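
The three dependencies listed above can be written down as an adjacency matrix, which is also a convenient way to check how large each CPT has to be. This is only a sketch: the node numbering and the variable names (L1, S1, L2, S2) are assumptions made for the example, and filling in the CPT values is left to you.

```matlab
% Illustrative sketch: the four-variable DAG as an adjacency matrix.
% Node numbering and names are assumptions for this example.
L1 = 1; S1 = 2;            % Location and State at the present time step
L2 = 3; S2 = 4;            % Location and State at the next time step
N = 4;

dag = zeros(N, N);         % dag(i, j) = 1 means an arc from node i to node j
dag(L1, L2) = 1;           % current location influences next location
dag(S1, S2) = 1;           % current state influences next state
dag(L2, S2) = 1;           % next location influences next state

node_sizes = [3 3 3 3];    % three towns, three Howitzer states

% Each CPT has (node size) x (product of parent sizes) entries.
cpt_entries = zeros(1, N);
for i = 1:N
    parents = find(dag(:, i))';
    cpt_entries(i) = node_sizes(i) * prod(node_sizes(parents));
    fprintf('node %d has %d parent(s) and %d CPT entries\n', ...
            i, numel(parents), cpt_entries(i));
end
```

With three values per variable, the next-state node has two parents, so its CPT has 27 entries; every group of three entries that shares the same parent values must sum to 1.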

### Video Game

The code in this zipfile contains a little video game that will allow you to test your DBN "in the field". In this game, you control a Predator Unmanned Aerial Vehicle (UAV) through human-issued commands or through a DBN-driven software agent. The goal is to prevent the firing of the Howitzer by using the Predator to destroy the Howitzer before it has a chance to fire. If you destroy the Howitzer when it is in its firing state, you get 10,000 points. If you destroy it when it is in its pre-firing state, you get only 5,000 points. If you destroy it when it is in its down state, you get 1,000 points. If the game ends before the Howitzer is fired, you get zero points.

To try out the game, type play(@human) in Matlab to run the game using human guidance. Then take a look at the files in the @agent directory. These files contain a partially implemented agent that uses a DBN to predict the Howitzer's next location and state and makes a decision based on the prediction. In the file agent.m, you should add some code based on the DAG and CPTs that you came up with in the previous section. In the file getcmd.m, you should add code to run inference on your DBN. The XXX comments show you where to add your code. You should not modify any of the existing code. Happy hunting!

[1] Actually, I just made this part up. But the exercise is based on a real project I worked on, where we were trying to use Dynamic Bayes Nets to track SCUD missile launchers.