** Syllabus **

Concepts of probability: discrete and continuous sample spaces, random variables, expectation, variance, conditioning, independence.

Randomized algorithms: median finding, minimum cut, hashing, load balancing, sorting, random walks on graphs.

Stochastic processes: random walks on the line and on a graph; central limit theorem; large deviation bounds; random graph models. Applications in web search, social networks, and the simulation of physical lattices.

Statistical hypothesis testing. Applications in clinical trials and DNA fingerprinting.

Machine learning: simple distribution models (n-grams, Gaussians), clustering, classification (linear separators, decision trees). Applications to statistical language models, question answering systems, speech recognition, gene microarray analysis, spam filtering, and character recognition.

** Course materials **

1. The required text for this course is:

Grinstead and Snell. *Introduction to Probability*.

A PDF of it can be downloaded free from the web (here), or you can obtain a hard copy from the American Mathematical Society bookshop.

2. We will cover some topics that aren't in the Grinstead and Snell book. I will post lecture summaries for this additional material; and recommended texts are on reserve at S&E library:

Larsen and Marx. *Introduction to Mathematical Statistics and its Applications*.

Mitzenmacher and Upfal. *Probability and Computing*.

Duda, Hart, and Stork. *Pattern Classification*.

Freedman, Pisani, and Purves. *Statistics*.

** Discussion section **

Mon 1.00-1.50 in WLH 2113

This discussion will reinforce material from lecture as well as background material from earlier courses (like CSE 21). Attendance is highly encouraged.

** Examinations **

Midterm 1: Oct 27, in class

Midterm 2: Nov 17, in class

Final: Dec 10, 3-6PM, in class

** Grading **

Quiz every Wednesday: 20%

Midterms: 20% each

Final: 40%

The lowest quiz will be dropped