CSE 291 is open to M.S. and Ph.D. students in computer science, bioinformatics, cognitive science, and related fields. The course is complementary to other UCSD courses such as Cognitive Science 260, Math 283 (Statistical Methods in Bioinformatics), and ECE 285 (also entitled Statistical Learning). Students are welcome to take any or all of these courses. Unlike CSE 254, which will be offered in Spring 2005, CSE 291 is a lecture course.
The prerequisite for CSE 291 is an upper-division undergraduate course on probability and statistics, such as Math 183 or 186 at UCSD, or any graduate course on statistics, pattern recognition, or machine learning. Students should take CSE 291 for four units, for a letter grade. Use section ID 518456 to register. (Note that you can register even if Studentlink indicates that the section is full.)
| date | topics | LaTeX notes |
| --- | --- | --- |
| January 4 | Reasoning (probability theory) vs. learning (statistics), estimator vs. estimate, point estimation | here |
| January 6 | Unbiasedness, mean squared error (MSE), minimum variance unbiased estimator (MVUE), suggested books | here |
| January 11 | Intuitive concept of sufficiency, definition of (minimal) sufficient partition, of (minimal) sufficient statistic, Bernoulli example | here |
| January 13 | Rao-Blackwell theorem intuition, nested expectations lemma, Jensen's inequality, start of Rao-Blackwell proof | here |
| January 18 | Proof of three parts of Rao-Blackwell theorem, uniqueness of MVUEs, algorithm to obtain MVUEs | here |
| January 20 | Definition of completeness, binomial example, Lehmann-Scheffe theorem, factorization theorem | here |
| January 25 | Comments on answers to the first assignment: how to make reports and experiments compelling | here |
| January 27 | Statement of the exponential family completeness theorem. Principle of maximum likelihood (ML), the score function | here |
| February 1 | Expectation of the score function, Cramer-Rao lower bound (CRLB) and when it is achieved, example | here |
| February 3 | Example of achieving CRLB, informal hypothesis-testing, large-sample ML, consistency and efficiency | here |
| February 8 | Weak law of large numbers, central limit theorem, Taylor expansion of score function | here |
| February 10 | Convergence in probability, convergence in distribution, proof of ML asymptotic efficiency. Logic of hypothesis testing | here |
| February 15 | Power function, size and significance level, likelihood ratio tests (LRTs), t-test example | here |
| February 17 | Chi-squared asymptotic distribution of LRT statistics, Pearson's chi-squared goodness-of-fit test | |
| February 22 | Feedback on Assignment 2, Pearson's statistic as an approximation of the LRT statistic, chi-squared tests for contingency tables | |
| February 24 | Linear regression: least squares, matrix solution, variance of parameter estimates, F test | |
| March 1 | Meaning of F statistics, stepwise selection. Multiple comparisons, Sidak, Bonferroni, Westfall-Young | |
| March 3 | MSE = bias² + variance, shrinkage and regularization ideas, ridge regression | |
Each assignment will involve mathematical reasoning and also programming in Matlab.
Students are encouraged to form study groups, to collaborate on solving the problems posed, and to use multiple books and outside resources. However, each student must write up his or her solutions independently. Your solutions should be written in good, concise English with all necessary diagrams, plots, and explanations. You must use LaTeX or similar high-quality software for text processing. On the due date, you should submit a stapled 8.5x11-inch printout in class. Your submission must not be in any sort of binder.
The first assignment was due in class on Tuesday January 18. Although this assignment is not easy, it uses only the basic knowledge of probability and statistics that is a prerequisite for this course. The second assignment was due in class on Tuesday February 1. The third assignment was due in class on Tuesday February 15.
The fourth assignment is due in class on Tuesday March 1. You will need this hurricane data. Please ask questions using http://www.quicktopic.com/29/H/t3sgTnDZkMUqp.
The fifth assignment is due at the time of the final exam, which has been scheduled by the registrar for Thursday March 17, from 3pm to 6pm. Please ask questions using http://www.quicktopic.com/29/H/NcdrNkr7SUA.
Most recently updated on March 8, 2005 by Charles Elkan, elkan@cs.ucsd.edu