CSE 190: Statistical Natural Language Processing
Term: Winter Qtr 2018
Course Description
Natural language processing (NLP) is a field of AI that aims to equip computers with the ability to intelligently process natural (human) language. This course will explore statistical techniques for the automatic analysis of natural language data. Specific topics covered include: probabilistic language models, which define probability distributions over text sequences; text classification; sequence models; parsing sentences into syntactic representations; machine translation; and machine reading.
Grading
The course is lab-based. You will complete five hands-on programming assignments individually, not in teams. Class participation contributes 10% of the final grade; the remaining 90% is based on the assignments, which all contribute equally. Submission instructions are provided in each assignment description.
Late Submission Policy
Each student is granted 5 late days to use over the duration of the quarter. There are no restrictions on how the late days can be used, except that we cannot accept late submissions for the last assignment. Using late days will not affect your grade. However, assignments submitted late after all late days have been used will receive no credit. Make sure to plan ahead.
Books
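Recommended texts are:
[J&M] Jurafsky & Martin, Speech and Language Processing, 3rd edition (free chapters online)
[M&S] Manning & Schütze, Foundations of Statistical Natural Language Processing (free online)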
Syllabus (tentative)
| Date | Topic/Readings | Assignment (Out) |
|---|---|---|
| Jan 9 | Introduction | |
| | J&M Chapter 1 Introduction | |
| | Hirschberg & Manning, Science 2015 Advances in NLP | |
| | Language Modelling | |
| Jan 11 | Michael Collins. Notes on Language Modelling | P1: Language Modeling (Due Jan 26) |
| | J&M Chapter 4 N-grams | |
| Jan 16 | Michael Collins. Notes on Log-linear models | |
| | Goldberg, JAIR 2016 A Primer on Neural Network Models for NLP. (Sections 1-4 & 10-13) | |
| | Text Classification | |
| Jan 18 | J&M Chapter 6 Naive Bayes and Sentiment Classification | |
| | J&M Chapter 7 Logistic Regression | |
| | Michael Collins. Notes on Naive Bayes, MLE, and EM | |
| | Distributional Semantics | |
| Jan 23 & 25 | Goldberg, JAIR 2016 A Primer on Neural Network Models for NLP. Sections 1-5 | P2: Text Classification (Due Feb 9) |
| | Chris McCormick, 2016 Word2Vec Tutorial - The Skip-Gram Model | |
| | Mikolov et al., NIPS 2013 Distributed Representations of Words and Phrases and their Compositionality | |
| | Mikolov et al., 2013 Efficient Estimation of Word Representations in Vector Space | |
| | Tagging Problems & Hidden Markov Models | |
| Jan 30 | J&M Chapter 9 Hidden Markov Models | |
| | Michael Collins. Notes on Tagging with Hidden Markov Models | |
| | J&M Chapter 10 Part-of-Speech Tagging | |
| | Parsing and Context Free Grammars | |
| Feb 1 & 6 | Michael Collins. Notes on Probabilistic Context-Free Grammars | |
| | (Optional) J&M Chapter 12 Syntactic Parsing | |
| | (Optional) J&M Chapter 13 Statistical Parsing | |
| Feb 8 | Michael Collins. Notes on Lexicalized Probabilistic Context-Free Grammars | P3: Sequence Tagging (Due Feb 23) |
| | Machine Translation | |
| Feb 13 | --- | |
| Feb 15 | Michael Collins. Notes on Statistical Machine Translation | |
| Feb 20 & 22 | Michael Collins. Notes on Phrase-Based Translation Models | P4: Syntax Parsing (Due Mar 6) |
| Feb 27 | Graham Neubig. Tutorial on Neural Machine Translation | |
| | Machine Reading | |
| Mar 1 | Carlson et al., AAAI 2010. Toward an Architecture for Never-Ending Language Learning | P5: Machine Translation (Due Mar 16, no late days) |
| | Mitchell et al., AAAI 2015. Never Ending Learning | |
| | Sukhbaatar et al., NIPS 2015 End-To-End Memory Networks | |
| Mar 6 | J&M Chapter 21 Information Extraction | |
| Mar 8 | Sentence Representation (Kiros et al., NIPS 2015 Skip-Thought Vectors) | |
| | Coreference Resolution | |
| | Dialogue Systems and Chatbots | |
| Mar 13 | J&M Chapter 29 Dialogue Systems and Chatbots | |
| Mar 15 | --- | |