CSE 259: AI Seminar
Autumn 2019, Monday 12:00-13:00, CSE (EBU-3b) 4140
CSE 259 is a weekly seminar course that covers current topics in Machine Learning and Artificial Intelligence.
|Week 1 (September 30)|
Audra McMillan (Boston University / Northeastern University)
The Structure of Optimal Private Tests for Simple Hypotheses
Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. In this talk we'll address a basic question about privately testing simple hypotheses: given two distributions P and Q, and a privacy level ε, how many i.i.d. samples are needed to distinguish P from Q subject to ε-differential privacy, and what sort of tests have optimal sample complexity? Specifically, we'll characterize this sample complexity up to constant factors in terms of the structure of P and Q and the privacy level ε, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. This result is an analogue of the classical Neyman–Pearson lemma in the setting of private hypothesis testing. The characterization applies more generally to hypothesis tests satisfying essentially any notion of algorithmic stability, which is known to imply strong generalization bounds in adaptive data analysis, and thus our results have applications even when privacy is not a primary concern.
Full paper can be found at: https://arxiv.org/abs/1811.11148. Joint work with Clément Canonne, Gautam Kamath, Adam Smith and Jonathan Ullman.
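The randomized, clamped log-likelihood ratio test described above can be sketched in a few lines. This is an illustrative toy only, not the paper's exact mechanism: the clamping range, Laplace noise scale, and zero threshold below are simplifying assumptions chosen for demonstration.

```python
import math
import random

def clamped_llr_test(samples, p, q, eps, threshold=0.0):
    """Toy private test deciding whether samples look more like p than q.

    Illustrative sketch: clamping each per-sample log-likelihood ratio to
    [-eps, eps] bounds any single sample's influence on the statistic, and
    the Laplace noise randomizes the decision. The paper's actual test uses
    a more carefully tuned clamping range and randomization.
    """
    total = 0.0
    for x in samples:
        llr = math.log(p[x] / q[x])        # per-sample log-likelihood ratio
        total += max(-eps, min(eps, llr))  # clamp to bound sensitivity
    # Laplace noise with scale 2*eps (the clamped statistic's per-sample
    # sensitivity), sampled via the inverse-CDF method.
    u = random.random() - 0.5
    noise = -2 * eps * math.copysign(math.log(1 - 2 * abs(u)), u)
    return total + noise > threshold       # True -> decide samples ~ p
```

With many samples drawn from p, the clamped statistic concentrates well above the noise, so the test accepts p with high probability; the clamping is what makes a single sample's contribution bounded, which is the key to the ε-differential-privacy guarantee.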
|Week 2 (October 7)|
Jingbo Shang (UCSD)
Named Entity Recognition from a Data-Driven Perspective
Named entity recognition (NER) is one of the core tasks in natural language processing (NLP), and has numerous applications in various domains. Recent advances in neural NER models (e.g., LSTM-CRF) have freed human effort from handcrafting features. In this talk, we will briefly revisit these models and discuss how we can improve upon them from a data-driven perspective. The key philosophy of "data-driven" here is to enhance NER performance without introducing any additional human annotations. We will attack this problem from different angles, including pre-training & co-training language models, introducing dictionaries for distant supervision, detecting and re-weighting noisy training data, and removing the dependency on tokenizers (especially for social media posts).
|Week 3 (October 14)|
PhD students will discuss their summer internships:
- Bodhisattwa Prasad Majumder
- Mehrab Tanjim
- Wang-Cheng Kang
- Shuyang Li
Paarth Neekhara will also present his EMNLP paper: Adversarial Reprogramming of Text Classification Neural Networks
|Week 4 (October 21)|
EMNLP Presentations (Part 2)
PhD students will present their EMNLP (Empirical Methods in Natural Language Processing) papers:
- Harsh Jhamtani: Learning Rhyming Constraints using Structured Adversaries
- Bodhisattwa Prasad Majumder: Generating Personalized Recipes from Historical User Preferences
- Huanru Henry Mao: Improving Neural Story Generation by Targeted Common Sense Grounding
|Week 5 (October 28)|
EMNLP Presentations (Part 3)
PhD students will present their EMNLP papers:
- Akshay Srivatsan: A Deep Factorization of Style and Structure in Fonts
- Jianmo Ni: Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects
- Liliang Ren: Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation
|Week 6 (November 4)|
Michal Moshkovitz (UCSD)
Unexpected Effects of Online K-means Clustering
In this talk we focus on k-means clustering in the online setting. In the offline setting the main parameters are the number of centers, k, and the size of the dataset, n, and performance guarantees are given as a function of these parameters. In the online setting new factors come into play: the ordering of the dataset and whether n is known in advance. One of the main results of this paper is the discovery that these new factors have dramatic effects on the quality of the clustering algorithms. For example, for constant k>1: (1) Ω(n) centers are needed if the order is arbitrary, (2) if the order is random and n is unknown in advance, the number of centers reduces to Θ(log n), and (3) if n is known, then the number of centers reduces to a constant. For different values of the new factors, we show upper and lower bounds that match up to a constant, thus achieving optimal bounds.
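As a toy illustration of why arrival order matters (a minimal greedy sketch, not the algorithm from the talk), consider opening a new center whenever a point arrives farther than a fixed radius r from every existing center:

```python
def greedy_online_centers(stream, r):
    """Open a new center whenever a point lies farther than r from all
    existing centers. Illustrative only: it shows how the number of
    centers opened depends on the arrival order of the stream."""
    centers = []
    for x in stream:
        if all(abs(x - c) > r for c in centers):
            centers.append(x)
    return centers
```

On an adversarially spread-out stream every point opens its own center, while a stream confined to a ball of radius r needs only one; the results above quantify this gap precisely under arbitrary vs. random order and known vs. unknown n.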
|Week 7 (November 11)|
No seminar (Veterans Day)
|Week 8 (November 18)|
Cyrus Rashtchian (UCSD)
Robustness for Non-Parametric Classifiers
Adversarially robust machine learning has received much recent attention. However, prior attacks and defenses for non-parametric methods have been developed on an ad-hoc, classifier-specific basis. In this work, we take a holistic look at adversarial examples for non-parametric methods, including nearest neighbors, decision trees, and random forests.
We provide a general defense method, adversarial pruning, that works by preprocessing the dataset to become well-separated. To test our defense, we provide a novel attack that applies to a wide range of non-parametric classifiers. Theoretically, we derive an optimally robust classifier, which is analogous to the Bayes Optimal classifier. Then, we show that adversarial pruning can be viewed as a finite sample approximation to this optimal classifier. Finally, we empirically show that our defense and attack are either better than or competitive with prior work on non-parametric methods. Overall, our results provide a strong and broadly-applicable baseline for future work on robust non-parametric classifiers.
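A minimal sketch of the well-separation idea behind adversarial pruning (a greedy toy, not the paper's algorithm, which removes a minimum-size set of points): repeatedly drop one point from any differently-labeled pair lying within distance 2r, so that no perturbation of size r can move a test point across the remaining decision boundary of, say, a 1-nearest-neighbor classifier.

```python
import math

def adversarial_prune(points, labels, r):
    """Greedy sketch of adversarial pruning (illustrative only): remove
    points until every pair with different labels is at least 2*r apart.
    The paper's defense instead computes a minimum-size removal set."""
    keep = list(range(len(points)))
    while True:
        conflict = None
        for i in keep:
            for j in keep:
                if labels[i] != labels[j] and math.dist(points[i], points[j]) < 2 * r:
                    conflict = i
                    break
            if conflict is not None:
                break
        if conflict is None:
            return keep
        keep.remove(conflict)  # greedily drop one endpoint of the conflict
```

After pruning, differently-labeled training points are 2r-separated, so every surviving point retains its label under any perturbation of norm at most r; this is the sense in which the preprocessed dataset is "well-separated."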
Joint work with Yao-Yuan Yang, Yizhen Wang, and Kamalika Chaudhuri.
|Week 9 (November 25)|
No seminar (Jen Golbeck DLS visit)
There will be no seminar on November 25, as I am hosting the Distinguished Lecture Series (DLS) speaker, Jen Golbeck. Please attend her talk at 11:00am, in Room 1202.
|Week 10 (December 2)|
Amazon Alexa Prize Team
PhD students will discuss their progress on the Amazon Alexa Prize. Presenters include:
- Bodhisattwa Prasad Majumder
- Sophia Sun
- Jianmo Ni
- Shuyang Li
- Henry Mao