Hello, Friend.

Name

Matus Telgarsky

Email

mtelgars at cs dot ucsd dot edu

Locations

2014 - (?). Postdoc in EECS, University of Michigan.
Host: Jake Abernethy.

2013 - 2014. Consultant at MSR NYC. Host: John Langford.

2013 - 2014. Postdoc in Statistics, Rutgers University. Host: Tong Zhang.

2007 - 2013. PhD in Computer Science, UCSD. Advisor: Sanjoy Dasgupta.

2004 - 2007. BS in Computer Science & Discrete Math, CMU.

2001 - 2003. Diploma in Violin Performance, Juilliard.

Postprints

Convex Risk Minimization and Conditional Probability Estimation.
(With Miroslav Dudík and Robert Schapire.)
[arXiv]
[short video]

- COLT 2015.
- Even when the parameter space is ill-behaved (infinite-dimensional; minima
may fail to exist; no boundedness or regularization), risk minimization of
certain standard losses still converges to a unique object;
in the finite-dimensional case, uniform convergence holds for
*empirical* risk minimization.

- NIPS 2014.
- SGD variant which greedily grows monomial features of increasing degree.

- JMLR 15:2773-2832, 2014.
- Analysis of tensor power iteration for tensors which have an orthogonal decomposition (plus some slight noise), and examples of latent variable models which can be written in this way (e.g., LDA).
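The tensor power iteration analyzed there can be sketched as follows. This is an illustrative toy, not the paper's exact procedure or noise model: it runs the map \(v \mapsto T(I, v, v)\) with normalization on a small symmetric 3-tensor that has an orthogonal decomposition, and the example tensor `T` is made up for the demonstration.

```python
import numpy as np

# Illustrative sketch (not the paper's exact procedure): power iteration
# on a symmetric 3-tensor with an orthogonal decomposition. Repeatedly
# applying v -> T(I, v, v) and normalizing converges to one of the
# decomposition's components under random initialization.
def tensor_power_iteration(T, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = np.einsum('ijk,j,k->i', T, v, v)  # contract two modes with v
        v /= np.linalg.norm(v)
    lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # eigenvalue estimate
    return lam, v

# hypothetical example: T = 2 e1(x)e1(x)e1 + 1 e2(x)e2(x)e2 in R^2
e1, e2 = np.eye(2)
T = 2.0 * np.einsum('i,j,k->ijk', e1, e1, e1) \
  + 1.0 * np.einsum('i,j,k->ijk', e2, e2, e2)
lam, v = tensor_power_iteration(T)
```

The coefficient ratio along the two components squares at each step, so the iterate snaps onto a single component (here one of `e1`, `e2`) within a handful of iterations; recovering the full decomposition additionally requires deflation, which is omitted above.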

- NIPS 2013.
- Generalization bounds for \(k\)-means cost and Gaussian mixture log-likelihood for unbounded parameter sets and distributions with a few bounded moments (but with no further boundedness assumptions).

- COLT 2013.
- Optimization, generalization, and consistency guarantees for AdaBoost with logistic and similar losses.

- ICML 2013.
- AdaBoost, with a variety of losses, attains optimal margins merely by multiplying the step size with a small constant.

- ICML 2012.
- Provides the natural algorithm, with attention to handling degenerate clusters via smoothing, Bregman divergences for nondifferentiable convex functions, and exponential families without minimality assumptions.

- JMLR 13:561-606, 2012.
- This is the extended version of the NIPS paper "The Fast Convergence of Boosting".

- NIPS Optimization Workshop 2011.
- Adaptation of some of the boosting techniques to other optimization problems, for instance gradient descent of positive semi-definite quadratics.

- NIPS 2011.
- AdaBoost, with a variety of losses, minimizes its empirical risk at rate \(\mathcal O(\ln(1/\epsilon))\) when either weak learnable or possessing a minimizer, and rate \(\mathcal O(1/\epsilon)\) in general.

- AISTATS 2010.
- Hartigan's method minimizes \(k\)-means cost point by point; it terminates when points lie within regions defined by intersections of spheres (rather than just Voronoi cells).
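The point-by-point scheme can be sketched as below. This is an assumed rendering of Hartigan's method, not the paper's presentation: each pass offers every point a move to another cluster, accepted when the exact closed-form change in \(k\)-means cost for a single-point move is negative, and the toy data is invented for the demonstration.

```python
import numpy as np

# Illustrative sketch of Hartigan's method (assumed form): move single
# points between clusters whenever the move strictly decreases the
# k-means cost, computed via the exact single-point update formula.
def hartigan_kmeans(X, k, sweeps=100, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))
    for _ in range(sweeps):
        moved = False
        for i, x in enumerate(X):
            a = labels[i]
            na = np.sum(labels == a)
            if na == 1:
                continue  # never empty a cluster
            ma = X[labels == a].mean(axis=0)
            # cost decrease from removing x from cluster a
            loss = na / (na - 1) * np.sum((x - ma) ** 2)
            best, best_delta = a, 0.0
            for b in range(k):
                if b == a:
                    continue
                nb = np.sum(labels == b)
                mb = X[labels == b].mean(axis=0) if nb else x
                # cost increase from inserting x into cluster b
                gain = nb / (nb + 1) * np.sum((x - mb) ** 2)
                if gain - loss < best_delta:
                    best, best_delta = b, gain - loss
            if best != a:
                labels[i], moved = best, True
        if not moved:
            break
    return labels

# hypothetical data: two well-separated groups of identical points
X = np.vstack([np.zeros((5, 2)), 10.0 * np.ones((5, 2))])
labels = hartigan_kmeans(X, k=2)
```

Because accepted moves strictly decrease the cost and there are finitely many labelings, the sweep terminates; the local optima it stops at are exactly the sphere-intersection regions described above, a strictly stronger condition than Lloyd's Voronoi stability.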

- ICASSP 2007.

Notes

Representation Benefits of Deep Feedforward Networks.
[arXiv]

Dirichlet draws are sparse with high probability. [arXiv]

Preprints

Blackwell Approachability and Minimax Theory.
(2011.)
[arXiv]

Central Binomial Tail Bounds. (2009.) [arXiv]

Ph.D. Thesis

Duality and Data Dependence in Boosting.
(2013.)
[pdf]

- Committee: Sanjoy Dasgupta (chair/advisor), Kamalika Chaudhuri, Yoav Freund, Patrick Fitzsimmons, Philip Gill, Robert Schapire.
- Contains work from NIPS 2011, JMLR 2012, and COLT 2013 as above.
- The results of chapter 5 (whose proofs contain some minor errors) later appeared in vastly expanded form in "Convex Risk Minimization and Conditional Probability Estimation" above.