# Machine Learning Tea

## Machine learning reading group as of 2017

The current machine learning group meets Thursdays at 3pm in 228 Malone. Papers are announced on the ml-reading group mailing list.

## Old Machine Learning Tea reading group

The Johns Hopkins Machine Learning Tea is an informal gathering of students and postdocs interested in machine learning and statistics. We typically meet every week while class is in session. Announcements and discussion are posted to discuss@ml.jhu.edu; you can sign up at http://ml.jhu.edu/signup. Snacks and, of course, tea will be served (and there's an espresso machine nearby for coffee). The tea is inspired by similar traditions at Gatsby, Toronto, MIT, and Berkeley.

**Where:** Hackerman 306

**When:** 4:30PM on Thursdays

The format is a 10-15 minute mini-talk followed by time for questions and discussion. On occasion we may have an open forum rather than a talk. The format of the talks is very flexible: tutorials, derivations, work in progress, puzzles, and so on. We may also present papers from conferences such as ICML, AISTATS, and NIPS, although there is a separate ML Reading Group (and the NLP Reading Group reads a lot of ML papers too).

## Spring 2012

- Feb 9 - Neal's Slice Sampling (Matthew Gormley)
  **Note:** We will meet in 320, across the hall from 306.

- Feb 16 - "Ideal" Slice Sampling (Nicholas Andrews)
  - Damien, P., Wakefield, J. C., and Walker, S. G. (1999). Gibbs sampling for Bayesian nonconjugate and hierarchical models by using auxiliary variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 61:331–344.
  - Kalli, M., Griffin, J. E., and Walker, S. G. (2011). Slice sampling mixture models. Statistics and Computing.
  - Mira, A. and Tierney, L. (1997). On the use of auxiliary variables in Markov chain Monte Carlo sampling. Scandinavian Journal of Statistics.

- Feb 23 - Beam Sampling for Inference in Structured Models (Jason Smith)
  - Paper: http://mlg.eng.cam.ac.uk/jurgen/pubs/icml2008ihmm.pdf
  - Slides: http://mlg.eng.cam.ac.uk/jurgen/pubs/crism2008ihmm.pdf

- Mar 1 - Nonparametric K-Means (Michael Paul)
  - Paper: http://arxiv.org/pdf/1111.0352

- Mar 8 - Expectation Maximization (Scott Novotney)
  - Paper: http://www.cs.uoi.gr/~arly/papers/SPM08.pdf

- Mar 16 - Global Optimization (Juri Ganitkevitch)
  - Osborne, M. A., Garnett, R., and Roberts, S. J. (2009). Gaussian processes for global optimization. In: 3rd International Conference on Learning and Intelligent Optimization (LION3), Trento, Italy.

- Mar 22 - Spring "break"

- Mar 29 - Locality-sensitive hashing (Travis Wolfe)

- Apr 5 - Random Forests (Xuchen Yao)

- Apr 12 - TBA (Volunteer!)

- Apr 19 - TBA (Frank Ferraro)

- Apr 26 - TBA (Volunteer!)

## Spring 2011

- May 3 - Combining structured predictions from multiple "experts" without any supervised data (Alex Klementiev)

- Apr 26 - Rare class prediction (Juri Ganitkevitch)

- Apr 19 - Hyperparameter estimation in Dirichlet process mixture models (Nicholas Andrews)

- Apr 12 - Dirichlet processes and DP mixture models (Adam Teichert)

**Weekly Tip:** Ever wanted an exceptionally high-quality, fast random number generator for Java? It claims to be about a third faster than java.util.Random.

- Apr 5 - Permutation tests (Xuchen Yao)

- Mar 29 - ~~Paired permutation tests (Xuchen Yao)~~ Predicting self-training performance (Scott Novotney)

- Mar 22 - Spring "break"

- Mar 15 - Dual Decomposition (Michael Paul)

**Weekly Tip:** When optimizing a non-convex function with L-BFGS, dumping the history (the approximation to the inverse Hessian) whenever the algorithm thinks it has converged, and then restarting, can help escape local optima. This comes from a post by Aria Haghighi in the Q&A section of MetaOptimize. Thanks to Markus for finding this.
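The restart trick can be sketched with SciPy's L-BFGS-B (a hedged illustration, not the original post's code; the Rosenbrock function stands in for a non-convex objective). Each fresh call to `minimize` starts with an empty history, so restarting from the converged point implicitly dumps the inverse-Hessian approximation:

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Each fresh call to minimize() begins L-BFGS with an empty history,
# i.e. the inverse-Hessian approximation is reset to the identity.
x = np.array([-1.2, 1.0])
prev = np.inf
for _ in range(5):
    res = minimize(rosen, x, jac=rosen_der, method="L-BFGS-B")
    x = res.x
    if abs(prev - res.fun) < 1e-12:  # no further progress after a restart
        break
    prev = res.fun
print(res.fun)
```

For a genuinely multimodal objective one would typically also perturb `x` between restarts; here the restarts merely refine the solution.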

- Mar 8 - Video lecture on reinforcement learning for autonomous helicopter flight (Wes Filardo)

- Feb 22 - Decision theory for statistical inference (Joshua Vogelstein)
**Weekly Tip:** GmailTeX. Enough said.

- Feb 15 - Metric Learning (Delip Rao)

**Weekly Tip:** The prior proportions <math>\vec{\tau}</math> of a hierarchical Dirichlet process can be sampled by simulating how many new components are created over <math>n_{m,k}</math> draws from a Dirichlet process with precision <math>\alpha \tau_k</math>, which reduces to a sequence of Bernoulli trials.

  - See Section 6.1.12 of "Mixture Block Methods for Non Parametric Bayesian Models with Applications" for details.
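The simulation in the tip above can be sketched in a few lines of Python (the function name and seed are illustrative; `n` plays the role of <math>n_{m,k}</math> and `precision` the role of <math>\alpha \tau_k</math>):

```python
import random

def sample_num_new_components(n, precision, seed=0):
    """Count how many new components are created over n draws from a
    Dirichlet process: draw i starts a new component with probability
    precision / (precision + i - 1), a Bernoulli trial."""
    rng = random.Random(seed)
    new_components = 0
    for i in range(1, n + 1):
        if rng.random() < precision / (precision + i - 1):
            new_components += 1
    return new_components

print(sample_num_new_components(100, 1.0))
```

The first draw always creates a component (probability 1), and later draws create one with decreasing probability, so the count grows roughly logarithmically in `n`.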

- Feb 9 - Robust parameter estimation in Bayesian networks (Ves Stoyanov)

**Weekly Tip:** The number of connected components of a graph equals the number of zero eigenvalues of its graph Laplacian.

  - Proof: see Lemma 1.7 in *Spectral Graph Theory*, Fan Chung.
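A quick numerical check of this fact, assuming NumPy (the example graph is illustrative):

```python
import numpy as np

# Adjacency matrix of a graph with two connected components:
# {0,1,2} form a triangle and {3,4} form a single edge.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

L = np.diag(A.sum(axis=1)) - A       # graph Laplacian L = D - A
eigenvalues = np.linalg.eigvalsh(L)  # L is symmetric, so use eigvalsh
num_components = int(np.sum(np.abs(eigenvalues) < 1e-9))
print(num_components)  # 2
```

In floating point the "zero" eigenvalues come back as tiny values, hence the tolerance when counting them.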