NLP Reading Group
The reading group attempts to keep abreast of current trends in natural language processing research. We typically read one or two recent NLP conference papers each week, and occasionally look at material from the machine learning, statistics, and linguistics communities as well.
Starting in 2008, we will be posting the weekly readings here. Past readings since 2001 are being filled in presently.
Spring 2008
First meeting of the term will be on Thursday, Jan. 31, at noon in NEB 317. Feel free to bring lunch.
Fall 2007
Topics:
- Domain adaptation
- Recent parsing work
- Text compression
- Semisupervised learning
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Sep. 26 | Omar F Zaidan | J. Blitzer, R. McDonald, F. Pereira
Domain Adaptation with Structural Correspondence Learning EMNLP 2006 | |
Oct. 3 | David Smith | Shai Ben-David, John Blitzer, Koby Crammer, Fernando Pereira. | |
Oct. 10 | Nathaniel W Filardo | Mahoney, Matthew
Adaptive Weighing of Context Models for Lossless Data Compression. Florida Institute of Technology, CS Department, Technical report CS-2005-16 EMNLP-CoNLL 2007 | |
Oct. 17 | Markus Dreyer | Nakagawa, Tetsuji
Multilingual Dependency Parsing Using Global Features EMNLP-CoNLL 2007 | |
Oct. 26 | Christo Kirov | Seginer, Yoav
Fast Unsupervised Incremental Parsing (syntax induction) Proceedings ACL 2007
| |
Nov. 3 | Christo Kirov | I. Titov, J. Henderson
Constituent Parsing with Incremental Sigmoid Belief Networks ACL 2007 | |
Nov. 17 | David Smith | X. Zhu | |
Dec. 12 | Delip Rao | M. Belkin, P. Niyogi
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation ACM 2002
Mikhail Belkin, Partha Niyogi, Vikas Sindhwani |
Summer 2007
Topics:
- Good recent papers (mainly from 2007)
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
May 10 | David Smith | M. Johnson, T. Griffiths, and S. Goldwater
Bayesian Inference for PCFGs via Markov Chain Monte Carlo HLT/NAACL 2007 | |
May 17 | Markus Dreyer | M. Galley, K. McKeown
Lexicalized Markov Grammars for Sentence Compression HLT/NAACL 2007
| |
June 2 | Erin Fitzgerald | J. Jiang, C. Zhai
A Systematic Exploration of the Feature Space for Relation Extraction HLT/NAACL 2007 | |
June 6 | Nikesh Garera | A. Alexandrescu, K. Kirchhoff
Data-Driven Graph Construction for Semi-Supervised Graph-Based Learning in NLP HLT/NAACL 2007 | |
June 14 | David Smith | X. Zhu, Z. Ghahramani, J. Lafferty
Semi-supervised learning using Gaussian fields and harmonic functions. ICML 2003 | |
June 21 | Christopher White | K. Murphy, Y. Weiss, M. Jordan
Loopy Belief Propagation for Approximate Inference: An Empirical Study. 15th UAI, pages 467-475, 1999 |
... discussing (loopy) belief propagation as background for survey propagation, a topic which has been getting more attention lately for its ability to "solve very large hard combinatorial problems, such as determining the satisfiability of Boolean formulas."
Chapter 8 of Chris Bishop's textbook is supposed to be a good treatment of graphical models overall. It is available free here [1]. He covers BP in section 8.4.4 after first presenting factor graphs in 8.4.3. David MacKay's treatment of BP, also in terms of factor graphs, is in chapter 26 of his book [2]. It's worth reading this chapter in full, perhaps first reading chapter 16. ... the update equations are given as (26.11) and (26.12) ... [substantial further discussion by Jason was here] Some people may prefer Bishop's style, others MacKay's. |
July 6 | Christopher White | A. Braunstein, M. Mezard, R. Zecchina.
Survey propagation: an algorithm for satisfiability. Random Structures and Algorithms, 2005. |
We sent some questions to Zecchina.
Lukas Kroc, Ashish Sabharwal and Bart Selman. Survey Propagation Revisited: An Empirical Study. 23rd UAI, 2007. |
July 18 | David Smith | P. Liang, S. Petrov, M. Jordan, D. Klein
The Infinite PCFG Using Hierarchical Dirichlet Processes. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning | |
Aug. 3 | Yi Su | M. Galley, K. McKeown
Lexicalized Markov Grammars for Sentence Compression. NAACL-HLT 2007 | |
Aug. 11 | Nikesh Garera | L. Shen, G. Satta, A. Joshi.
Guided learning for bidirectional sequence classification ACL 2007 | |
Aug. 18 | Markus Dreyer | D. Talbot, M. Osborne
Randomised Language Modelling for Statistical Machine Translation ACL 2007 |
They use a space-efficient randomized data structure (Bloom Filter) to store very large n-gram models.
There is a companion paper that people might want to have a quick look at as well, for comparison: D. Talbot, M. Osborne Smoothed Bloom Filter Language Models: Tera-Scale LMs on the Cheap ACL 2007 |
Aug. 30 | Delip Rao | Gideon S. Mann
Simple, Robust, Scalable Semi-supervised Learning via Expectation Regularization Proceedings of the 24th International Conference on Machine Learning 2007
|
Spring 2007
Topics:
- Morphology (unsupervised learning)
- Recent IR/QA papers (with an NLP or multilingual focus)
- Integrating search and learning
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Apr. 19 | John Blatz | A. Prieditis
Machine Discovery of Effective Admissible Heuristics Machine Learning Journal, 1993 | |
Apr. 12 | Markus Dreyer | A. Haghighi, J. DeNero and D. Klein
Approximate Factoring for A* Search NAACL-HLT 2007 | |
Mar. 29 & Apr. 5 | Zhifei Li | H. Daume III, J. Langford, and D. Marcu
Search-based structured prediction. Machine Learning Journal, forthcoming | |
Mar. 8 | David Smith | H. Daume III & D. Marcu
Learning as search optimization: approximate large margin methods for structured prediction. ICML 2005 | |
Mar. 1 | Wei Chen | M. Kaisser, S. Scheible, and B. Webber
Experiments at the University of Edinburgh for the TREC 2006 QA track. TREC-15 |
They do some fairly deep interpretation of sentences, extracting their predicate-argument structure. |
Feb. 22 | Eric Harley | K. Kan Lo & W. Lam
Using Semantic Relations with World Knowledge for Question Answering TREC-15 | |
Feb. 15 | Nikhil Bojja | C. Monson et al.
Unsupervised Induction of Natural Language Morphology Inflection Classes ACL Student Workshop '04 | |
Feb. 8 | Delip Rao | P. Schone and D. Jurafsky
Knowledge-free induction of morphology using latent semantic analysis CoNLL 2000 |
However, there was an extension of this work reported in NAACL-2001 that looks at circumfixes and prefix/suffix combinations. [3]
|
Feb. 1 | Nikesh Garera | D. Yarowsky and R. Wicentowski
Minimally supervised morphological analysis by multimodal alignment ACL 2000 |
For more details refer to Chapter 4 of Wicentowski's thesis. |
Fall 2006
Topics:
- Machine learning: Margin methods and structured classification
- Linguistics: Syntactic formalisms
- Syntax-based MT
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Dec. 13 | Delip Rao | J. Carbonell et al.
Context-based machine translation AMTA 2006 | |
Dec. 6 | Jason Smith | M. Galley et al.
Scalable Inference and Training of Context-Rich Syntactic Translation Models ACL 2006 |
It may also be helpful to look at:
M. Galley et al. HLT/NAACL 2004
|
Nov. 29 | Balakrishnan V | D. Marcu et al.
SPMT: Statistical Machine Translation with Syntactified Target Language Phrases EMNLP 2006 | |
Nov. 15 | Eric Harley | D. Chiang
An introduction to synchronous grammars ACL 2006 Tutorial |
Slides from the talk are also available. [4] |
Nov. 8 | Elliott Drabek | K. Shklovsky
A Grammatical Sketch of Petalcingo Tzeltal Undergraduate Thesis, Reed College, 2005 |
It is 77 pages long, but not dense, and I will be skipping the following sections:
Pages 01-14: Phonetics and phonology; 18-18: Polyvalence; 21-21: Inherent possession and ...; 46-55: Tense and aspect and other sections |
Nov. 1 | Yi Su | M. Steedman
Gapping as Constituent Coordination Linguistics and Philosophy, Vol. 13, 1990, pp. 207-264. |
See Yi for photocopies. |
Oct. 25 | Markus Dreyer | S. Riezler et al.
ACL 2002
| |
Oct. 18 | Erin Fitzgerald | J. Bresnan & R.M. Kaplan
Lexical-Functional Grammar: A Formal System for Grammatical Representation The Mental Representation of Grammatical Relations, MIT Press, 1982 |
BTW, the edited collection that this appears in is generally interesting. Bresnan defends and develops lexicalized grammars in general; the idea of separate surface and semantic roles; and Bresnan & Kaplan's LFG in particular. You should know that she originated (in 1978) the extremely influential idea of lexicalized syntax -- the idea that a grammar is simply a collection of lexical entries to be assembled in standard language-independent ways, but that there are also "lexical redundancy rules" that relate, e.g., active and passive entries for the same verb. Some chapters address morphological and cognitive issues pertaining to lexicalization, including an essay by Pinker on lexicalist learning.
Slides from Erin's presentation can be found here. |
Oct. 11 | John Blatz | L. Xu, D. Wilkinson, F. Southey, & D. Schuurmans
Discriminative Unsupervised Learning of Structured Predictors ICML 2006 | |
Oct. 4 | Nikesh Garera | A. Culotta & J. Sorensen
Dependency Tree Kernels for Relation Extraction ACL 2004
D. Zelenko, C. Aone, & A. Richardella
Kernel Methods for Relation Extraction JMLR, Volume 3, 2003 | |
Sept. 27 | David Smith | C. Cortes, P. Haffner, & M. Mohri
NIPS 2003 |
Papers extending rational kernels, including results on positive semidefinite cases, are at: [5]
For the record, and not to be read, is an interesting parallel line of research in Fisher Kernels over strings, e.g. this paper by Saunders, Shawe-Taylor and Vinokourov: [6] |
Sept. 20 | Elliott Drabek | K.Q. Weinberger, F. Sha, & L.K. Saul
Learning a kernel matrix for nonlinear dimensionality reduction ICML 2004 |
S.T. Roweis & L.K. Saul
Nonlinear Dimensionality Reduction by Locally Linear Embedding Science, 22 December 2000
J.B. Tenenbaum, V. De Silva, & J.C. Langford
A global geometric framework for nonlinear dimensionality reduction Science, 22 December 2000 |
Sept. 13 | Roy Tromble | L. Xu, J. Neufeld, B. Larson, & D. Schuurmans
NIPS 2004 |
Summer 2006
Spring 2006
Fall 2005
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Sept. 14 | Nikesh Garera | M. Jordan
Statistical Learning Theory Chapter 8 (Exponential family and Generalized linear models) | |
Sept. 21 | Arnab Ghoshal | M. Jordan
Statistical Learning Theory Chapters 2 & 3 | |
Oct. 20 | Roy Tromble | Sheila M. Reynolds, Jeff A. Bilmes
Part-of-Speech Tagging using Virtual Evidence and Negative Training. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 2005. pp 459--466. | |
Oct. 27 | Markus Dreyer | D. Roth and W. Yih
Integer Linear Programming Inference for Conditional Random Fields. ICML '2005
| |
Nov. 4 | Jason Riesa | Luke S. Zettlemoyer, Michael Collins.
Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars Proceedings of UAI 2005 | |
Nov. 16 | Safiullah Shareef | Hassan Sawaf, Jörg Zaplo, Hermann Ney | |
Nov. 23 | Roy Tromble | Sutton, Charles and McCallum, Andrew
Composition of Conditional Random Fields for Transfer Learning Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing 2005 |
Summer 2005
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Spring 2005
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Fall 2004
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Summer 2004
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Spring 2004
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Fall 2003
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Spring 2003
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Feb. 13 | David Smith | K. Church
Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p^2 Coling 2000, pp. 173-179
| |
Feb. 19 | Elliott Drabek | A. Lopez, M. Nossal, R. Hwa, P. Resnik
Word-level Alignment for Multilingual Resource Acquisition Proceedings of the 2002 LREC Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data
| |
Feb. 26 | Elliott Drabek | Steven Abney
ACL'02 | |
Mar. 6 | Paola Virga | Carl M. Kadie, Christopher Meek, David Heckerman
A Collaborative Filtering System Using Posteriors Over Weights of Evidence Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, 2002.
| |
Mar. 20 | Roy Tromble | Nikita Schmid, Ahmed Patel
Using Tree Automata and Regular Expressions to Manipulate Hierarchically Structured Data http://arXiv.org/abs/cs/0201008 | |
Apr. 10 | | V. N. Vapnik
The Nature of Statistical Learning Theory, Intro and Chapters 1, 2A | |
Apr. 17 | Roy Tromble | V. N. Vapnik
The Nature of Statistical Learning Theory, Chapters 2B - 4A | |
Apr. 24 | Paola | V. N. Vapnik
The Nature of Statistical Learning Theory, Chapters 4B - 5A | |
May 1 | Noah | V. N. Vapnik
The Nature of Statistical Learning Theory, Chapters 5B - 6A | |
May 8 | Noah | V. N. Vapnik
The Nature of Statistical Learning Theory, Chapters 6B - 7A | |
May 15 | Chal | V. N. Vapnik
The Nature of Statistical Learning Theory, Chapters 7B - |
Fall 2002
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Sep. 10 | Noah A. Smith | Collins, Duffy.
ACL '2002 | |
Sep. 19 | Paola Virga | Yamada, Knight
A decoder for Syntax-based Statistical MT ACL '2002 | |
Sep. 26 | Paul Ruhlen | Hwa, Resnik, Weinberg, Kolak
Evaluating Translational Correspondence using Annotation Projection ACL '2002 | |
Oct. 2 | Gideon Mann | Gildea, Jurafsky
Automatic Labeling of Semantic Roles ACL '2001 | |
Oct. 8 | Elliott Franco Drabek | Ravichandran, Hovy
Learning Surface Text Patterns for a Question Answering System. ACL '2001 |
A similar paper
Lin, Pantel |
Oct. 17 | David Smith | Cotton, Bird
An Integrated Framework for Treebanks and Multilayer Annotations LREC '2002 | |
Oct. 24 | Roy Tromble | Han, Benjamin
Building a Bilingual Dictionary with Scarce Resources: A Genetic Algorithm Approach. | |
Nov. 1 | Chalaporn Hathaidharm | J. Gao, J. Goodman, M. Li, K. Lee
Toward A Unified Approach To Statistical Language Modeling For Chinese ACM Transactions on Asian Language Information Processing, Vol. 1, No. 1, pp 3-33. 2002. | |
Nov. 7 | Neda Khalili | Yamamoto, Church
Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus Computational Linguistics '2001 |
A related paper:
Kageura |
Nov. 14 | Michelle Vanni | Hearst
ACL '1999 | |
Nov. 21 | Silviu Cucerzan | Ueda, Nakano, Ghahramani, Hinton
SMEM Algorithm for Mixture Models Neural Information Processing Systems '1998 | |
Dec. 5 | Silviu Cucerzan | Pearce
A Comparative Evaluation of Collocation Extraction Techniques. Darren Pearce. Third International Conference on Language Resources and Evaluation. May 2002
D. Lin
Automatic identification of non-compositional phrases. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, 317--324.
|
Summer 2002
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
July 24 | Michelle Vanni | Merlo
A Multilingual Paradigm for Automatic Verb Classification ACL '2002 | |
July 31 | Paola Virga | Yamada, Knight
A decoder for Syntax-based Statistical MT ACL '2002 |
Spring 2002
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Feb. 7 | Paola Virga | Knight, Graehl
Proceedings of the Thirty-Fifth Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics | |
Feb. 14 | Charles Schafer | Yaser, Germann
Translating with Scarce Resources American Association for Artificial Intelligence 2000 | |
Feb. 21 | Jia Cui | Barzilay, McKeown
Extracting Paraphrases from a Parallel Corpus Computer Science Department, Columbia Univ. | |
Feb. 28 | Silviu Cucerzan | Marcu
Towards a Unified Approach to Memory- and Statistical-Based Machine Translation. Annual Meeting of the ACL, Proceedings of the 39th Annual Meeting on Association for Computational Linguistics '2001 | |
Mar. 14 | Noah A. Smith | Ratnaparkhi
A Simple Introduction to Maximum Entropy Models for NLP Institute for Research in Cognitive Science, Univ. of Penn. | |
Mar. 28 | Swapna Somasundaran | Crestan, El-Beze
Improving supervised WSD by including rough semantic features in a Multilevel view of the Context SEMPRO Workshop, Edinburgh, 2001. | |
Apr. 11 | Paola Virga | Neal, Hinton
A view of the EM algorithm that justifies incremental, sparse, and other variants Learning in Graphical Models, 1999 | |
Apr. 18 | Paul Ruhlen | NA. Rao, K. Rose
Deterministically annealed design of hidden Markov model speech recognizers IEEE Trans. on Speech and Audio Processing, vol. 9, (no. 2), Feb. 2001 |
The following article builds on the Neal & Hinton paper that we read last week. It tests an incremental version of EM (carefully choosing how incremental it will be), as well as a "lazy EM" version that visits "significant" cases more often. [7] |
Apr. 25 | Paul Ruhlen | H. Al-Adhaileh, Kong, Melamed
Malay-English Bitext Mapping and Alignment Using SIMR/GSA Algorithms Malaysian National Conference on Research and Development on Linguistics '2001 |
Fall 2001
Date/Time | Presenter | Paper(s) | Supporting Papers/Notes |
---|---|---|---|
Dec. 14 | Jia Cui | Bellegarda
Exploiting latent semantic information in statistical language models Proceedings of the IEEE, Volume 88, Issue 8, Aug. 2000 | |
Nov. 29 | Silviu Cucerzan | Mike Collins, Yoram Singer
Unsupervised Models for Named Entity Classification EMNLP/VLC'99 | |
Nov. 20 | Radu Florian | Blum, Mitchell
Combining Labeled and Unlabeled Data with Co-Training Proceedings of 1998 Conference on Computational Learning Theory | |
Nov. 16 | Richard Wicentowski | Eisner, Satta
Efficient parsing for bilexical context-free grammars and head automaton grammars ACL '99 |
Plagiarism detection systems might be relevant to bitext alignment. A message to the Corpora list yesterday announced the following review paper: [8] |
Nov. 2 | Paul Ruhlen | Manning, Schuetze
Foundations of Statistical Natural Language Processing, Section 14 on clustering, pp. 495-527. MIT Press | |
Oct. 26 | Gideon Mann | Tishby, Pereira, Bialek | The paper describes a clustering method which is a generalization of their earlier work on "Distributional Clustering of English Words" (Pereira, Tishby, and Lee '93). |