This edition of the newsletter contains tons of interesting articles and resources: A comprehensive glossary of ML terms? ✅ What words are “most hip hop”? ✅ All you ever wanted to know about variational inference? ✅ 15% faster LSTMs in keras? ✅ Which lego set has the most surprising colors? ✅ And lots more…
Facebook Messenger now has LaTeX support!
Facebook Messenger finally has LaTeX support! (We’ve all been waiting for this, right?) Simply wrap your LaTeX with $$ on each side.
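For example, sending the message below (the formula itself is just an illustration) gets rendered as a typeset equation:

```latex
$$ e^{i\pi} + 1 = 0 $$
```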
The dataset is a good example of how NLP and computer vision can complement each other: it was collected by applying NLP to radiology reports to mine them for 8 disease categories. It consists of scans of more than 30,000 patients, including many with advanced lung disease. The paper can be found here and the dataset is available here.
To add to the colourful parsing literature, this GitHub project contains code for the Rainbow Parser, a spectral learning-based parser for training and decoding with latent-variable probabilistic context-free grammars (L-PCFGs).
Diederik Kingma’s PhD thesis is now online and serves as a key resource for anyone interested in learning more about variational (Bayesian) inference, generative modeling, and their intersections with Deep Learning. Kingma is the co-creator of Adam and of the variational autoencoder.
The role of arXiv has been hotly debated in recent months. This interesting analysis adds to the picture: it surveys arXiv’s strengths and weaknesses and tries to identify possible improvements based on technologies that were not previously available.
An article about Naftali Tishby’s theory of the information bottleneck as a means for better understanding the generalization behaviour of deep neural networks. Briefly, Tishby argues that NNs learn via compression and that training generally consists of a short “fitting” phase (where the model learns to label the data) and a much longer “compression” phase (where it discards input information that is irrelevant to the task).
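For reference, the information bottleneck casts learning as finding a representation T of the input X that is as compressed as possible while remaining predictive of the label Y. In its usual Lagrangian form (our notation, not the article’s):

```latex
\min_{p(t \mid x)} \; I(X; T) - \beta \, I(T; Y)
```

where I(·;·) denotes mutual information and β trades off compression against prediction.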
We can use topic models to explore the topics of articles, expose hidden semantic structures, reveal common themes, and more. Guess what else they’re useful for? Exploring the color themes of LEGO sets! Wheee!
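As a rough sketch of the idea, assuming a gensim-style workflow (the data below is made up): treat each LEGO set as a “document” whose “words” are the colors of its bricks, then fit an ordinary LDA model so that topics become recurring color themes.

```python
from gensim import corpora
from gensim.models import LdaModel

# Hypothetical data: each LEGO set is a "document", each brick color a "word".
sets_as_color_lists = [
    ["red", "red", "yellow", "blue"],
    ["gray", "gray", "black", "gray"],
    ["green", "brown", "green", "tan"],
]

dictionary = corpora.Dictionary(sets_as_color_lists)
corpus = [dictionary.doc2bow(colors) for colors in sets_as_color_lists]

# Fit a small LDA model: each topic is a distribution over colors.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10)
for topic_id, topic in lda.print_topics():
    print(topic_id, topic)
```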
Dirk Hovy shares his thoughts on how the recent growth of our field will affect the future of NLP conferences. In particular, he explores its impact on reviewing, organization, and the structure of the conference itself.
Remember DeepL, whose MT system seemed to blow the competition out of the water? Pierre Isabelle describes how they evaluated DeepL using their challenge set of 108 handcrafted short English sentences that target particular weaknesses of MT systems (see paper). DeepL reduces the error rate by 50% compared to Google’s model!
Krause et al. propose dynamic evaluation, which improves language models by adapting them to the recent history at test time. They improve the state of the art on the Penn Treebank, WikiText-2, and Hutter Prize datasets.
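The core idea: while evaluating, keep updating the weights by gradient descent on the text the model has just scored, so it adapts to the local style and vocabulary. A minimal sketch in PyTorch, under an assumed model interface (Krause et al.’s actual update rule additionally decays the weights back toward the global parameters):

```python
import torch

def dynamic_eval(model, segments, lr=1e-4):
    """Evaluate a language model while adapting it to recent history.

    Assumes `model` maps a token segment to a cross-entropy loss and
    `segments` is the test text split into consecutive chunks.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss = 0.0
    for segment in segments:
        loss = model(segment)      # score the segment first...
        total_loss += loss.item()
        optimizer.zero_grad()
        loss.backward()            # ...then adapt the weights to it,
        optimizer.step()           # so later segments benefit.
    return total_loss / len(segments)
```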
Philipp Koehn, one of the creators of phrase-based machine translation, shares his 117-page draft chapter on neural machine translation. It is a great way for beginners to get started with NMT; for experts, the section on current challenges still provides plenty of food for thought.
An overview of Edina, the University of Edinburgh’s entry in the Amazon Alexa Prize competition. The main novelty lies in the use of self-dialogues: conversations created by a single Amazon Mechanical Turk worker playing both participants in a dialogue. The complete model is a cascade: a rule-based system that backs off to a matching-score model, which in turn backs off to a generative neural network.
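To make that cascade concrete, here is a minimal sketch of the backoff logic (the function names are hypothetical stand-ins, not the actual Edina components):

```python
def respond(utterance, rule_based, matcher, neural_generator, threshold=0.5):
    """Three-stage backoff: rules first, then retrieval, then generation."""
    reply = rule_based(utterance)            # 1. hand-written rules
    if reply is not None:
        return reply
    candidate, score = matcher(utterance)    # 2. retrieve best-matching reply
    if score >= threshold:
        return candidate
    return neural_generator(utterance)       # 3. fall back to a generative model
```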