
TensorFlow 2.0, PyTorch Dev Conference, DecaNLP, BERT, Annotated Encoder-Decoder, ICLR 2019 reading, v1, AllenNLP v0.7, 10 writing tips, AutoML & Maths for ML books, TensorFlow NLP best practices

October 15 · Issue #33
NLP News
Hey all,
Welcome to this month’s newsletter edition, which includes some cool video content about TensorFlow and PyTorch; in-depth content about encoder-decoders; BERT, probably the hottest encoder at the moment 🔥; ICLR 2019 reading suggestions; AllenNLP news; 10 tips to make you a more productive scientific writer; lots of resources, including open-access books on AutoML and maths for ML; TensorFlow best practices for NLP; and many tools, articles, and blog posts.
I really appreciate your feedback, so let me know what you love ❤️ and hate 💔 about this edition. Simply hit reply on the issue.
If you were referred by a friend, click here to subscribe. If you enjoyed this issue, give it a tweet 🐦.

Talks and presentations 🗣
The Natural Language Decathlon: Multitask Learning as Question Answering 🏅 Richard Socher talks about the recently published DecaNLP benchmark and discusses the limits of single-task learning: NLP requires many types of reasoning (logical, linguistic, emotional, visual, etc.). If you’re interested in multi-task learning, then this is a talk to watch.
TensorFlow 2.0 Changes 🏛 Aurélien Géron draws side-by-side comparisons between the upcoming TensorFlow 2.0 and PyTorch. TensorFlow 2.0 will make Eager mode a lot more prominent and will enable seamless switching between Eager and Graph mode. Sharing weights will get a lot easier (and more like Keras), and tf.contrib will get cleaned up. In all, lots to look forward to!
PyTorch Dev Conference Part 1 👩‍💻 The first PyTorch dev conference featured talks from Andrej Karpathy, AI2’s Mark Neumann, fastai’s Rachel Thomas, and many others. Another highlight is the Future of AI Software panel with Soumith Chintala, Jeremy Howard, Noah Goodman, and others.
What's in an encoder-decoder? 🤖
The Annotated Encoder Decoder 📝 The encoder-decoder with attention, which goes back to the seminal Sequence-to-sequence learning paper and the subsequent improvement with attention, is a staple of current NLP. Joost Bastings provides an annotated walk-through of the encoder-decoder, similar to the excellent Annotated Transformer. On the topic of Transformers, check out BERT in the Paper picks section below, a bidirectional Transformer language model that achieved state-of-the-art results across 11 NLP tasks.
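The attention step at the heart of these models is small enough to sketch directly. Here is a minimal NumPy illustration (not from the annotated post; dot-product scoring is assumed, where additive/MLP scoring is another common choice): the decoder state scores each encoder state, the scores are normalized with a softmax, and the context vector is the resulting weighted sum.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, encoder_states):
    # Score each source position against the current decoder state
    # (dot-product scoring; additive/MLP scoring is another option).
    scores = encoder_states @ decoder_state   # shape: (src_len,)
    weights = softmax(scores)                 # attention distribution over source
    context = weights @ encoder_states        # weighted sum: (hidden,)
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))  # 5 source positions, hidden size 8
decoder_state = rng.normal(size=8)
context, weights = attend(decoder_state, encoder_states)
```

The attention weights form a proper distribution over source positions, which is what makes them easy to visualize in walk-throughs like the one above.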
Towards Natural Language Semantic Code Search 💻 Beyond learning representations from text, we can also use an encoder-decoder to learn representations from code by predicting doc strings. GitHub Engineering describes how such a model can be used for semantic code search.
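Once such a model maps code and natural-language queries into a shared vector space, search reduces to nearest-neighbor lookup. A toy sketch (the vectors below are hypothetical stand-ins for learned embeddings, not GitHub's actual model):

```python
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical stand-ins for learned embeddings: in the real system, an
# encoder-decoder trained to predict docstrings maps code snippets and
# natural-language queries into the same vector space.
code_embeddings = {
    "def parse_json(s): ...": np.array([0.9, 0.1, 0.0]),
    "def sort_list(xs): ...": np.array([0.1, 0.9, 0.2]),
}
query_embedding = np.array([0.8, 0.2, 0.1])  # e.g. "read a json string"

# Semantic search = retrieve the snippet whose embedding is closest to the query.
best = max(code_embeddings,
           key=lambda snippet: cosine_sim(code_embeddings[snippet], query_embedding))
```

At scale, the linear scan would be replaced by an approximate nearest-neighbor index, but the retrieval logic is the same.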
New fastai and AllenNLP library versions
fastai v1 📚 The new version of the fastai library provides a single interface to the most commonly used deep learning applications for vision, text, tabular data, time series, and collaborative filtering. In addition, fast.ai announced the launch of a new course.
AllenNLP v0.7 The new version of AllenNLP provides a new framework for training state-machine-based models and several examples of using this for semantic parsing as well as a model for neural open information extraction, and a graph-based semantic dependency parser. They’ve also released new tutorials, which are simply beautiful to look at.
Improving your writing productivity
This article describes ten simple rules to make you a more productive scientific writer:
  1. Define your writing time
  2. Create a working environment that really works
  3. Write first, edit later
  4. Use triggers to develop a productive writing habit
  5. Be accountable
  6. Seek feedback and ask for what you want
  7. Think about what you’re writing outside of your scheduled writing time
  8. Practice, practice, practice
  9. Manage your self-talk about writing
  10. Reevaluate your writing practice often
Tools and implementations ⚒
jiant sentence representation learning toolkit 🔨 This toolkit was created at the 2018 JSALT Workshop by the General-Purpose Sentence Representation Learning team and can be used to run experiments that involve multitask and transfer learning across sentence-level NLP tasks.
What-If Tool 🔦 This tool enables the inspection of an ML model inside Tensorboard. It allows us to visualize results and to explore the effects of single features and counterfactual examples (i.e. the most similar example with a different prediction). It’s most suitable for analyzing algorithmic fairness.
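The counterfactual feature boils down to a simple idea: given an example, find the most similar example that the model classifies differently. A minimal sketch of that search (toy data and Euclidean distance assumed; the What-If Tool's own implementation may differ):

```python
import numpy as np

def nearest_counterfactual(x, X, preds, pred_x):
    """Index of the most similar example in X whose prediction differs from pred_x."""
    dists = np.linalg.norm(X - x, axis=1)       # distance to every example
    candidates = np.where(preds != pred_x)[0]   # examples with a different prediction
    return candidates[np.argmin(dists[candidates])]

# Toy dataset: four 2-d examples with model predictions attached.
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0], [1.0, 1.0]])
preds = np.array([0, 0, 1, 1])
idx = nearest_counterfactual(X[0], X, preds, preds[0])
```

Inspecting which features change between an example and its counterfactual is what makes this useful for fairness analysis.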
Resources 📚
Good practices in Modern TensorFlow for NLP 🏋️ This notebook contains many best practices for doing NLP with TensorFlow, such as feeding and transforming data, preprocessing, and model serving.
AutoML book 🤖 Automated machine learning (AutoML) encompasses much more than just Google’s architecture search efforts. The open-source chapters from this book from one of the top AutoML groups will give you an overview of automatic hyperparameter optimization, meta learning, neural architecture search, as well as individual AutoML systems.
Maths for ML book 📋 This book aims to provide the necessary mathematical skills to read more advanced ML books and does so in a succinct and accessible manner.
How to visualize decision trees 🌳 This article is a master class in how decision trees can be visualized. In addition, it provides insights into the design of a visualization library.
Counterfactual Regret Minimization ♣️ This article gives an in-depth overview of counterfactual regret minimization, which lies at the heart of DeepStack and Libratus, which both recently defeated pros in Heads Up No-Limit Texas Hold’em.
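The core update inside CFR is regret matching: play each action in proportion to how much you regret not having played it. Here is a minimal sketch (not from the article; full CFR additionally traverses the game tree) where a single player learns to exploit a fixed rock-paper-scissors opponent:

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    # +1 if action a beats b, 0 for a tie, -1 for a loss
    return [0, 1, -1][(a - b) % 3]

def get_strategy(regrets):
    # Regret matching: mix over actions in proportion to positive regret.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS

opponent = [0.5, 0.3, 0.2]  # fixed, exploitable opponent strategy (hypothetical)
regrets = [0.0] * ACTIONS
strategy_sum = [0.0] * ACTIONS
for _ in range(1000):
    strat = get_strategy(regrets)
    # Expected payoff of each action against the opponent, and of our current mix.
    util = [sum(opponent[b] * payoff(a, b) for b in range(ACTIONS))
            for a in range(ACTIONS)]
    mix_util = sum(strat[a] * util[a] for a in range(ACTIONS))
    for a in range(ACTIONS):
        regrets[a] += util[a] - mix_util  # regret for not having played a
        strategy_sum[a] += strat[a]

# The *average* strategy is what converges; here it approaches pure "paper",
# the best response to an opponent who over-plays rock.
avg = [s / sum(strategy_sum) for s in strategy_sum]
```

In self-play on both sides of a zero-sum game, the same update drives the average strategies toward a Nash equilibrium, which is the property DeepStack and Libratus build on.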
Articles and blog posts 📰
How AI technology can tame the scientific literature 👩‍🔬 This article gives an overview of the current landscape of tools that allow information extraction from scientific literature and explores how AI can be used to automatically generate and validate hypotheses.
Here’s What You Need To Know About ‘Artie’s Adventure,’ The VR/AI Experience Google Just Announced 🐶 This article explores how AI can bring deeper emotional engagement to virtual experiences: AI powers the characters, freeing the creators to focus on emotions.
Welcome to Voldemorting, the Ultimate SEO Dis 🕴 This is a beautiful Wired article about the recent practice of voldemorting, i.e. replacing a name with euphemisms or synonyms to deprive someone of attention online. If you like puns and wordplay, this article is for you. Also: who wants to create a voldemorting dataset/generator?
Career advice for recent Computer Science graduates 👩‍💻 Chip Huyen gives an overview of the pros and cons between choosing a PhD, working for a startup, and working for a big company and describes the factors that influenced her personal choice.
Publishing Negative Results in Machine Learning is like Proving Dragons don’t Exist 🐉 This short article describes why publishing negative results is hard and when they are actually publishable.
Machine Translation. From the cold war to Deep Learning ❄️ This article guides us through the history of machine translation, from its beginnings during the cold war, to statistical and phrase-based MT, to the current Deep Learning-based systems.
Paper picks
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding This paper shows that we still have not reached the ceiling with regard to language model pretraining. In particular, using a more expressive encoder (a bidirectional Transformer rather than a unidirectional one) and a deeper model (24 layers) achieves large gains. It is a striking example of what can be achieved with a well-executed pretrained language model. Among other results, the model achieves large improvements on SQuAD and super-human performance on SWAG, a benchmark for commonsense inference that was introduced just a couple of months ago. Have a look here for some comments from the author on Reddit.
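What makes the bidirectional encoder trainable is BERT's masked-LM objective: corrupt ~15% of input positions and predict the original tokens, where of the selected positions 80% become [MASK], 10% become a random token, and 10% stay unchanged. A toy sketch of that corruption step (the toy vocabulary and helper name are my own, not from the paper):

```python
import random

MASK = "[MASK]"
VOCAB = ["cat", "dog", "sat", "mat", "the", "on"]  # toy vocabulary

def mask_tokens(tokens, rng, mask_prob=0.15):
    """BERT-style masked-LM corruption: select ~15% of positions; of those,
    80% become [MASK], 10% a random token, 10% are left unchanged."""
    corrupted, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                corrupted[i] = MASK
            elif r < 0.9:
                corrupted[i] = rng.choice(VOCAB)
            # else: keep the original token, but still predict it
    return corrupted, targets

rng = random.Random(0)
tokens = ["the", "cat", "sat", "on", "the", "mat"]
corrupted, targets = mask_tokens(tokens, rng)
```

Because some selected tokens are kept or replaced rather than masked, the encoder cannot rely on [MASK] appearing at every prediction site, which helps at fine-tuning time when no [MASK] tokens occur.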
Multi-Task Learning as Multi-Objective Optimization (NIPS 2018) This paper casts multi-task learning as multi-objective optimization with the overall objective to find a Pareto optimal solution. Existing algorithms from the gradient-based multi-objective optimization literature scale poorly with the dimensionality of gradients and the number of tasks. Instead, the authors propose an upper bound on the loss, which can be optimized efficiently and prove that it yields a Pareto optimal solution under realistic assumptions. They evaluate on digit classification, scene understanding, and multi-label classification. Overall, this is a nice paper that brings a new principled perspective to the current multi-task learning landscape.
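For two tasks, finding the minimum-norm convex combination of the task gradients, the quantity at the core of this line of work, has a closed form. A NumPy sketch under that two-task assumption (the gradient values are hypothetical):

```python
import numpy as np

def min_norm_coeff(g1, g2):
    """Alpha in [0, 1] minimizing ||alpha*g1 + (1 - alpha)*g2||^2 (two-task case)."""
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0:
        return 0.5  # gradients coincide; any mix is equivalent
    alpha = ((g2 - g1) @ g2) / denom
    return float(np.clip(alpha, 0.0, 1.0))

g1 = np.array([1.0, 0.0])  # task-1 gradient (hypothetical values)
g2 = np.array([0.0, 1.0])  # task-2 gradient
alpha = min_norm_coeff(g1, g2)
update = alpha * g1 + (1 - alpha) * g2  # shared-parameter update direction
```

When the gradients conflict, this mix shrinks toward a direction that does not hurt either task, which is what drives the method toward Pareto optimal solutions; with many tasks, the paper solves the analogous problem with an iterative (Frank-Wolfe-style) procedure instead of a closed form.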
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model This paper proposes a Bayesian-inspired language model. The model first generates an embedding for each lexeme (token) using a language model; it then generates a spelling using a second, character-based language model based on that embedding. This approach is motivated by the duality of patterning, i.e. that the form of a word is separate from its usage. They deal with the open vocabulary by predicting UNK as another lexeme, conditioning the spelling on the hidden state of the model.