March 11 · Issue #38
Hi all, This newsletter's spotlight topics are GPT-2, OpenAI's recent language model, and sequence generation in arbitrary order. Besides these, there are again lots of resources, tools, articles, blog posts, and papers to explore.
Some personal news: I have defended my PhD and joined Google DeepMind in London. I'm planning to continue writing this newsletter every month, but future editions might be more compact.
Contributions: If you have written or come across something that would be relevant to the community, hit reply on the issue so that it can be shared more widely. I really appreciate your feedback, so let me know what you love and hate about this edition; simply hit reply. If you were referred by a friend, click here to subscribe. If you enjoyed this issue, give it a tweet.
If you don't completely get this meme, read the paragraph below first. Credit: Greg Durrett
OpenAI decided not to release the parameters of the pretrained model, citing potential malicious use (such as generating fake news, impersonation, etc.). This decision, which was accompanied by news articles from The Verge, Wired, The Register, and others with fear-mongering headlines such as "The AI Text Generator That's Too Dangerous to Make Public", sparked controversy online. Many ML and NLP experts such as Anima Anandkumar, Delip Rao, Jeremy Howard, Hugh Zhang, Zachary Lipton, Robert Munro, Ryan Lowe, and Oren Etzioni took positions in dedicated posts, which are well worth reading if you are interested in the potential for malicious use of the current level of NLP technology. TWiML&AI's Sam Charrington also hosted a panel on the same controversy that is worth listening to. Personally, I think openness is critical for AI's continued progress (in terms of accessing papers, sharing data, replicating experiments, and releasing models). Going against this openness prevents good actors from developing defenses, prevents the research community from better understanding the model, and sets a precedent that will slow progress. Having a discussion about malicious use cases is useful, but we are missing crucial information if we are not allowed to evaluate and assess this potential. I hope OpenAI continues to engage with the community and that this won't be the end of the conversation.
If multiple research labs come up with a similar idea concurrently, then it's often worth taking a closer look. One such recent idea is generating text in arbitrary order. With the advent of BiLSTMs, encoders could process text both from left to right and in reverse order. More recently, self-attention models such as the Transformer do not prescribe any particular order at all, but enable looking at all relevant words at once. However, decoders are still required to generate text one token at a time from left to right. Let's look at the three recent papers on this topic in more detail:
Non-Monotonic Sequential Text Generation: Welleck et al. propose a method that generates a word at an arbitrary position and then recursively generates words to its left and right in a binary tree. The model is trained with imitation learning.

While the papers differ in how they execute the method, all three follow the same idea: allowing an arbitrary generation order via insertions based on some form of binary tree. These ideas are also similar to non-autoregressive NMT, an exciting approach from last year that proposed to generate all output words of a sentence in parallel. A related idea is to generate a sentence and then edit it iteratively (Guu et al., TACL 2018; Wang et al., ACL 2018). We'll likely see more approaches experimenting not only with how the input should be processed, but also with how the output should be produced by a model. Some of these might be closer to the way humans write text, for instance by starting with a main message or a sketch and then iteratively expanding on it, and might enable novel interactive applications.
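To make the generation order concrete, here is a toy sketch (my own illustration, not code from any of the papers): we "generate" a fixed sentence by always emitting the middle word of the remaining span first and then recursing into the left and right halves, i.e. one possible binary-tree order.

```python
# Toy illustration of non-monotonic (binary-tree) generation order.
# A real model would *predict* each word and where to insert it; here we
# simply replay a fixed sentence with a "middle word first" policy to show
# the order in which tokens would be emitted.

def generate_in_tree_order(words, emitted=None):
    """Emit the middle word of the span, then recurse into the left/right halves."""
    if emitted is None:
        emitted = []
    if not words:
        return emitted
    mid = len(words) // 2
    emitted.append(words[mid])                        # root of this subtree
    generate_in_tree_order(words[:mid], emitted)      # left subtree
    generate_in_tree_order(words[mid + 1:], emitted)  # right subtree
    return emitted

sentence = "the cat sat on the mat".split()
print(generate_in_tree_order(sentence))
# ['on', 'cat', 'the', 'sat', 'mat', 'the'] -- the emission order;
# the final sentence is recovered by an in-order traversal of the tree.
```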
Lingvo: A TensorFlow framework for sequence modelling ("lingvo" means "language" in Esperanto), which started out with a focus on NLP; it also supports distillation, GANs, and multi-task models. Many recent state-of-the-art NLP and speech papers have been implemented in Lingvo.
scispacy: A Python package containing spaCy models for processing biomedical, scientific, or clinical text.
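For a quick feel of how it is used, a typical workflow looks roughly like the snippet below (a sketch assuming you have installed scispacy together with its small en_core_sci_sm model; model names may differ between releases).

```python
import spacy

# Assumes: pip install scispacy, plus the small scientific model
# (en_core_sci_sm) from the scispacy releases.
nlp = spacy.load("en_core_sci_sm")

doc = nlp("Spinal and bulbar muscular atrophy (SBMA) is an "
          "inherited motor neuron disease.")

# Biomedical entity mentions detected by the model
print([ent.text for ent in doc.ents])

# The usual spaCy annotations (tokens, POS tags, dependencies) are also available
print([(token.text, token.pos_) for token in doc])
```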
mindsdb: A Python framework that strives for simplicity, certainty, and explainability in training neural networks. It enables users to train models and provides them with information to understand when they can trust the models' predictions.
LIGHT: A large-scale fantasy text adventure game research platform for training agents that can both talk and act, interacting either with other models or with humans.
Giant Language model Test Room: A tool to inspect the visual footprint of a language model on input text to detect whether a text could be real or fake.
10 breakthrough technologies: MIT Technology Review features 10 breakthrough technologies in 2019, as selected by Bill Gates. One of them is "smooth-talking AI assistants": AI assistants that can perform conversation-based tasks like booking a restaurant reservation or coordinating a package drop-off. The post estimates that they will be available in 1-2 years, which seems reasonable for narrow domains.
How to Choose Your First AI Project: Andrew Ng gives valuable tips on how a company should choose its first AI project, such as choosing an initial project that can be done quickly and has a high chance of success, in order to get the flywheel turning as soon as possible.
Character Level NLP: An in-depth blog post about the advantages and drawbacks of working at the character level in NLP.
Your Next Game Night Partner? A Computer: This article describes AI2's recent agent, which plays a Pictionary-style game collaboratively with a human partner. Unlike automated players in board games like chess or Go, AI2's player communicates using pictures, phrases, and concepts. You can play the game yourself here.
Learning through Auxiliary Tasks: In this post, Vivien gives an overview of learning with auxiliary tasks and introduces a simple gradient-based approach that outperforms comparable multi-task approaches and mitigates negative transfer.
Exploring BERT's Vocabulary: Judit Ács analyzes BERT's multilingual word piece vocabulary, in particular the consequences of using a shared word piece vocabulary across many languages.
Yann LeCun Cake Analogy 2.0: An update on Yann LeCun's infamous cake analogy from NIPS 2016. In the new image, unsupervised learning is replaced by self-supervised learning. Language modelling, for instance, can be considered an example of self-supervised learning.
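To see why language modelling counts as self-supervised, note that the training targets are carved out of raw, unlabelled text rather than produced by human annotators. A minimal illustration:

```python
# A language modelling objective is self-supervised: the (input, target)
# pairs come from the unlabelled text itself, with no human labels.
text = "language modelling is a form of self supervised learning".split()

# Each prefix of the text is an input; the following word is its target.
pairs = [(text[:i], text[i]) for i in range(1, len(text))]

for context, target in pairs:
    print(f"input: {' '.join(context):<50} target: {target}")
```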
Conversational AI - but where is the I? Nikolai Rozanov argues that simply exhaustively enumerating all possibilities in narrow domains is not enough and that we need to solve the hard problems of conversational understanding.
Influence diagram of Deep Learning methods used in video games
Deep Learning for Video Game Playing (arXiv 2019): In this extensive review, Justesen et al. describe how recent advances in Deep Learning have been applied to video games. Definitely worth a read if you're interested in reinforcement learning or video games.
- certain examples are forgotten with high frequency, while others are not forgotten at all;
- a dataset's (un)forgettable examples generalize across architectures;
- a significant fraction of examples can be omitted from the training data based on forgetting dynamics.
These results may help us gain a better understanding of which examples are useful for neural networks, and may help mitigate catastrophic forgetting.
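As a rough illustration of what counts as a "forgetting event" in this setting (my own sketch, not the authors' code): record whether each training example is classified correctly after every epoch and count the transitions from correct to incorrect.

```python
import numpy as np

def count_forgetting_events(correct_per_epoch):
    """correct_per_epoch: bool array of shape (num_epochs, num_examples);
    entry [t, i] says whether example i was classified correctly after
    epoch t. A forgetting event is a correct -> incorrect transition."""
    prev = correct_per_epoch[:-1]
    curr = correct_per_epoch[1:]
    return (prev & ~curr).sum(axis=0)

# Toy history: 4 epochs, 3 examples
history = np.array([
    [True,  False, True],
    [False, False, True],   # example 0 is forgotten here
    [True,  True,  True],
    [False, True,  True],   # ... and forgotten again here
])
print(count_forgetting_events(history))  # [2 0 0]: example 2 is never forgotten
```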