NLP News - Paperclip maximizer, Generative models, Debugging ML, Evolution Strategies, Interpretability, arXiv, ICLR 2018, Multi-hop QA
This edition of NLP News is full of awesome content: Play a benevolent but misguided AI? ✅ Find out how to debug and unit test ML models? ✅ Learn about Evolution Strategies and AlphaGo Zero? ✅ Read case studies on how to apply NLP? ✅ Understand how to make models more interpretable? ✅ Learn more about the role of arXiv for ML and NLP? ✅ Get up to date on ICLR 2018? ✅ Use cool new datasets on video captioning and multi-hop QA? ✅
Fun and games
If you're interested in AI, chances are that you're aware of Nick Bostrom's famous thought experiment of a paperclip maximizer, a benevolent AI whose sole purpose is to create paperclips, which (spoiler alert) ends up destroying humanity. You can now play the paperclip maximizer yourself and experience the slippery slope of valuing one objective above everything else. The game is simple, addictive, and has even gone viral.
Note: If you're playing the game, there is a known bug that can prevent you from releasing the hypnodrones (don't ask) because you don't have enough in-game memory. To work around it, open your browser's JS console and type: "memory = memory + 10".
Shelley, world's first collaborative AI horror writer — www.shelley.ai
Just in time for Halloween, you can now co-author horror stories together with Shelley, a language model trained on the eerie stories on r/nosleep. Just respond to the stories she starts every hour on her Twitter account and create the first AI-Human horror anthology ever.
Presentations and slides
Jeff Dean’s Lecture for YC AI — blog.ycombinator.com
Jeff Dean's lecture in front of Y Combinator's AI group and the subsequent Q&A session touch on many themes that are important to Google and to the ML community more broadly, such as learning to learn and multi-task learning.
Deep Generative Models tutorial at UAI 2017
If you're interested in generation or just want to catch up on the latest advances in deep generative models, these slides by Shakir Mohamed and Danilo Rezende are a must-read.
Deep Learning Book Club videos
The Deep Learning Book is a great resource for everyone wanting to learn more about Deep Learning. Going through the book by yourself can be a bit daunting, however. Follow along with these videos of the Deep Learning Book Club and talks of leading researchers for each chapter to help you better understand the contents.
Debugging Machine Learning — www.slideshare.net
Michał Łopuszyński gives valuable hints for debugging machine learning models in his presentation at PyDataWarsaw 2017: 1) check your code; 2) check your data; 3) examine your features; 4) examine your data points; 5) examine your model; 6) watch out for overfitting; 7) watch out for data leakage; 8) watch out for covariate shift; 9) remember monitoring & maintenance.
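Of the hints above, data leakage is one that lends itself to a quick automated check. The following is a minimal sketch, not taken from the presentation: it only catches the narrow case where a test input appears verbatim in the training set, and the function name and toy data are hypothetical.

```python
def find_leaked_inputs(train_rows, test_rows):
    """Return test rows whose input text also occurs in the training set.

    Each row is an (input_text, label) pair. Only exact duplicates are
    detected; near-duplicates and leaky features need other checks.
    """
    seen_inputs = {text for text, _label in train_rows}
    return [row for row in test_rows if row[0] in seen_inputs]


# Hypothetical toy data: one test input is duplicated from the training set.
train = [("the cat sat", 1), ("dogs bark", 0), ("birds sing", 1)]
test = [("fish swim", 0), ("dogs bark", 0)]

leaked = find_leaked_inputs(train, test)
print(leaked)
```

If the returned list is non-empty, the reported test score is likely to be optimistic, since the model has already seen those inputs during training.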
Cool tutorials
How to unit test machine learning code — medium.com
Every software engineer knows unit tests are important. But there are no clear best practices and no solid tutorial for how to unit test ML. Google Brain intern Chase Roberts describes unit tests for common scenarios in this blog post.
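To give a flavour of what such tests can look like, here is a minimal sketch that is not taken from the blog post. It assumes a hypothetical toy linear model written in plain Python and checks two properties that are commonly suggested for training code: every trainable parameter changes after one training step, and the loss on the training batch goes down.

```python
def predict(weights, bias, x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mse(weights, bias, data):
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in data) / len(data)


def train_step(weights, bias, data, lr=0.1):
    """One step of gradient descent on the mean squared error."""
    n = len(data)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in data:
        err = predict(weights, bias, x) - y
        for i, xi in enumerate(x):
            grad_w[i] += 2 * err * xi / n
        grad_b += 2 * err / n
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b


# Hypothetical toy batch of (features, target) pairs.
DATA = [([1.0, 2.0], 3.0), ([2.0, 1.0], 0.0)]


def test_every_parameter_changes():
    w0, b0 = [0.0, 0.0], 0.0
    w1, b1 = train_step(w0, b0, DATA)
    # A parameter that never moves is often a sign of a disconnected graph.
    assert all(before != after for before, after in zip(w0, w1))
    assert b0 != b1


def test_loss_decreases():
    w0, b0 = [0.0, 0.0], 0.0
    w1, b1 = train_step(w0, b0, DATA)
    assert mse(w1, b1, DATA) < mse(w0, b0, DATA)


test_every_parameter_changes()
test_loss_decreases()
```

The functions follow the `test_` naming convention, so a runner such as pytest would pick them up as well; the direct calls at the end are only there to keep the sketch self-contained.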
TensorBoard tutorial — jhui.github.io
TensorBoard is a great way to visualize your parameters, metrics, or any other data-dependent values. This tutorial gives a clear overview of TensorBoard's most important functions.
A Visual Guide to Evolution Strategies — blog.otoro.net
David Ha provides a guide to evolution strategies (ES), a technique that has been shown to be useful for many problems where reinforcement learning (RL) is commonly applied. In contrast to RL, ES uses black-box optimisation and can thus ignore gradient information, which allows it to be evaluated more efficiently.
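To make the black-box idea concrete, here is a minimal sketch of one simple ES variant. It perturbs the parameters with Gaussian noise and moves them along the reward-weighted average of that noise. It is an illustration under stated assumptions, not the code from the guide: the function names, the hyperparameters, and the toy fitness function are all hypothetical.

```python
import random


def evolution_strategy(fitness, dim, pop_size=50, sigma=0.1, lr=0.1, steps=200, seed=0):
    """Maximise `fitness` using only function evaluations (no gradients)."""
    rng = random.Random(seed)
    theta = [0.0] * dim  # current parameter estimate
    for _ in range(steps):
        # Sample one Gaussian perturbation per population member.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
        # Black-box evaluation of each perturbed candidate.
        rewards = [fitness([t + sigma * e for t, e in zip(theta, eps)]) for eps in noise]
        baseline = sum(rewards) / pop_size  # subtracting the mean reduces variance
        for i in range(dim):
            # The reward-weighted average of the noise approximates the gradient.
            grad_i = sum((r - baseline) * eps[i] for r, eps in zip(rewards, noise))
            grad_i /= pop_size * sigma
            theta[i] += lr * grad_i
    return theta


TARGET = [0.5, -0.3, 0.8]  # hypothetical optimum of the toy problem


def toy_fitness(params):
    # Negative squared distance to TARGET, so the maximum lies at TARGET.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


solution = evolution_strategy(toy_fitness, dim=3)
print(solution)
```

Note that `toy_fitness` is only ever called, never differentiated. Each candidate in the population can also be evaluated independently of the others, which is what makes this family of methods easy to parallelise.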
Case studies
Extracting Tasks from Emails: first challenges — medium.com
Applying NLP to real-world problems bears many challenges, starting with defining the task. This article describes one such case study, the challenges in trying to extract tasks from emails.
How we Changed Unsupervised LDA to Semi-Supervised GuidedLDA — medium.freecodecamp.org
Latent Dirichlet Allocation (LDA) is a useful tool in the toolbox of every NLP practitioner that helps reveal themes in a collection of documents. This blog post describes a deficiency of classic LDA, namely that it offers no way to specify topics that are known in advance, and shows how GuidedLDA can be used to address this.
Transparency and bias
What to do about biased AI? Going beyond transparency of automated systems — www.alisonpowell.ca
Bias and transparency are becoming more important issues to consider in the design of automated systems. In this post, Alison Powell outlines some of the directions we should be taking to address the problems of ethics, algorithms and accountability.
How to make your data and models interpretable by learning from cognitive science — medium.com
To mitigate the effect of unintended consequences when deploying ML models in real-world applications, designing explainable or interpretable models is one important research direction. Catherine Olsson describes in this post how we can use insights from cognitive science such as using prototypes and criticisms to make our models interpretable.
Natural language processing will help humans and machines have more empathy — venturebeat.com
The benefit of ML and NLP is often argued from a business perspective, citing their ability to improve products and disrupt entire industries. This VentureBeat article takes a more emotional angle: It argues -- somewhat philosophically -- that NLP will not only help us better understand one another, but also give us a better understanding of ourselves.
The right words have benefits: Textio talks perks in job listings, and how ‘leave’ beats ‘vacation’ — www.geekwire.com
Using the right language in job postings has its benefits. Textio takes a look at the best and worst benefits and perks to mention if you want your role to be filled faster.
About the arXiv
ACL Policies for Submission, Review and Citation — www.aclweb.org
*ACL conferences (ACL, NAACL, EACL) and TACL update their policy to protect double-blind review without sacrificing the positive effects of preprint publishing. The main change: Preprints may no longer be posted in the period from 1 month before submission until notification.
Building Brundage Bot — hackernoon.com
A blog post about training BrundageBot, a neural network that keeps up with the latest ML papers on arXiv by predicting which arXiv papers Miles Brundage, "the Michael Jordan of tweeting arXiv preprints", tweeted.
Popularity of arXiv within CS — groups.inf.ed.ac.uk
On the topic of arXiv, this post and paper by Charles Sutton and Linan Gong provide some interesting statistics on the usage of preprints across computer science. For instance, in 2017, fully 23% of papers had e-prints on arXiv, compared to only 1% ten years ago.
ICLR 2018
If you have time this week or next weekend and you want to experience the bleeding edge of ML, browse through the 1003 submissions to the International Conference on Learning Representations (ICLR 2018). Credit for the above graph goes to Oriol Vinyals.
ICLR 2018 Reproducibility Challenge
Did the results of one of the ICLR 2018 papers seem too good to be true, or do you want to re-implement a method in your favourite framework? Why not participate in the ICLR 2018 Reproducibility Challenge hosted by Joelle Pineau? Target participants are students taking graduate-level ML courses in Fall 2017.
More articles and blog posts
2D word embedding matrices are commonly used these days, but what about taking things to the third dimension? Stitchfix's Chris Moody shows in this post how we can use 3D word embedding matrices to find clothing items with similar styles.
AI2 Key Scientific Challenges 2017 — allenai.org
The Allen Institute for Artificial Intelligence (AI2) makes available ten $10,000 awards to researchers working on key scientific challenges related to facilitating high-impact research in artificial intelligence. Application deadline is November 10.
AlphaGo Zero: Learning from scratch — deepmind.com
AlphaGo Zero, the Go-playing program's fourth (and final) iteration, learns Go without any human supervision, entirely through self-play, and is arguably the strongest Go player in history: it beat the version that defeated Lee Sedol by 100 games to 0.
Resources and implementations
SLING - A natural language frame semantics parser — github.com
An open-source implementation of SLING, a parser for annotating text with frame semantic annotations trained using TensorFlow and Google's DRAGNN framework. The paper is here.
Introducing the Natural Language Processing Library for Apache Spark — databricks.com
An NLP library for Apache Spark featuring common NLP utilities such as tokenization and normalization and downstream implementations such as named entity recognition and sentiment analysis.
Industry insights
Woebot: AI for mental health — medium.com
Andrew Ng joins the board of Woebot, a startup that seeks to build a chatbot that helps people deal with mental health issues.
Facebook apologizes after wrong translation sees Palestinian man arrested for posting 'good morning' — www.theverge.com
While machine translation is getting increasingly close to human-level performance, translations (particularly between less common language pairs) should still be taken with a grain of salt. For instance, last week, a Palestinian man was arrested in Jerusalem as his 'Good morning' message posted on Facebook was mistranslated to 'attack them' and 'hurt them'.
Taste Graph part 1: Assigning interests to Pins — medium.com
Brian Johnson, Head of Knowledge at Pinterest, describes in this post how Pinterest uses NLP to understand how a person’s interests and preferences evolve over time across different categories.
Paper picks
Generalization in Deep Learning (arXiv)
The ICLR 2017 best paper, Understanding deep learning requires rethinking generalization, reinvigorated interest in gaining a better understanding of the generalization behaviour of deep neural networks. In this tradition, this paper with Yoshua Bengio as co-author seeks to provide new theoretical explanations and new direct analyses for generalization in Deep Learning. It also proposes a new family of generalization terms that takes these new insights into account.
Generative Adversarial Networks: An Overview (IEEE Signal Processing)
Generative Adversarial Networks are all the rage these days. This paper provides a succinct overview of the most popular current GAN architectures and discusses challenges and directions.
Poincaré Embeddings for Learning Hierarchical Representations (NIPS 2017)
Data such as text often has a hierarchical structure. Existing embeddings in Euclidean space cannot easily model this property. Nickel & Kiela thus propose to learn embeddings in a hyperbolic space, in particular an n-dimensional Poincaré ball. The learned embeddings achieve state-of-the-art performance on determining lexical entailment and require far fewer dimensions than traditional embeddings.
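To see why a Poincaré ball suits hierarchies, it helps to compute the standard distance on the ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The sketch below is only an illustration of that formula, not the authors' training code, and the example points are hypothetical.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Hypothetical 2-dimensional points: one pair near the centre, one near the boundary.
origin = [0.0, 0.0]
near_centre = [0.1, 0.0]
near_edge_a = [0.9, 0.0]
near_edge_b = [0.0, 0.9]

d_centre = poincare_distance(origin, near_centre)
d_edge = poincare_distance(near_edge_a, near_edge_b)
print(d_centre, d_edge)
```

Distances grow rapidly towards the boundary: the two points near the edge are about 1.27 apart in Euclidean terms but more than 5 apart in the ball. This leaves a lot of room near the boundary for the many leaves of a tree, while general concepts can sit close to the centre.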
Dataset spotlight
Constructing Datasets for Multi-hop Reading Comprehension Across Documents (arXiv)
Question answering (QA) has seen many improvements in recent years, particularly fuelled by new datasets such as SQuAD. Existing datasets, however, focus on single-hop reading comprehension, i.e. extracting an answer to a question from a given paragraph. Welbl et al. introduce two new datasets for multi-hop reading comprehension, which is much closer to the real-world task of open-domain question answering. The datasets require models to first identify relevant documents among a number of candidate documents and then determine the correct answer.
The Spoken Wikipedia Corpora — nats.gitlab.io
The SWC is a corpus of aligned Spoken Wikipedia articles from the English, German, and Dutch Wikipedia. It contains hundreds of hours of aligned audio from a diverse set of readers about a diverse set of topics and is thus a great resource for research in cross-lingual speech recognition.
AVA: A Finely Labeled Video Dataset for Human Action Understanding — research.googleblog.com
Are you interested in multimodal applications, but bored of image captioning? Then try video captioning with Atomic Visual Actions (AVA). AVA is a new dataset that provides multiple action labels for each person in extended video sequences. It consists of URLs for publicly available videos from YouTube, annotated with a set of 80 atomic actions (e.g. "walk", "kick (an object)", "shake hands") with a total of 210k action labels.
PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts (arXiv)
Sentence classification is useful for extracting information in many domains, but labeled data is often not available. PubMed 200k RCT is a new dataset based on PubMed consisting of 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. Each sentence of each abstract is labeled with its role in the abstract, e.g. background, objective, method, etc.