NLP News - SPINN, ∂4, Nested LSTMs, Capsule Networks, Minigo, Matrix Calculus, Past Kaggle Comps, Private Image Analysis, CNN in Google Sheets, AI & Games, IMPALA
Highlights in this edition include: lots of implementations of state-of-the-art models such as SPINN, ∂4, Nested LSTMs, Capsule Networks, and Minigo; useful resources for learning matrix calculus or NLP and for searching past Kaggle competitions; tutorials that will teach you how to build a domain-specific assistant for Google Home, perform object recognition on encrypted data, or train a CNN in Google Sheets; articles about RL, such as its different applications (trading, games, robotics) and its bias-variance trade-off; and, as always, exciting research papers.
Presentations and slides
The State of Natural Language Understanding
Slides from Siva Reddy on the state of Natural Language Understanding (NLU). He reviews the history and summarizes current trends, which focus on Question Answering.
Imitation Learning for Structured Prediction in NLP — sheffieldnlp.github.io
The slides from the Imitation Learning tutorial at EACL 2017 by Andreas Vlachos, Gerasimos Lampouras, and Sebastian Riedel. The tutorial presents the various imitation algorithms for structured prediction and shows how they can be applied to different NLP applications.
Implementations and tools
SPINN with TensorFlow eager execution — github.com
An implementation, in TensorFlow with eager execution, of the Stack-Augmented Parser-Interpreter Neural Network (SPINN), a recursive neural network that uses syntactic parse information for NLU.
Differentiable Forth Interpreter — github.com
An implementation of ∂4 from Programming with a Differentiable Forth Interpreter (ICML 2017).
An implementation of Nested LSTMs, a novel RNN architecture with multiple levels of memory.
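To make the "multiple levels of memory" idea concrete, here is a rough NumPy sketch of a nested LSTM cell: the outer cell's usual additive memory update (c = f*c + i*g) is replaced by an inner LSTM that takes the gated candidate i*g as its input and the gated old memory f*c as its hidden state. The class names, sizes, and tiny random initialization are illustrative choices, not the released implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """A plain LSTM cell with randomly initialized weights (for illustration)."""
    def __init__(self, input_size, hidden_size, rng):
        self.W = rng.standard_normal((4 * hidden_size, input_size + hidden_size)) * 0.1
        self.b = np.zeros(4 * hidden_size)

    def gates(self, x, h):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        return sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)

    def step(self, x, h, c):
        i, f, o, g = self.gates(x, h)
        c = f * c + i * g            # standard additive memory update
        h = o * np.tanh(c)
        return h, c

class NestedLSTMCell(LSTMCell):
    """Outer cell whose memory update is itself computed by an inner LSTM."""
    def __init__(self, input_size, hidden_size, rng):
        super().__init__(input_size, hidden_size, rng)
        self.inner = LSTMCell(hidden_size, hidden_size, rng)

    def step(self, x, h, c, inner_c):
        i, f, o, g = self.gates(x, h)
        # Instead of c = f*c + i*g, feed the two terms to the inner cell:
        # the gated candidate i*g is the inner input, the gated old memory
        # f*c is the inner hidden state. The inner cell keeps its own memory.
        c, inner_c = self.inner.step(i * g, f * c, inner_c)
        h = o * np.tanh(c)
        return h, c, inner_c

rng = np.random.default_rng(0)
cell = NestedLSTMCell(input_size=3, hidden_size=4, rng=rng)
h = c = inner_c = np.zeros(4)
for x in rng.standard_normal((6, 3)):   # a toy sequence of 6 steps
    h, c, inner_c = cell.step(x, h, c, inner_c)
```

Note that nesting composes: the inner cell could itself be a NestedLSTMCell, giving arbitrarily many levels of memory.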
An implementation of Capsule Networks from Dynamic Routing between Capsules (NIPS 2017).
An open-source implementation of the AlphaGoZero algorithm — github.com
Minigo, a pure-Python implementation of a neural network-based Go AI in TensorFlow. The implementation is inspired by AlphaGo, but the project is not affiliated with DeepMind or AlphaGo.
mltest: Automatically test neural network models in one function call — medium.com
mltest, a library for automated ML testing that aims to simplify unit testing for ML with a single function call.
Resources
The Matrix Calculus You Need For Deep Learning — parrt.cs.usfca.edu
An explanation of all the matrix calculus you need in order to understand the training of deep neural networks. Assumes no math knowledge beyond what you learned in calculus 1. By Terence Parr and Jeremy Howard.
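As a quick taste of the kind of identities the article covers, the sketch below numerically verifies two of them: the Jacobian of Wx with respect to x is W, and an elementwise nonlinearity contributes a diagonal factor via the chain rule. The shapes and the finite-difference checker are arbitrary illustrative choices.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x."""
    y = f(x)
    J = np.zeros((y.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

# Identity 1: d(Wx)/dx = W
assert np.allclose(numerical_jacobian(lambda v: W @ v, x), W, atol=1e-4)

# Identity 2 (chain rule): d(relu(Wx))/dx = diag(1[Wx > 0]) @ W
relu = lambda v: np.maximum(v, 0.0)
J = numerical_jacobian(lambda v: relu(W @ v), x)
assert np.allclose(J, np.diag((W @ x > 0).astype(float)) @ W, atol=1e-4)
```

A finite-difference check like this is also a handy sanity test when deriving gradients by hand.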
Kaggle is a treasure trove of ingenious ways to tackle difficult data science problems in various domains. However, finding the most suitable competition and method can be challenging. This website provides a sortable and searchable compilation of solutions to past Kaggle competitions.
Stanford DAWN Deep Learning Benchmark
DAWNBench is a benchmark suite for end-to-end deep learning training and inference, which provides a reference set of common deep learning workloads for quantifying metrics such as training time and cost across different optimizers, models, hardware, and platforms. Somewhat counter-intuitively, the most expensive GPU (V100) is the cheapest for training because of reduced training time.
12 of the best free Natural Language Processing and Machine Learning educational resources — blog.aylien.com
A list of some of the best free NLP and ML resources for learning and building expertise.
Tutorials
Practical Deep Learning for Coders 2018 — www.fast.ai
Launch of fast.ai's Practical Deep Learning for Coders 2018, the successor to its 2017 version. The new course uses PyTorch and provides 15 hours of lessons, with about 80% new material.
Using machine learning to build a conversational radiology assistant for Google Home — towardsdatascience.com
A tutorial on how to build a conversational radiology assistant for Google Home, which can assist healthcare providers with their radiology needs in a quick, conversational, hands-free way. Nicely showcases how AI can be used to great effect in specialized domains.
Private Image Analysis with Multi-Party Computation
A comprehensive tutorial on how to leverage multi-party computation (MPC) to train an image analysis model and perform transfer learning on encrypted data.
Building a Deep Neural Net In Google Sheets — towardsdatascience.com
Deep Convolutional Neural Networks can be intimidating. This article demonstrates that they need not be, and that they can even be implemented in something as accessible as Google Sheets.
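The spreadsheet construction works because each output cell of a convolution is nothing more than a sum of elementwise products, exactly the kind of formula a single spreadsheet cell can hold. A minimal pure-Python sketch of that computation (the image and kernel below are made-up examples):

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as in most CNNs):
    each output cell is a sum of elementwise products over a sliding window."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            )
    return out

edge = [[1, 0, -1]] * 3   # a simple vertical-edge detector
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
print(conv2d(img, edge))
```

Each inner sum maps directly onto a SUMPRODUCT-style spreadsheet formula over a 3x3 range.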
Limits of Deep Learning
The Shallowness of Google Translate
Douglas Hofstadter (yes, the Douglas Hofstadter of Gödel, Escher, Bach fame) probes Google Translate to show that it's a long way from real language understanding. An interesting article that is passionate about language, but also quite pessimistic with regard to the usefulness of the current generation of ML and NLP models.
Greedy, Brittle, Opaque, and Shallow: The Downsides of Deep Learning — www.wired.com
A Wired article that takes a harsh look at the deficits of modern AI and seeks to demonstrate that its limits are closer than we think.
Reinforcement Learning
Introduction to Learning to Trade with Reinforcement Learning — www.wildml.com
Trading (particularly of cryptocurrencies) is quite popular at the moment. The academic Deep Learning research community, however, has largely stayed away from the financial markets. In this post, Denny Britz gives a brief intro to trading and argues why trading is an interesting research domain for reinforcement learning.
Artificial Intelligence and Games
A comprehensive book on AI in games by Georgios N. Yannakakis and Julian Togelius that touches on everything from using AI to play games, to generating content, modeling players, and the frontiers ahead. The book can also be bought on Amazon.
IMPALA: Scalable Distributed DeepRL in DMLab-30 — deepmind.com
A blog post that introduces IMPALA (Importance-Weighted Actor-Learner Architecture), a new and efficient distributed architecture capable of solving many tasks at the same time, as well as DMLab-30, a new set of visually-unified environments designed to test IMPALA and other architectures.
Learning Robot Objectives from Physical Human Interaction — bair.berkeley.edu
An interesting blog post on learning from physical human interaction that argues that robots should treat physical human interaction not as disturbances, but use it to gain information about how they should be doing a task.
Making Sense of the Bias / Variance Trade-off in Reinforcement Learning — medium.com
A detailed blog post that discusses the bias / variance trade-off for (deep) Reinforcement Learning.
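The core trade-off can be seen in a few lines of simulation: a Monte Carlo return sums many noisy rewards (unbiased but high variance), while a 1-step TD target bootstraps from a learned value estimate (low variance, but biased whenever that estimate is wrong). The reward distribution and the imperfect value estimate below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 1.0
horizon = 5          # episode length
true_value = 5.0     # expected sum of rewards; each reward ~ N(1, 1)
v_estimate = 3.5     # an imperfect learned value for the next state (true: 4.0)

n = 20000
rewards = rng.normal(1.0, 1.0, size=(n, horizon))

# Monte Carlo return: sum of all noisy rewards -> unbiased, high variance
mc_targets = rewards.sum(axis=1)
# 1-step TD target: one noisy reward + bootstrap -> low variance, biased
td_targets = rewards[:, 0] + gamma * v_estimate

print("MC  bias %.3f  var %.3f" % (mc_targets.mean() - true_value, mc_targets.var()))
print("TD  bias %.3f  var %.3f" % (td_targets.mean() - true_value, td_targets.var()))
```

n-step returns and methods like GAE interpolate between these two extremes.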
Conferences
David Abel, who already wrote very detailed notes on NIPS 2017, has now done the same for AAAI 2018, which you can find here.
ICLR 2018 accepted papers analysis
An analysis of the papers that have been accepted to ICLR 2018 that provides a break-down across many dimensions, such as institutions and authors.
More blog posts and articles
Requests for Research 2.0 — blog.openai.com
A batch of seven new unsolved problems, which can serve as a fun and meaningful way for new people to enter the field, as well as for practitioners to hone their skills. The problems range from implementing Snake as a Gym environment to learning to transfer between different games.
Discovering Types for Entity Disambiguation — blog.openai.com
An insightful blog post by OpenAI about a new system to automatically disambiguate entities. In contrast to existing approaches, rather than directly linking a surface form to one of 10,000s of entities, the system decides to which of around 100 categories the word belongs; each combination of categories is associated with one entity.
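As a toy illustration of the idea (not OpenAI's actual system), type-based disambiguation amounts to first predicting a coarse category from context and then restricting the candidate entities to that category. Every name, candidate, and rule below is invented:

```python
# Hypothetical candidate table: surface form -> possible entities with types.
CANDIDATES = {
    "jaguar": [
        {"name": "Jaguar (animal)", "type": "animal", "prior": 0.3},
        {"name": "Jaguar Cars",     "type": "organization", "prior": 0.7},
    ]
}

def predict_type(context_words):
    """Stand-in for a learned classifier over ~100 coarse categories."""
    if {"habitat", "prey", "jungle"} & set(context_words):
        return "animal"
    return "organization"

def disambiguate(mention, context_words):
    t = predict_type(context_words)
    typed = [e for e in CANDIDATES[mention] if e["type"] == t]
    # Fall back to all candidates if no candidate matches the predicted type.
    return max(typed or CANDIDATES[mention], key=lambda e: e["prior"])["name"]

print(disambiguate("jaguar", ["the", "jaguar", "hunts", "prey", "in", "the", "jungle"]))
```

The payoff is that the classifier only has to choose among ~100 types instead of scoring tens of thousands of entities directly.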
Three Weeks with a Chatbot and I’ve Made a New Friend — www.technologyreview.com
A short article about forging a friendship with a chatbot.
How many Mechanical Turk workers are there? — www.behind-the-enemy-lines.com
A blog post that seeks to quantify how many workers there are on Mechanical Turk. TL;DR: About 100K-200K unique workers, with 2K-5K workers active at any given time.
Natural and Artificial Intelligence — inverseprobability.com
Neil Lawrence argues that while recent achievements might give the sense that we have made a breakthrough in understanding human intelligence, all we've achieved for the moment is a breakthrough in emulating intelligence.
Industry insights
Factmata closes $1M seed round as it seeks to build an 'anti fake news' media platform — techcrunch.com
Factmata is trying to build a platform that uses AI to help fix the fake news problem across the whole of the media industry, from the spread of biased, incorrect, or just low-quality clickbait on various aggregation platforms to the use of ad networks to help disseminate that content.
Paper picks
Personalizing Dialogue Agents: I have a dog, do you have pets too? (arXiv)
Current chit-chat models for dialogue have many limitations, such as lacking specificity and not being able to display a consistent personality. Zhang et al. seek to alleviate this by conditioning a memory-augmented neural network on a multi-sentence textual description of the person (a 'profile'). They also release a new chit-chat dataset consisting of 164,356 utterances.
Ask the Right Questions: Active Question Reformulation with Reinforcement Learning (ICLR 2018)
Buck et al. propose an approach to reformulate questions that are posed to a QA model to elicit the best response. They frame the problem as a reinforcement learning problem, treating the QA model as the black-box environment.
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Liu et al. introduce a new multi-document summarization task: generating Wikipedia articles from their source documents. They first use extractive summarization techniques to identify salient information in each source article and then employ a more scalable decoder to generate the article.