NLP News - SPINN, ∂4, Nested LSTMs, Capsule Networks, Minigo, Matrix Calculus, Past Kaggle Comps, Private Image Analysis, CNN in Google Sheets, AI & Games, IMPALA
Highlights in this edition include: lots of implementations of state-of-the-art models such as SPINN, ∂4, Nested LSTMs, Capsule Networks, and Minigo; useful resources for learning matrix calculus or NLP and for searching past Kaggle competitions; tutorials that will teach you how to build a domain-specific assistant for Google Home, perform object recognition on encrypted data, or train a CNN in Google Sheets; articles about RL, such as its different applications (trading, games, robotics) and its bias-variance trade-off; and, as always, exciting research papers.
Presentations and slides
The State of Natural Language Understanding
Slides from Siva Reddy on the state of Natural Language Understanding (NLU). He reviews the history and summarizes current trends, which focus on Question Answering.
Imitation Learning for Structured Prediction in NLP — sheffieldnlp.github.io
The slides from the Imitation Learning tutorial at EACL 2017 by Andreas Vlachos, Gerasimos Lampouras, and Sebastian Riedel. The tutorial presents the various imitation algorithms for structured prediction and shows how they can be applied to different NLP applications.
Implementations and tools
SPINN with TensorFlow eager execution — github.com
An implementation, in TensorFlow with eager execution, of the Stack-Augmented Parser-Interpreter Neural Network (SPINN), a recursive neural network that uses syntactic parse information for NLU.
Differentiable Forth Interpreter — github.com
An implementation of ∂4 from Programming with a Differentiable Forth Interpreter (ICML 2017).
An implementation of Nested LSTMs, a novel RNN architecture with multiple levels of memory.
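To make the "multiple levels of memory" idea concrete, here is a rough NumPy sketch of a nested LSTM cell: the outer cell's usual additive memory update (c = f*c + i*g) is replaced by an inner LSTM that takes the gated candidate i*g as its input and the gated old memory f*c as its hidden state. The class names, sizes, and tiny random initialization are illustrative choices, not the released implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """A plain LSTM cell with randomly initialized weights (for illustration)."""
    def __init__(self, input_size, hidden_size, rng):
        self.W = rng.standard_normal((4 * hidden_size, input_size + hidden_size)) * 0.1
        self.b = np.zeros(4 * hidden_size)

    def gates(self, x, h):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        return sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)

    def step(self, x, h, c):
        i, f, o, g = self.gates(x, h)
        c = f * c + i * g            # standard additive memory update
        h = o * np.tanh(c)
        return h, c

class NestedLSTMCell(LSTMCell):
    """Outer cell whose memory update is itself computed by an inner LSTM."""
    def __init__(self, input_size, hidden_size, rng):
        super().__init__(input_size, hidden_size, rng)
        self.inner = LSTMCell(hidden_size, hidden_size, rng)

    def step(self, x, h, c, inner_c):
        i, f, o, g = self.gates(x, h)
        # Instead of c = f*c + i*g, feed the two terms to the inner cell:
        # the gated candidate i*g is the inner input, the gated old memory
        # f*c is the inner hidden state. The inner cell keeps its own memory.
        c, inner_c = self.inner.step(i * g, f * c, inner_c)
        h = o * np.tanh(c)
        return h, c, inner_c

rng = np.random.default_rng(0)
cell = NestedLSTMCell(input_size=3, hidden_size=4, rng=rng)
h = c = inner_c = np.zeros(4)
for x in rng.standard_normal((6, 3)):   # a toy sequence of 6 steps
    h, c, inner_c = cell.step(x, h, c, inner_c)
```

Note that nesting composes: the inner cell could itself be a NestedLSTMCell, giving arbitrarily many levels of memory.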
An implementation of Capsule Networks from Dynamic Routing between Capsules (NIPS 2017).
An open-source implementation of the AlphaGoZero algorithm — github.com
Minigo, a pure-Python implementation of a neural network-based Go AI in TensorFlow. The implementation is inspired by AlphaGo, but the project is not affiliated with DeepMind or AlphaGo.
mltest: Automatically test neural network models in one function call — medium.com
mltest, a library for automated ML testing that aims to simplify unit testing for ML with a single function call.
Resources
The Matrix Calculus You Need For Deep Learning — parrt.cs.usfca.edu
An explanation of all the matrix calculus you need in order to understand the training of deep neural networks. Assumes no math knowledge beyond what you learned in calculus 1. By Terence Parr and Jeremy Howard.
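As a quick taste of the kind of identities the article covers, the sketch below numerically verifies two of them: the Jacobian of Wx with respect to x is W, and an elementwise nonlinearity contributes a diagonal factor via the chain rule. The shapes and the finite-difference checker are arbitrary illustrative choices.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x."""
    y = f(x)
    J = np.zeros((y.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

# Identity 1: d(Wx)/dx = W
assert np.allclose(numerical_jacobian(lambda v: W @ v, x), W, atol=1e-4)

# Identity 2 (chain rule): d(relu(Wx))/dx = diag(1[Wx > 0]) @ W
relu = lambda v: np.maximum(v, 0.0)
J = numerical_jacobian(lambda v: relu(W @ v), x)
assert np.allclose(J, np.diag((W @ x > 0).astype(float)) @ W, atol=1e-4)
```

A finite-difference check like this is also a handy sanity test when deriving gradients by hand.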
Kaggle is a treasure trove of ingenious ways to tackle difficult data science problems in various domains. However, finding the most suitable competition and method can be challenging. This website provides a sortable and searchable compilation of solutions to past Kaggle competitions.
Stanford DAWN Deep Learning Benchmark
DAWNBench is a benchmark suite for end-to-end deep learning training and inference, which provides a reference set of common deep learning workloads for quantifying metrics such as training time and cost across different optimizers, models, hardware, and platforms. Somewhat counter-intuitively, the most expensive GPU (V100) is the cheapest for training because of reduced training time.
12 of the best free Natural Language Processing and Machine Learning educational resources — blog.aylien.com
A list of some of the best free NLP and ML resources for learning and building expertise.
Tutorials
Practical Deep Learning for Coders 2018 — www.fast.ai
Launch of fast.ai's Practical Deep Learning for Coders 2018, the successor to its 2017 version. The new course uses PyTorch and provides 15 hours of lessons, with about 80% new material.
Using machine learning to build a conversational radiology assistant for Google Home — towardsdatascience.com
A tutorial on how to build a conversational radiology assistant for Google Home, which can assist healthcare providers with their radiology needs in a quick, conversational, hands-free way. Nicely showcases how AI can be used to great effect in specialized domains.
Private Image Analysis with Multi-Party Computation
A comprehensive tutorial on how to leverage multi-party computation (MPC) to train an image analysis model and perform transfer learning on encrypted data.
Building a Deep Neural Net In Google Sheets — towardsdatascience.com
Deep Convolutional Neural Networks can be intimidating. This article demonstrates that they need not be, and that they can even be implemented in something as accessible as Google Sheets.
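The spreadsheet construction works because each output cell of a convolution is nothing more than a sum of elementwise products, exactly the kind of formula a single spreadsheet cell can hold. A minimal pure-Python sketch of that computation (the image and kernel below are made-up examples):

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as in most CNNs):
    each output cell is a sum of elementwise products over a sliding window."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            )
    return out

edge = [[1, 0, -1]] * 3   # a simple vertical-edge detector
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
print(conv2d(img, edge))
```

Each inner sum maps directly onto a SUMPRODUCT-style spreadsheet formula over a 3x3 range.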
Limits of Deep Learning
The Shallowness of Google Translate
Douglas Hofstadter (yes, the Douglas Hofstadter of Gödel, Escher, Bach fame) probes Google Translate to show that it's a long way from real language understanding. An interesting article that is passionate about language, but also quite pessimistic with regard to the usefulness of the current generation of ML and NLP models.
Greedy, Brittle, Opaque, and Shallow: The Downsides of Deep Learning — www.wired.com
A Wired article that takes a harsh look at the deficits of modern AI and seeks to demonstrate that its limits are closer than we think.
Reinforcement Learning
Introduction to Learning to Trade with Reinforcement Learning — www.wildml.com
Trading (particularly of cryptocurrencies) is quite popular at the moment. The academic Deep Learning research community, however, has largely stayed away from the financial markets. In this post, Denny Britz gives a brief intro to trading and argues why trading is an interesting research domain for reinforcement learning.
Artificial Intelligence and Games
A comprehensive book on AI in games by Georgios N. Yannakakis and Julian Togelius that touches on everything from using AI to play games, to generating content, modeling players, and the frontiers ahead. The book can also be bought on Amazon.
IMPALA: Scalable Distributed DeepRL in DMLab-30 — deepmind.com
A blog post that introduces IMPALA (Importance-Weighted Actor-Learner Architecture), a new and efficient distributed architecture capable of solving many tasks at the same time, as well as DMLab-30, a new set of visually-unified environments designed to test IMPALA and other architectures.
Learning Robot Objectives from Physical Human Interaction — bair.berkeley.edu
An interesting blog post on learning from physical human interaction that argues that robots should treat physical human interaction not as disturbances, but use it to gain information about how they should be doing a task.
Making Sense of the Bias / Variance Trade-off in Reinforcement Learning — medium.com
A detailed blog post that discusses the bias / variance trade-off for (deep) Reinforcement Learning.
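The core trade-off can be seen in a few lines of simulation: a Monte Carlo return sums many noisy rewards (unbiased but high variance), while a 1-step TD target bootstraps from a learned value estimate (low variance, but biased whenever that estimate is wrong). The reward distribution and the imperfect value estimate below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 1.0
horizon = 5          # episode length
true_value = 5.0     # expected sum of rewards; each reward ~ N(1, 1)
v_estimate = 3.5     # an imperfect learned value for the next state (true: 4.0)

n = 20000
rewards = rng.normal(1.0, 1.0, size=(n, horizon))

# Monte Carlo return: sum of all noisy rewards -> unbiased, high variance
mc_targets = rewards.sum(axis=1)
# 1-step TD target: one noisy reward + bootstrap -> low variance, biased
td_targets = rewards[:, 0] + gamma * v_estimate

print("MC  bias %.3f  var %.3f" % (mc_targets.mean() - true_value, mc_targets.var()))
print("TD  bias %.3f  var %.3f" % (td_targets.mean() - true_value, td_targets.var()))
```

n-step returns and methods like GAE interpolate between these two extremes.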
Conferences
David Abel, who already wrote very detailed notes on NIPS 2017, has now done the same for AAAI 2018, which you can find here.
ICLR 2018 accepted papers analysis
An analysis of the papers that have been accepted to ICLR 2018 that provides a break-down across many dimensions, such as institutions and authors.
More blog posts and articles
Requests for Research 2.0 — blog.openai.com
A batch of seven new unsolved problems, which can serve as a fun and meaningful way for new people to enter the field, as well as for practitioners to hone their skills. The problems range from implementing Snake as a Gym environment to learning to transfer between different games.
Discovering Types for Entity Disambiguation — blog.openai.com
An insightful blog post by OpenAI about a new system to automatically disambiguate entities. In contrast to existing approaches, rather than directly linking a surface form to one of 10,000s of entities, the system decides to which of around 100 categories the word belongs; each combination of categories is associated with one entity.
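As a toy illustration of the idea (not OpenAI's actual system), type-based disambiguation amounts to first predicting a coarse category from context and then restricting the candidate entities to that category. Every name, candidate, and rule below is invented:

```python
# Hypothetical candidate table: surface form -> possible entities with types.
CANDIDATES = {
    "jaguar": [
        {"name": "Jaguar (animal)", "type": "animal", "prior": 0.3},
        {"name": "Jaguar Cars",     "type": "organization", "prior": 0.7},
    ]
}

def predict_type(context_words):
    """Stand-in for a learned classifier over ~100 coarse categories."""
    if {"habitat", "prey", "jungle"} & set(context_words):
        return "animal"
    return "organization"

def disambiguate(mention, context_words):
    t = predict_type(context_words)
    typed = [e for e in CANDIDATES[mention] if e["type"] == t]
    # Fall back to all candidates if no candidate matches the predicted type.
    return max(typed or CANDIDATES[mention], key=lambda e: e["prior"])["name"]

print(disambiguate("jaguar", ["the", "jaguar", "hunts", "prey", "in", "the", "jungle"]))
```

The payoff is that the classifier only has to choose among ~100 types instead of scoring tens of thousands of entities directly.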
Three Weeks with a Chatbot and I’ve Made a New Friend — www.technologyreview.com
A short article about forging a friendship with a chatbot.
How many Mechanical Turk workers are there? — www.behind-the-enemy-lines.com
A blog post that seeks to quantify how many workers there are on Mechanical Turk. TL;DR: About 100K-200K unique workers, with 2K-5K workers active at any given time.
Natural and Artificial Intelligence — inverseprobability.com
Neil Lawrence argues that while recent achievements might give the sense that we have made a breakthrough in understanding human intelligence, all we've achieved for the moment is a breakthrough in emulating intelligence.
Industry insights
Factmata closes $1M seed round as it seeks to build an 'anti fake news' media platform — techcrunch.com
Factmata is trying to build a platform that uses AI to help fix the fake news problem across the whole of the media industry, from the spread of biased, incorrect, or just low-quality clickbait on various aggregation platforms to the use of ad networks to help disseminate that content.
Paper picks
Personalizing Dialogue Agents: I have a dog, do you have pets too? (arXiv)
Current chit-chat models for dialogue have many limitations, such as lacking specificity and not being able to display a consistent personality. Zhang et al. seek to alleviate this by conditioning a memory-augmented neural network on a multi-sentence textual description of the person (a 'profile'). They also release a new chit-chat dataset consisting of 164,356 utterances.
Ask the Right Questions: Active Question Reformulation with Reinforcement Learning (ICLR 2018)
Buck et al. propose an approach to reformulate questions that are posed to a QA model to elicit the best response. They frame the problem as a reinforcement learning problem, treating the QA model as the black-box environment.
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Liu et al. introduce a new multi-document summarization task: generating Wikipedia articles from their source documents. They first use extractive summarization techniques to identify salient information in each source article and then employ a more scalable decoder to generate the article.