NLP News - 2017 Year in Review, 2018 Prognoses, Semi-supervised learning, CTC networks, random forests tutorials, super-human SQuAD, M is Dead, Advances in Pre-training Word Embeddings
Happy New Year to you all! This edition looks back at the past year with the best reviews of 2017 and ahead to 2018. We also have some exciting tutorials on semi-supervised image classification, CTC networks, random forests, and giving a captivating scientific presentation. As always, there are more interesting blog posts, industry highlights, and exciting papers. Enjoy!
2017 Year in Review
A fantastic review of the most important topics in ML and Deep Learning in 2017 by Denny Britz.
Redditors deliberated on the highlights of ML in 2017 including the best paper, best blog post, best blog, best course, etc. Here are the results.
Marek Rei again took the time to analyze the publications in top ML and NLP venues and break them down by individual authors and organisations.
Miles Brundage reviews his 2017 forecasts relating to Reinforcement Learning and Machine Learning in general.
A collection of the top 30 open-source ML libraries in 2017.
Jeff Dean again looks back on the progress the Google Brain team has made in 2017. This is part 1. Part 2 can be found here.
Arthur Pesah answers the Quora question "What are the most significant machine learning advances in 2017?" by giving a comprehensive overview of domain adaptation highlights in 2017, an assessment I personally agree with.
... And Looking Ahead to 2018
Eugene Culurciello expresses his opinions on where Deep Learning and ML are headed within the larger field of AI, and how we can build increasingly sophisticated machines that can help us.
MIT Tech Review compiles a list of five jobs that will see large demand for workers in the future.
Tools and implementations
wav2letter is Facebook's ASR toolkit. It implements the architecture in the Wav2Letter paper and provides models pre-trained on the Librispeech dataset.
Nice review of a class of semi-supervised learning algorithms for computer vision that aim to assign pseudo-labels to unlabeled data points, including a state-of-the-art method, Mean Teacher (NIPS 2017).
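The core pseudo-labeling recipe that this family of methods builds on can be sketched in a few lines: train on the labeled subset, predict on the unlabeled points, and add confident predictions back as training data. The toy data, the 0.95 confidence threshold, and the choice of logistic regression below are all assumptions for illustration; Mean Teacher itself replaces hard pseudo-labels with a consistency loss against an exponentially averaged teacher network.

```python
# Toy pseudo-labeling sketch (not the Mean Teacher method itself):
# 1) fit on the few labeled points, 2) pseudo-label confident unlabeled
# points, 3) refit on the augmented set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs; only 10 of 200 points are labeled.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
labeled = np.concatenate([rng.choice(100, 5, replace=False),
                          100 + rng.choice(100, 5, replace=False)])
unlabeled = np.setdiff1d(np.arange(200), labeled)

clf = LogisticRegression().fit(X[labeled], y[labeled])
probs = clf.predict_proba(X[unlabeled])
# Keep only confident pseudo-labels (the threshold is an assumption).
confident = probs.max(axis=1) > 0.95
X_aug = np.vstack([X[labeled], X[unlabeled][confident]])
y_aug = np.concatenate([y[labeled], probs.argmax(axis=1)[confident]])
clf = LogisticRegression().fit(X_aug, y_aug)
print(clf.score(X, y))
```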
A useful step-by-step tutorial explaining prefix beam search, the combination of beam search with Connectionist Temporal Classification (CTC), a decoding method now part of many end-to-end ASR systems such as Baidu's Deep Speech 2.
Will Ratcliff outlines how to give a good talk by convincing people that it’s worth their mental energy to listen to you. The key to this is exploitation of a simple fact: people are curious creatures by nature and will pay attention to a cool story as long as that story remains absolutely clear.
Slav Ivanov gives a succinct overview of the main functions used for computing feature importance in decision trees and random forests.
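In scikit-learn, the mean-decrease-in-impurity variant of these importance scores is exposed directly on a fitted forest. A small sketch (the iris dataset is just a stand-in):

```python
# Mean-decrease-in-impurity feature importance from a random forest:
# each feature's score averages its impurity reduction over all trees,
# normalized so the scores sum to 1.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(data.data, data.target)

for name, score in zip(data.feature_names, forest.feature_importances_):
    print(f"{name}: {score:.3f}")
```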
More blog posts and articles
Rachel Thomas gives a high-level overview of the five most pervasive trends that should be avoided when founding a startup.
This MIT Tech Review article describes research on determining whether someone suffers from bipolar disorder using their tweets. The crux is that many sufferers share details of their condition on Twitter, which makes it possible to compile a dataset.
An ensemble model by Alibaba, SLQA+, has achieved the first super-human performance on the Stanford Question Answering Dataset (SQuAD) in terms of F1. We know that state-of-the-art systems trained on SQuAD are susceptible to adversarial examples, but this is still an important milestone.
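The F1 score in question is computed at the token level between the predicted and gold answer spans. A simplified sketch of the metric (the official SQuAD script additionally lowercases and strips punctuation and articles before comparing):

```python
# Token-level F1 between a predicted and a gold answer span, as used
# for the SQuAD leaderboard (normalization steps omitted).
from collections import Counter

def squad_f1(prediction, gold):
    pred_toks = prediction.split()
    gold_toks = gold.split()
    # Multiset intersection counts overlapping tokens.
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

# 2 of 3 predicted tokens match, all 2 gold tokens covered: F1 ≈ 0.8.
print(squad_f1("the eiffel tower", "eiffel tower"))
```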
Facebook announced that it will shutter M, its full-service virtual assistant that was only ever offered to 10k people in SF, on January 19. This announcement highlights a larger trend of text-based chatbots being phased out while voice assistants (Alexa, Google Assistant) prevail.
Uber discusses its Customer Obsession Ticket Assistant (COTA), which uses machine learning and NLP models to help agents deliver improved support experiences.
The time between calling 911 and the ambulance arriving can be critical for saving heart attack victims, but the person on the phone may not know what’s happening: this ML system parses non-verbal clues to help diagnose from a distance.
Lisbon-based Unbabel uses a combination of NMT, quality estimation, other NLP algorithms, and humans for verification to automate translations between more than 70 language combinations.
This article gives a comprehensive overview of using autoencoders (AE) for learning features. The authors provide a taxonomy of models, show how they relate to other classical techniques, and offer a set of guidelines on how to choose the proper AE for a given task.
Facebook AI Research has over the last two years been focusing on making word embeddings more efficient with fastText. This new paper does not propose anything novel but uses a combination of known tricks that are rarely used together: position-dependent vectors (to reweight the word embeddings), phrase embeddings, and subword embeddings. In addition, they use lots of data, which allows them to achieve a new state of the art across intrinsic tasks and on SQuAD. As an added bonus, they make the new pre-trained embeddings available.
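The subword trick is easy to illustrate: fastText represents a word by its boundary-marked character n-grams (plus the word itself), so rare and unseen words share parameters with morphologically related ones. A rough sketch of the n-gram extraction (in the real model each n-gram is hashed into a fixed vocabulary of embedding vectors):

```python
# fastText-style subword units: character n-grams of a word wrapped in
# boundary markers "<" and ">".
def char_ngrams(word, n_min=3, n_max=6):
    """Return the boundary-marked character n-grams of a word."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(marked[i:i + n] for i in range(len(marked) - n + 1))
    return grams

# The 3-grams of "where" include "<wh" and "re>", which distinguish
# prefixes and suffixes from word-internal substrings like "her".
print(char_ngrams("where", 3, 3))  # -> ['<wh', 'whe', 'her', 'ere', 're>']
```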
Neural architecture search (NAS) has emerged as a useful technique to discover novel deep neural network architectures that achieve excellent results. Salesforce researchers propose to make NAS better suited for generating RNNs by introducing a domain-specific language (DSL), which can produce novel RNNs of arbitrary depth and width. The DSL can define LSTMs and GRUs but also allows non-standard components such as layer normalization and trigonometric curves. They generate architectures using random search with a ranking function (a recursive neural network) as well as reinforcement learning. The resulting architectures do not follow human intuitions but perform well on target tasks.
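The idea of sampling candidate cells from a DSL can be illustrated with a toy: represent a cell as an expression tree over a small set of operators and draw trees at random. The operator names and tree representation below are assumptions for illustration only, not the paper's actual DSL or search procedure.

```python
# Toy illustration of random search over an architecture DSL: a
# candidate RNN cell is an expression tree whose leaves are the inputs
# x_t and h_prev and whose internal nodes are simple operators.
import random

BINARY_OPS = ["add", "mul"]
UNARY_OPS = ["tanh", "sigmoid"]
LEAVES = ["x_t", "h_prev"]

def random_cell(depth=3):
    """Sample a random expression tree of the given depth."""
    if depth == 0:
        return random.choice(LEAVES)
    op = random.choice(BINARY_OPS + UNARY_OPS)
    if op in BINARY_OPS:
        return (op, random_cell(depth - 1), random_cell(depth - 1))
    return (op, random_cell(depth - 1))

random.seed(0)
# A real search would score each sampled cell (e.g. with a learned
# ranking function) and keep the best candidates.
print(random_cell())
```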