Fast.ai; Google IO; Semantic segmentation, object detection, network graph overviews; algorithms vs. compute; AI perspectives; MT; semantic similarity; Goodfellow, Schmidhuber, & Kaggle #1 profiles; Maths for ML; synthetic data; uncertainty for dialogue
Are you still stressed about your NIPS submission, anxious about the EMNLP deadline or just annoyed because---let's face it---it's Monday again? Grab a coffee ☕️ and relax with this fortnight's edition!
This time, we have: overviews of semantic segmentation, object detection models, and network graph methods; two articles on the showdown between algorithms vs. compute; essays with different perspectives on AI, from fear and caution to cooperation; articles on advances in machine translation and semantic similarity; profiles of Ian Goodfellow, Jürgen Schmidhuber, and the new #1 Kaggler; a new Maths for ML book; industry news including using synthetic training data and uncertainty for dialogue; and research papers from ACL 2018.
What's hot 🔥
Google IO: Google Duplex garnered a lot of attention and unsettled some; Google says it will identify itself as automated in future calls. Google redesigned its news app. Gmail's new Smart Compose suggests completions as you write, extending Smart Reply to longer responses. Google Research is now Google AI, which caused some confusion.
Tutorials and overviews
Going beyond the bounding box with semantic segmentation — thegradient.pub How do humans describe a scene? We might say that there’s a table under the window, or that there’s a lamp to the right of the couch. Decomposing scenes into separate entities is key to understanding images, and it helps us reason about the behavior of objects.
Qiuyu Chen reviews the evolution of state-of-the-art object detectors and the limitations that must be overcome for further progress.
How do we capture structure in relational data? — thegradient.pub Network graphs of interactions and friendships on social media are rich with useful insights. Synonym graphs like WordNet can help us better identify related objects in a scene with computer vision. From family trees to molecular structures, an enormous amount of information around us takes the form of graphs.
Algorithms vs. compute
A Verge article showing that being smart about your algorithms still allows you to compete with the raw computing power of the tech giants.
Somewhat as an antithesis to the above, OpenAI shows that since 2012, the amount of compute for the largest AI training runs (think: AlphaGo) has increased by more than 300,000x. Note that this analysis focuses on the largest runs; the compute for the average run has likely not increased by that much.
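As a back-of-the-envelope check (the 300,000x figure and the since-2012 window come from the post; the doubling time below is derived, not quoted), this growth implies a doubling roughly every four months:

```python
import math

growth = 300_000   # compute increase for the largest training runs since 2012
years = 6          # approximate window covered by the analysis

doublings = math.log2(growth)                 # ~18.2 doublings
months_per_doubling = years * 12 / doublings  # ~4.0 months

print(f"{doublings:.1f} doublings, one every {months_per_doubling:.1f} months")
```

Compare that to Moore's law's classic two-year doubling and the scale of the trend becomes clear.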
Perspectives on AI
Tad Friend writes that thinking about artificial intelligence can help clarify what makes us human—for better and for worse.
The old story of AI is about human brains working against silicon brains. The new story of AI will be about human brains working with silicon brains (a centaur here is a human+AI pair). As it turns out, most of the world is the opposite of a chess game: non-zero-sum, so both players can win.
Judea Pearl, a pioneering figure in artificial intelligence, argues that AI has been stuck in a decades-long rut. His prescription for progress? Teach machines cause and effect.
This article argues that philosophically, intellectually—in every way—human society is unprepared for the rise of artificial intelligence—and that we'd better change this fast.
Researchers from the University of Maryland and CMU outline how technology could be used to support the work of simultaneous interpreters. They propose analyzing the interpreter's performance on the fly via a real-time quality feedback loop and offering help only when the interpreter is struggling.
Microsoft discusses a new NAACL 2018 paper that tackles the challenge of insufficient parallel data in MT: their approach achieves a high-quality machine translation system for an extremely low-resource language with only a few thousand parallel sentences.
Google summarizes results of two recent papers on learning semantic textual similarity: In the first one, they learn representations by predicting responses in a conversation; in the second one, they propose to use a Transformer for sentence encoding.
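Neither paper's model is reproduced here, but the typical downstream recipe for semantic textual similarity is the same in both cases: encode each sentence into a fixed-length vector and score pairs by cosine similarity. A minimal sketch with hypothetical embedding vectors (in a real system these would come from the trained encoder):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical sentence embeddings; a trained encoder would produce these.
emb_a = np.array([0.2, 0.8, 0.1])
emb_b = np.array([0.25, 0.7, 0.05])

print(cosine_similarity(emb_a, emb_b))  # close to 1.0 for similar sentences
```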
This post provides a nice overview of universal word and sentence embeddings methods that can be used for transfer learning. Note: ULMFiT came out a day after the post.
People in ML
Ian Goodfellow discusses personal failures and machine learning's relationship to failure.
Jürgen Schmidhuber says he'll make machines smarter than us; his peers wish he'd just shut up. This article gives a nice overview of Schmidhuber's role in the field, including the fabled origin of the term 'Schmidhubered'.
Shubin Dai, better known as Bestfitting, is the new #1 on the Kaggle leaderboard. Having started with Kaggle only two years ago, he shares some of the secrets to his success (e.g. cross-validation is super important!).
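For readers newer to the practice Dai highlights, k-fold cross-validation partitions the data into k folds and rotates which fold is held out for validation, so every sample is used for both training and evaluation. A minimal sketch (the `kfold_indices` helper is illustrative, not from the interview):

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train, val) index arrays for k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

fold_sizes = [len(val) for _, val in kfold_indices(100, 5)]
print(fold_sizes)  # each of the 5 folds holds out 20 of the 100 samples
```

In practice you would fit your model on `train` and score it on `val` for each split, averaging the k scores for a far more reliable estimate than a single held-out set.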
Maths is crucial for machine learning, but many introductions skimp on it. This book (a work in progress) aims to provide the mathematical background needed to read more advanced ML texts and thus seems like a welcome addition to the ML book canon.
More articles and blog posts
A New York Times article on a new adversarial attack on speech recognition systems, previously discussed in The Gradient.
A new project using AI and OCR tries to untangle the handwritten texts in one of the world’s largest historical collections.
Researchers from DeepMind train a model to perform path integration, i.e. calculating one's current position from a previously determined position. They show that the model's representations resemble those of grid cells, neurons whose firing fields form a hexagonal grid and that are thought to support navigation in the brain.
NVIDIA shares recent advancements that deliver dramatic performance gains on GPUs to the AI community, including a new ResNet-50 performance record for a single chip and single server.
Alexa developers get 8 free voices to use in skills, courtesy of Amazon Polly — techcrunch.com Amazon is offering a way for developers to give their voice apps a unique character with the launch of eight free voices to use in skills, courtesy of the Amazon Polly service. The voices are only available in U.S. English and include a mix of male and female voices, according to Amazon Polly's website.
AI startup Gamalon says that they developed a better way for chatbots and virtual assistants to converse with us by incorporating uncertainty. As far as I know, Gamalon has not published any papers, so it's not clear if and how exactly their approach improves upon current methods.
This article discusses advances in stance and fake news detection at Dublin-based AI startup Aylien. For technical details, see our paper on 360° Stance Detection. (Disclaimer: I'm an employee at Aylien.)
The visual data sets of images and videos amassed by the most powerful tech companies have been a competitive advantage, a moat that keeps the advances of machine learning out of reach of many. This advantage may be overturned by the advent of synthetic data. For instance, AiFi, a company seeking to create a checkout-free store like Amazon Go, builds large-scale store simulations to train its deep learning models.
Facebook Adds A.I. Labs in Seattle and Pittsburgh, Pressuring Local Universities — www.nytimes.com Salaries for artificial intelligence researchers at big tech companies are skyrocketing, luring many professors.
Artetxe et al. propose a new unsupervised method for learning cross-lingual embeddings that builds on the self-learning method of previous work. The main insights are that self-learning requires an initialization that is better than random, and that words that are translations of each other have similar distributions of similarity values to the rest of the vocabulary. The latter can be leveraged as an initialization for unsupervised self-learning.
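The full method isn't reproduced here, but the core self-learning loop alternates between solving an orthogonal Procrustes problem for the current dictionary and re-inducing the dictionary via nearest neighbours under the learned mapping. A simplified numpy sketch (variable names and the fixed iteration count are my own; the actual paper adds stochastic dictionary induction and other refinements):

```python
import numpy as np

def self_learning(X, Z, seed_pairs, iters=5):
    """Refine a linear map W from source embedding space X to target space Z.

    X, Z: length-normalized word embedding matrices (one row per word).
    seed_pairs: initial (possibly noisy) dictionary of (src, trg) index pairs.
    """
    src, trg = (np.array(p) for p in zip(*seed_pairs))
    for _ in range(iters):
        # Procrustes: the orthogonal W minimizing ||X[src] @ W - Z[trg]||.
        u, _, vt = np.linalg.svd(X[src].T @ Z[trg])
        W = u @ vt
        # Induce a new dictionary: nearest target neighbour of each mapped word.
        trg = (X @ W @ Z.T).argmax(axis=1)
        src = np.arange(len(X))
    return W
```

With a correct seed dictionary this converges almost immediately; the point of the paper is making the loop work from a much weaker, similarity-distribution-based initialization.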
Joshi et al. show that transfer learning via contextualized word representations can help adapt parsers to similar domains. For more syntactically distant domains, annotators can selectively annotate important spans and the model can be trained to classify whether a span is a constituent.
This paper proposes a sequence-to-sequence model with attention that takes a title as input and automatically generates a scientific abstract by iteratively refining the generated text. The system fools junior domain experts at a rate of up to 30% and non-experts at a rate of up to 80%.