June 25 · Issue #26
Hi all,
It feels like quite a lot has been going on in the last two weeks. Consequently, this newsletter is also more packed than usual. So lean back with your beverage of choice ☕️🍵🍺 and let me take you through some of what happened in the world of ML and NLP.
Highlights
There’s been so much cool stuff, it’s hard to pick favourites. For slides and talks, my highlights are the chat with Christopher Olah about interpreting neural networks and Andrej Karpathy’s talk about Software 2.0; the NMT with attention Colaboratory notebook is pretty cool; there’s also an awesome in-depth resource about gradient boosting; two overviews of Defense Against the Dark Arts 🔮; some cool articles on interpretability and bias; articles about RL and scene understanding; and lots more articles and papers!
|
Machine Learning Research & Interpreting Neural Networks
Watch this Coffee with a Googler episode with Christopher Olah to get a deep dive into Distill, Lucid, and Deep Dream. If you’re not excited about visualizing features and neural networks dreaming, you will be after watching this episode.
|
Christopher D. Manning: A Neural Network Model That Can Reason (ICLR 2018 invited talk)
Watch Christopher Manning talk about Memory-Attention-Composition Networks at ICLR 2018.
|
Deep Learning for NLP slides - Kyunghyun Cho
Kyunghyun Cho gave an 8-hour course on Deep Learning for NLP. The lectures are in Korean, but the slides are in English and available here.
|
Building the Software 2.0 Stack by Andrej Karpathy
If you’re into ML, you’ve likely read Andrej Karpathy’s article on Software 2.0 (if not, go read it now). In this talk, Karpathy goes into Software 2.0 in more depth and shares his experience building the Software 2.0 stack at Tesla.
|
Modelling Natural Language, Programs, and their Intersection (NAACL 2018 Tutorial)
If you’re interested in ML models of programs, check out the slides of this tutorial by Graham Neubig and Miltos Allamanis. After going through the tutorial, if you want to get your hands dirty, have a look at CoNaLa, the Code/Natural Language Challenge out of Graham’s lab.
|
Efficient Deep Learning with Humans in the Loop
Zachary Lipton discusses techniques to apply NLP models to problems without large labeled datasets by relying on humans in the loop.
|
Code and model for the Fine-tuned Transformer by OpenAI
|
Neural Machine Translation with Attention
This Colaboratory notebook trains a sequence to sequence (seq2seq) model for Spanish to English translation using tf.keras and eager execution.
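If you want a feel for the core mechanism before opening the notebook, here is a minimal sketch of Bahdanau-style (additive) attention in tf.keras. It is an illustration of the scoring and weighting steps only, not the notebook’s exact code.

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention: score(h_t, h_s) = v^T tanh(W1 h_t + W2 h_s)."""
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects the decoder state
        self.W2 = tf.keras.layers.Dense(units)  # projects the encoder outputs
        self.V = tf.keras.layers.Dense(1)       # scores each source position

    def call(self, query, values):
        # query: (batch, hidden) decoder state; values: (batch, src_len, hidden)
        query_with_time_axis = tf.expand_dims(query, 1)
        scores = self.V(tf.nn.tanh(self.W1(query_with_time_axis) + self.W2(values)))
        weights = tf.nn.softmax(scores, axis=1)            # attention over source positions
        context = tf.reduce_sum(weights * values, axis=1)  # weighted sum of encoder outputs
        return context, weights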
|
Code for Emergent Translation in Multi-Agent Communication
|
NCRF++: An Open-source Neural Sequence Labeling Toolkit
NCRF++ (see the ACL demo paper) is a PyTorch-based framework with flexible choices of input features and output structures for NLP sequence labeling tasks. The model design is fully configurable through a configuration file, so no code changes are required.
|
How to explain gradient boosting
An in-depth explanation of gradient boosting machines by Terence Parr and Jeremy Howard with lots of examples 👏🏻. If you ever wanted to really understand gradient boosting, this is the resource to read.
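As a toy companion to the article, here is a minimal sketch of the core loop for squared loss, where each new regression tree is fit to the residuals of the current ensemble. Function names and hyperparameters are illustrative, not taken from the article.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=100, learning_rate=0.1, max_depth=2):
    # Gradient boosting for squared loss: each tree is fit to the residuals
    # (the negative gradient) of the ensemble built so far.
    y = np.asarray(y, dtype=float)
    f0 = y.mean()                      # initial model: just predict the mean
    prediction = np.full(y.shape, f0)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction     # what the ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, trees

def predict_gbm(f0, trees, X, learning_rate=0.1):
    return f0 + learning_rate * sum(tree.predict(X) for tree in trees)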
|
Tracking the Progress in Natural Language Processing
A repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks (disclaimer: created by me).
|
fastdeepnets Research Journal
One of the hardest things to teach about doing research is how to come up with new hypotheses, validate them, and iterate. Guillaume Leclerc provides a great example of this, laying out the different steps of his Master’s thesis. If you want to skip ahead, you can find the final paper here.
|
Defense Against the Dark Arts: An overview of adversarial example security research and future research directions
Slides (with notes!) from the top auror himself (Ian Goodfellow) on how to defend against adversarial examples.
|
Attacks against machine learning — an overview
This blog post surveys the attack techniques that target AI (artificial intelligence) systems and how to protect against them.
|
Many opportunities for discrimination in deploying machine learning systems
Hal Daumé III shows us how “discrimination” might come into a system by walking through the different stages of an arXiv paper recommendation system.
|
Awesome interpretable machine learning
A list of resources facilitating model interpretability (introspection, simplification, visualization, explanation).
|
Bias detectives: the researchers striving to make algorithms fair
As machine learning infiltrates society, scientists are trying to help ward off injustice. This Nature feature gives an overview of the researchers working on fairness in ML.
|
Train a Reinforcement Learning agent to play custom levels of Sonic the Hedgehog with Transfer Learning
Felix Yu walks us through his 5th-place solution to the OpenAI Retro Contest, training an RL agent to play custom levels of Sonic the Hedgehog with Transfer Learning. It’s a great post that gives an honest impression of what did and didn’t work.
|
Metacar
Metacar is a reinforcement learning environment for self-driving cars in the browser. The project contains examples of algorithms created with Metacar.
|
Facebook open sources DensePose
Facebook AI Research open sources DensePose, a real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body.
|
Neural scene representation and rendering
DeepMind introduces the Generative Query Network (GQN), a framework within which machines learn to perceive their surroundings by training only on data obtained by themselves as they move around scenes.
|
How Can Neural Network Similarity Help Us Understand Training and Generalization?
A blog post about using neural network similarity as measured by canonical correlation analysis (CCA) to better understand the generalization behaviour of models.
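For a rough idea of what a CCA-based similarity looks like in practice, here is a sketch that averages the canonical correlations between two activation matrices of shape (n_examples, n_neurons). It uses scikit-learn’s CCA and is a simplification of the SVCCA/projection-weighted variants discussed in the post.

import numpy as np
from sklearn.cross_decomposition import CCA

def cca_similarity(acts_a, acts_b, n_components=10):
    """Mean canonical correlation between two layers' activations on the same inputs."""
    cca = CCA(n_components=n_components, max_iter=1000)
    a_c, b_c = cca.fit_transform(acts_a, acts_b)
    # Correlation of each pair of canonical variates, then average.
    corrs = [np.corrcoef(a_c[:, i], b_c[:, i])[0, 1] for i in range(n_components)]
    return float(np.mean(corrs))

# Example: compare two synthetic "layers" evaluated on the same 500 inputs.
rng = np.random.RandomState(0)
layer1 = rng.randn(500, 64)
layer2 = layer1 @ rng.randn(64, 32) + 0.1 * rng.randn(500, 32)  # a related layer
print(cca_similarity(layer1, layer2))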
|
Twitter meets TensorFlow
A blog post on the process and rationale behind Twitter Cortex migrating its deep learning framework from Lua Torch to TensorFlow.
|
Deep-learning-free Text and Sentence Embedding, Part 1
|
Deep Learning: Theory & Practice
Yoel Zeldes summarizes the talks given at the Deep Learning: Theory & Practice conference held in Israel (with guest speakers from abroad), highlighting some of the key points he found particularly interesting.
|
Suicide prevention: how scientists are using artificial intelligence to help people at risk
The Crisis Text Line uses machine learning to figure out who’s at risk and when to intervene. If you’re interested in mental health, some of the data is available here. Did you know that Wednesday is the most anxiety-provoking day of the week?
|
🚀 100 Times Faster Natural Language Processing in Python
Thomas Wolf shows how to take advantage of spaCy & a bit of Cython for blazing-fast NLP.
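One idea from the post is to work with spaCy’s 64-bit hash values instead of Python strings. Here is a plain-spaCy illustration of that, assuming the en_core_web_sm model is installed; the post goes further and moves the loop itself into Cython for the real speed-up.

import spacy

nlp = spacy.load("en_core_web_sm")  # any English pipeline works for this sketch
docs = list(nlp.pipe(["The quick brown fox", "jumps over the lazy dog"]))

# Comparing integers is much cheaper than comparing unicode objects.
target = nlp.vocab.strings["the"]   # hash of the lowercased string "the"
count = sum(1 for doc in docs for token in doc if token.lower == target)
print(count)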
|
Improving Language Understanding by Generative Pre-Training
This paper by OpenAI is in the same line as recent approaches such as ELMo and ULMFiT. Compared to those, the proposed approach uses a more expressive encoder (a Transformer) and simple task-specific transformations (e.g. concatenating premise and hypothesis for entailment). It achieves state-of-the-art results across a diverse range of tasks. A cool aspect is that the model can even perform a form of zero-shot learning using heuristics.
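To make the “simple task-specific transformations” concrete, here is a sketch of how an entailment example might be turned into a single token sequence for the Transformer. The delimiter tokens and helper are illustrative, not the paper’s exact preprocessing.

# Illustrative special tokens; the paper uses learned start/delimiter/extract embeddings.
START, DELIM, EXTRACT = "<start>", "<delim>", "<extract>"

def entailment_input(premise_tokens, hypothesis_tokens):
    """Concatenate premise and hypothesis into one sequence for the Transformer.

    The classifier is applied on top of the representation of the final token.
    """
    return [START] + premise_tokens + [DELIM] + hypothesis_tokens + [EXTRACT]

print(entailment_input(["A", "man", "is", "sleeping"], ["Someone", "is", "awake"]))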
|
The Natural Language Decathlon: Multitask Learning as Question Answering
Researchers from Salesforce introduce the Natural Language Decathlon, a challenge that spans ten diverse NLP tasks, from QA and summarization to relation extraction and commonsense pronoun resolution. They frame all tasks as QA and propose a new question answering network that jointly learns all tasks.
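The unifying trick is to cast every task as a (question, context, answer) triple. A minimal illustration is below; the phrasings and examples are made up, not the benchmark’s exact prompts.

# Every task becomes a (question, context, answer) triple.
examples = [
    # Summarization
    ("What is the summary?", "<full news article>", "<one-sentence summary>"),
    # Sentiment analysis
    ("Is this review positive or negative?", "The film was a delight.", "positive"),
    # Machine translation
    ("What is the translation from English to German?", "The house is small.", "Das Haus ist klein."),
]
for question, context, answer in examples:
    print(f"Q: {question} | C: {context} | A: {answer}")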
|
GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations
A collaboration between researchers from CMU, NYU, and FAIR. Instead of using features for transfer learning, the authors seek to learn transferable graphs. The graphs look similar to attention matrices and are multiplied by task-specific features during fine-tuning. They show improvements across some tasks, but the baselines are somewhat weak.
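A rough sketch of that fine-tuning step: a transferred affinity graph (an attention-like matrix over token pairs) is multiplied into the task-specific token features. Shapes, names, and the way the result is combined are assumptions for illustration, not the paper’s code.

import torch

batch, seq_len, dim = 2, 5, 16
# G: transferred affinity graph over token pairs, rows normalized like attention weights.
G = torch.softmax(torch.randn(batch, seq_len, seq_len), dim=-1)
# F: task-specific token features produced by the downstream model.
F = torch.randn(batch, seq_len, dim)

# Mix each token's features with its neighbours' according to the graph,
# then hand the graph-enriched representation to the downstream model.
mixed = torch.bmm(G, F)                     # (batch, seq_len, dim)
enriched = torch.cat([F, mixed], dim=-1)    # one simple way to combine them
print(enriched.shape)                       # torch.Size([2, 5, 32])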
|
Did you enjoy this issue?
|
If you don't want these updates anymore, please unsubscribe here
If you were forwarded this newsletter and you like it, you can subscribe here