NLP News - NLP for beginners, dialogue & sentence representations
There are many excellent ML newsletters out there (shout-outs in particular to Nathan Benaich's, Jack Clark's, and Denny Britz's). Natural Language Processing (NLP) has been seeing increasing interest recently, but there is no resource dedicated to condensing NLP-related information -- besides the occasional Twitter conversation and your daily arXiv cs.CL digest (a quick Google search turned up that the most relevant newsletters pertain to the other NLP, ugh).
This is an experiment to gauge if there is demand for such a newsletter for NLP. Please let me know which parts you like and dislike and what you are missing.
Top NLP Resources for Beginners
It can feel daunting to try to get into NLP. Here is a list of some of the most helpful resources out there that will kick-start your learning:
Yoav Goldberg's Primer on Neural Network Models for Natural Language Processing, which provides an excellent survey of neural network methods for NLP.
Stanford CS224n: Deep Learning for Natural Language Processing, arguably the best online course to learn about state-of-the-art methods for natural language processing.
NLP in-depth: Dialogue
Modeling dialogue is tricky. Dialogue agents are expected to strike a balance between communicating on a diverse range of topics, providing information, and accomplishing tasks in a wide range of environments.
Yun-Nung (Vivian) Chen et al. provide a great tutorial on the state of the art in dialogue research and -- in particular -- highlight the difference between chit-chat dialogue systems and task-oriented dialogue agents.
Ryan Lowe gives an excellent overview of the problems that plague state-of-the-art neural dialogue systems: 1. data; 2. model architecture; 3. evaluation; 4. the premise itself that learning from static datasets will allow us to learn the function of language and to ground it in observations.
News from recent or upcoming conferences.
Details of all ACL papers are out. Information about the sessions can be found here. Two picks: theory behind "man" + "royal" = "king", bilingual representations with (almost) no parallel data.
EMNLP author notification has been sent out. A list of accepted papers is not yet available, but some have already made it to arXiv (see below).
Focus on computational and mathematical approaches in linguistics, with an all-star lineup of invited speakers and organizers. Papers (8pp) and abstracts (2pp). Deadline is August 1.
Abstracts are due on July 31.
The 2017 edition of the Google Scholar Metrics rankings of NLP conferences has just been released. arXiv cs.CL tops the charts for the first time, followed by ACL and EMNLP. As Carlos shows, this changes if we normalize: then CL and TACL take the top ranks, followed by ACL.
Baidu makes an entrance into the chatbot market by acquiring Seattle-based startup Kitt.ai, which provides chatbot and natural language understanding (NLU) services across devices.
TensorFlow and Keras are leading the pack of deep learning toolkits, but NLP-focused DyNet sneaks into the charts. DyNet is developed by CMU, among others, and is particularly useful for dynamic computation graphs.
Textio helps companies improve the language of their postings in order to attract a more qualified and diverse set of candidates.
Madrid-based edtech startup Lingokids offers language lessons as interactive games in English and simplified Chinese for children aged 2-6.
Google is awarding the Press Association and Urbs Media $805k to build software to automate the writing of 30k local stories a month.
Some of the most intriguing recent research articles.
FB researchers show that we can use the Stanford Natural Language Inference (SNLI) dataset to learn very good sentence representations. Related: Wieting & Gimpel (ACL, 2017) learn sentence representations from a large paraphrase database; Jernite et al. (2017) introduce new unsupervised objectives for learning sentence representations. What other tasks are helpful for inducing sentence representations?
Computer vision has seen increasing interest in zero-shot transfer learning recently. As our training sets are finite, generalizing to unseen events, relations, entities, etc. is key. Huang et al. frame event extraction as grounding rather than classification, which allows them to generalize to new events. Related: Levy et al. (CoNLL, 2017), who generalize to unseen relations by framing relation extraction as reading comprehension.
One of my highlights of last year's EMNLP was the many cool natural language generation applications. This year appears to be no different. Gangal et al. propose a noisy-channel character-level seq2seq model to generate portmanteaus, e.g. smog (smoke + fog) or Brexit (Britain + exit). Extra: New dataset of 1624 portmanteaus to play with.
Researchers from DeepMind propose an agent that learns to perform natural language commands (think: "pick the red object/hat/zebra next to the green object") in a simulated environment. Key to learning are unsupervised auxiliary frame and language prediction objectives. Related: a gated-attention model from CMU; other related research from OpenAI, Lazaridou et al. (ICLR, 2017), and others starts from multi-agent dialog and shows that natural language may or may not naturally develop.
Strong baselines are one of the most important prerequisites for conducting reliable research. Many new methods for NMT, however, only compare against vanilla implementations. Denkowski & Neubig propose three baselines that are easy to implement and yield significant gains over regular baselines: 1. using Adam with multiple restarts and learning rate annealing; 2. sub-word translation via byte pair encoding; 3. decoding with ensembles of independently trained models.
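To give a feel for the second baseline, here is a minimal sketch of how byte pair encoding merges can be learned from word counts and then used to split an unseen word into sub-word units. It is not Denkowski & Neubig's code: the function names (learn_bpe, merge_pair, segment), the "</w>" end-of-word marker, the toy word counts, and the tie-breaking rule are all assumptions made for this illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(word_freqs, num_merges):
    """Greedily learn up to `num_merges` merge operations from a word -> count dict."""
    # Start from characters, plus an end-of-word marker (an assumed convention).
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself so
        # the result is deterministic (an arbitrary choice for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical toy corpus statistics, for illustration only.
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_counts, num_merges=3)
print(toy_merges)
print(segment("lowest", toy_merges))
```

On this toy data the frequent suffix "est" ends up as a single unit, so a word that never appeared in the counts, such as "lowest", is still split into known pieces. That is the property that makes sub-word units attractive for translating rare words.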
In this section, I will introduce one new/exciting dataset.
Named Entity Recognition (NER) systems are very good at predicting frequent entities. However, in social media or newswire, new entities are very common. The dataset for the shared task of the 3rd Workshop on Noisy User-generated Text (W-NUT) at EMNLP 2017 focuses on exactly this challenging scenario of predicting emerging and rare entities.
Finally, I will highlight one informative Twitter conversation (powered by Treeverse) or inspiring tweet.
A lively Twitter discussion about the benefits and trade-offs of pursuing a PhD at a top US vs. top European institution. Related: question on Quora.