There are many excellent newsletters out there related to ML (shout-outs in particular to those by Nathan Benaich, Jack Clark, and Denny Britz). Natural Language Processing (NLP) has been seeing increasing interest recently, but there is no resource dedicated to condensing NLP-related information – besides the occasional Twitter conversation and your daily arXiv cs.CL digest (a quick Google search turned up that the most relevant newsletters pertain to the other NLP, ugh).
This is an experiment to gauge if there is demand for such a newsletter for NLP. Please let me know which parts you like and dislike and what you are missing.
Top NLP Resources for Beginners
It can feel daunting to try to get into NLP. Here is a list of some of the most helpful resources out there that will kick-start your learning:
Modeling dialogue is tricky. Dialogue agents are expected to strike the balance between being able to communicate on a diverse range of topics, providing information, and accomplishing tasks in a wide range of environments.
Yun-Nung (Vivian) Chen et al. provide a great tutorial of the state-of-the-art in dialogue research and – in particular – highlight the difference between chit-chat dialogue systems and task-oriented dialogue agents.
Ryan Lowe gives an excellent overview of the problems that plague state-of-the-art neural dialogue systems: 1. data; 2. model architecture; 3. evaluation; 4. the premise itself that learning from static datasets will allow us to learn the function of language and to ground it in observations.
The 2017 edition of the Google Scholar Metrics rankings of NLP conferences has just been released. arXiv cs.CL tops the charts for the first time, followed by ACL and EMNLP. As Carlos shows, this changes if we normalize: then CL and TACL take the top ranks, followed by ACL.
TensorFlow and Keras are leading the pack of deep learning toolkits, but NLP-focused DyNet sneaks into the charts. DyNet has been developed by CMU among others and is particularly useful for dynamic graphs.
Computer vision has seen increasing interest in zero-shot transfer learning recently. As our training sets are finite, generalizing to unseen events, relations, entities, etc. is key. Huang et al. frame event extraction as grounding rather than classification, which allows them to generalize to new events. Related: Levy et al. (CoNLL, 2017) generalize to unseen relations by framing relation extraction as reading comprehension.
One of my highlights of last year’s EMNLP was the many cool natural language generation applications. This year appears to be no different. Gangal et al. propose a noisy-channel character-level seq2seq model to generate portmanteaus, e.g. smog (smoke + fog) or Brexit (Britain + exit). Extra: New dataset of 1624 portmanteaus to play with.
Strong baselines are one of the most important prerequisites for conducting reliable research. Many new methods for NMT, however, only compare against vanilla implementations. Denkowski & Neubig propose three baselines that are easy to implement and yield significant gains over regular baselines: 1. using Adam with multiple restarts and learning rate annealing; 2. sub-word translation via byte pair encoding; 3. decoding with ensembles of independently trained models.
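If byte pair encoding is new to you, here is a rough idea of how the merges are learned. This is a toy sketch of the general BPE procedure (repeatedly merge the most frequent adjacent symbol pair), not the setup used by Denkowski & Neubig. The function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the tiny word-frequency table are all assumptions made for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a {word: frequency} dict."""
    # Start from single characters plus a hypothetical end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Made-up toy corpus statistics, for illustration only.
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges = learn_bpe(word_freqs, num_merges=3)
print(segment("lowest", merges))  # ['l', 'o', 'w', 'est</w>']
```

Note that "lowest" never appears in the toy corpus, yet it still gets a sensible segmentation out of the frequent "est" suffix. That is the appeal for translation: rare and unseen words are broken into sub-word units the model has already seen.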
Named Entity Recognition (NER) systems are very good at predicting frequent entities. However, in social media or newswire, new entities are very common. The dataset of the shared task of the 3rd Workshop on Noisy User-generated Text (W-NUT) at EMNLP 2017 focuses on exactly this challenging scenario of predicting emerging and rare entities.