Time & Location: Mondays, 14:00–16:00, B3.1, Seminarraum 1.15.
First session: 07.11.2022
Teachers: Marius Mosbach, Dawei Zhu
Suitable for: Master CS, LST, and related
Credit Points (CP): 7 CP for CS students; 4 CP or 7 CP for CoLi students.
While deep learning based Natural Language Processing (NLP) has achieved great progress during the past decade, one of the major bottlenecks in training deep neural networks (DNNs) is the requirement of substantial amounts of labeled training data. This can make deploying NLP models to real-world applications challenging, as data creation can be costly, time-consuming and/or labor-intensive.
In recent years, there has been increasing interest in building NLP models that are less data-demanding, and significant progress has been made. Recent advances enable efficient learning from just a handful of labeled examples (few-shot learning). In addition, large-scale pre-trained language models can often achieve non-trivial performance on unseen NLP tasks (zero-shot learning).
This seminar aims to provide a broad and up-to-date overview of recent progress on zero- and few-shot learning in NLP. In particular, we will study recent papers to understand the challenges of learning from limited data and how to leverage pre-trained language models to make efficient learning in low-resource settings possible.
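To give a concrete feel for the few-shot setting mentioned above, the following minimal sketch shows how labeled demonstrations are typically concatenated with an unlabeled query into a single prompt for in-context learning. The task, labels, and example texts here are invented purely for illustration; they are not taken from any of the seminar papers.

```python
# Illustrative sketch of building a few-shot (in-context learning) prompt.
# Task, labels, and example texts are hypothetical.

def build_prompt(demonstrations, query):
    """Concatenate labeled demonstrations with an unlabeled query,
    in the format commonly used for in-context learning."""
    blocks = [f"Review: {text}\nSentiment: {label}"
              for text, label in demonstrations]
    # The query is appended with an empty label slot for the model to fill in.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

demos = [
    ("The movie was wonderful.", "positive"),
    ("A complete waste of time.", "negative"),
]
prompt = build_prompt(demos, "An instant classic.")
print(prompt)
```

A language model conditioned on such a prompt is expected to continue with a label (here, "positive"); papers like Min et al. (2022) study why and when this works.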
This seminar requires in-person attendance, but we may switch to an online format depending on the future Covid situation.
Requirements and Grading
- During this seminar, students are required to read papers about recent advances in NLP.
- Each student will be assigned to present one paper. Two papers will be presented in each session.
- Besides the presentation, CS students are required to write a report about their assigned papers.
- CoLi students can decide whether or not to write a report. The seminar offers 4 CP for the presentation only, and 7 CP for the presentation plus a report.
List of Papers
The full list of papers will be announced in the first session on 07.11.2022.
Examples of papers (to provide a rough idea of the papers we are going to cover):
- Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, & Sylvain Gelly (2019). Parameter-Efficient Transfer Learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA (pp. 2790–2799). PMLR. (Link)
- Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, & Sameer Singh (2021). Calibrate Before Use: Improving Few-shot Performance of Language Models. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event (pp. 12697–12706). PMLR. (Link)
- Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, & Luke Zettlemoyer (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?. CoRR, abs/2202.12837. (Link)