Learning from Limited Data in NLP (Seminar, WiSe 2022/23)

Time & Location: Mondays 14:00 – 16:00. B3.1, Seminarraum 1.15.
First session: 07.11.2022
Teachers: Marius Mosbach, Dawei Zhu
Suitable for: Master's students in CS, LST, and related programs
Places: 18
Credit Points (CP): 7 CP for CS students; 4 CP or 7 CP for CoLi students.
Registration: Link

Description
While deep-learning-based Natural Language Processing (NLP) has made great progress over the past decade, one of the major bottlenecks in training deep neural networks (DNNs) is the need for substantial amounts of labeled training data. This can make deploying NLP models in real-world applications challenging, as data creation can be costly, time-consuming, and labor-intensive.
In recent years, there has been increased interest in building NLP models that are less data-demanding, and significant progress has been made. Recent advances in this field enable efficient learning from just a handful of labeled examples (few-shot learning). In addition, large-scale pre-trained language models can often achieve non-trivial performance on unseen NLP tasks (zero-shot learning).
This seminar aims to provide a broad and up-to-date overview of recent progress on zero- and few-shot learning in NLP. In particular, we will study recent papers to understand the challenges of learning from limited data and how to leverage pre-trained language models to make efficient learning in low-resource settings possible.
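As a small illustration of the zero-shot setting mentioned above, the sketch below classifies a sentence with a pre-trained language model and no task-specific training data. It is not part of the seminar material; it assumes the Hugging Face transformers library and the publicly available facebook/bart-large-mnli checkpoint, but any comparable model would do.

    # Minimal zero-shot classification sketch (illustrative only).
    # Assumed setup: `pip install transformers torch`; the checkpoint is
    # downloaded from the Hugging Face Hub on first use.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

    # The model has seen no labeled examples for this task; the candidate
    # labels alone define it (zero-shot learning).
    result = classifier(
        "The new restaurant downtown serves wonderful pasta.",
        candidate_labels=["positive", "negative"],
    )
    print(result["labels"][0], result["scores"][0])  # best label and its score

Few-shot in-context learning, which several of the papers below study, extends this idea by prepending a handful of labeled examples to the model's prompt rather than updating its parameters.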

Attendance
Students are required to attend this seminar in person. However, we may switch to an online format depending on the Covid situation.

Requirements and Grading

  • During this seminar, students are required to read papers about recent advances in Natural Language Processing (NLP).
  • Each student will be assigned to present one paper. Two papers will be presented in each session.
  • In addition to the presentation, CS students are required to write a report about the assigned papers.
  • CoLi students can choose whether to write a report: the seminar offers 4 CP for a presentation only and 7 CP for a presentation plus report.

List of Papers
The full list of papers will be announced in the first session on 07.11.2022.
Examples of papers (to provide a rough idea of the papers we are going to cover):

  • Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, & Sylvain Gelly (2019). Parameter-Efficient Transfer Learning for NLP. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA (pp. 2790–2799). PMLR. (Link)
  • Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, & Sameer Singh (2021). Calibrate Before Use: Improving Few-shot Performance of Language Models. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event (pp. 12697–12706). PMLR. (Link)
  • Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, & Luke Zettlemoyer (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? CoRR, abs/2202.12837. (Link)

All Sessions

07.11.2022: Kick-off Meeting (no presentations)

28.11.2022
  Papers:
    1. Calibrate Before Use: Improving Few-Shot Performance of Language Models
    2. Noisy Channel Language Model Prompting for Few-Shot Text Classification
  Speakers: 1. Addluri, 2. Al Khalili
  Session Chairs: 1. Junaid, 2. Kavu Maithri Rao

05.12.2022
  Papers:
    1. The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning
    2. Can language models learn from explanations in context?
  Speakers: 1. Paramalla, 2. Basvoju
  Session Chairs: 1. Addluri, 2. Al Khalili

12.12.2022: Break (no session)

19.12.2022
  Papers:
    1. Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
    2. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
  Speakers: 1. Del Valle Giron, 2. Ivanova
  Session Chairs: 1. Paramalla, 2. Basvoju

09.01.2023
  Papers:
    1. Chain of Thought Prompting Elicits Reasoning in Large Language Models
    2. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
  Speakers: 1. Akhundjanova, 2. Zamani
  Session Chairs: 1. Del Valle Giron, 2. Ivanova

16.01.2023
  Papers:
    1. Prefix-Tuning: Optimizing Continuous Prompts for Generation
    2. SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
  Speakers: 1. Chen, 2. Krishnan
  Session Chairs: 1. Akhundjanova, 2. Zamani

23.01.2023
  Papers:
    1. It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
    2. Making Pre-trained Language Models Better Few-shot Learners
  Speakers: 1. Behura, 2. Sam
  Session Chairs: 1. Chen, 2. Krishnan

30.01.2023
  Papers:
    1. True Few-Shot Learning with Language Models
    2. FewNLU: Benchmarking State-of-the-Art Methods for Few-Shot Natural Language Understanding
  Speakers: 1. Junaid, 2. Kavu Maithri Rao
  Session Chairs: 1. Behura, 2. Sam
This timeline is preliminary; some time slots may be canceled or shifted.