Deep learning is the predominant machine learning paradigm in natural language processing (NLP). This approach has delivered huge performance improvements across a wide variety of NLP tasks.
This spring, we will look into transformer language models and how they can integrate graph information.
Lecturer: Dietrich Klakow
Location: tbd
Time: block course in the spring break 2026; however, preparations start earlier. Here is the specific timeline:
Closing topic doodle: tbd
Kick-Off: tbd
One page outline: tbd
Draft presentation: tbd
Practice talks and final talks will take place during the spring break. Times and dates will be decided at the kick-off.
Application for participation: students of CS, DSAI, VC, ES, … apply via the CS seminar system; students of CoLI, LST, and LCT apply via LSF.
HISPOS registration deadline: tbd
Grading (tentative):
- 5% one page talk outline
- 10% draft presentation
- 10% practice talk
- 25% own experiments and coding
- 10% report on coding task
- 35% final talk
- 5% contributions to the discussion during the final talks of fellow participants
List of Topics (tentative):
- Overtrained Language Models Are Harder to Fine-Tune
- LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently
- Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
- LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models
- The dark side of the forces: assessing non-conservative force models for atomistic machine learning
- CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models
- Scaling Laws for Upcycling Mixture-of-Experts Language Models
- Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
- Recipe for a General, Powerful, Scalable Graph Transformer