Block course
Time & Location: kick-off meeting in April/May; presentation meetings indicatively in the last 2-3 weeks of September or the first 2-3 weeks of October
Teacher: Dr Volha Petukhova
*** Announcements ***
Registration in LSF
Join TEAMS
Kick-off: TBA
Kick-off & Introduction slides: see TEAMS Class Material
Suitable for: CoLi, CS and CuK
Organization:
We plan to hold a first planning meeting early in the semester. For the seminar itself (times and papers will be decided via a Doodle poll), each participant will give a 30-minute talk followed by a 10-minute discussion (participation in the discussions will also be graded). After the talk, the presenter prepares a short report of about 10 pages and hands it in for grading.
Grading: 40% based on the talk, 40% based on the report, 20% based on participation in discussions.
Term paper:
Topics:
Understanding and generation of multimodal human dialogue behavior;
Social signals/affective computing;
Multimodal dialogue modelling;
Large Language Models for dialogue modelling and analysis;
Multimodal dialogue systems & applications.
* Each talk will be based on a research paper.
Multimodality: multimodal behaviour, tracking devices, annotations and tools
1. Withana, A., Groeger, D., & Steimle, J. (2018). Tacttoo: A Thin and Feel-Through Tattoo for On-Skin Tactile Output. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology (UIST '18) (pp. 365-378). ACM.
2. Wang, H., Mendiratta, M., Theobalt, C., & Kortylewski, A. (2024). FaceGPT: Self-supervised learning to chat about 3D human faces. arXiv preprint arXiv:2406.07163.
3. Jiang, B., Chen, X., Liu, W., Yu, J., Yu, G., & Chen, T. (2023). MotionGPT: Human motion as a foreign language. Advances in Neural Information Processing Systems, 36, 20067-20079.
4. Oppenlaender, J., Johnston, H., Silvennoinen, J. M., & Barranha, H. (2025). Artworks reimagined: Exploring human-AI co-creation through body prompting. Proceedings of the ACM on Human-Computer Interaction, 9(4), 1-34.
5. Rekrut, M., Selim, A. M., & Krüger, A. (2022). Improving Silent Speech BCI Training Procedures Through Transfer from Overt to Silent Speech. In 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (pp. 2650-2656). doi: 10.1109/SMC53654.2022.9945447.
Emotions, social signals and interaction
6. Joby, N. E., & Umemuro, H. (2023, September). Emotional mimicry as a proxy measurement for pro-social indicators of trust, empathy, liking and altruism. In Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents
7. Marquez Herbuela, V. R. D., & Nagai, Y. (2025, October). Realtime Multimodal Emotion Estimation using Behavioral and Neurophysiological Data. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 785-787).
8. Welivita, A., Yeh, C. H., & Pu, P. (2023, September). Empathetic response generation for distress support. In Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue (pp. 632-644).
9. Li, Z., Kangas, J., Farooq, A., & Raisamo, R. (2025, October). Exploring the effects of force feedback on VR Keyboards with varying visual designs. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 106-115).
10. Buker, A., Smith, E., Perepelkina, O., & Vinciarelli, A. (2025, October). Multimodal Analysis of Disagreement in Dyadic Conversations: An Approach Based on Emotion Recognition. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 228-237).
11. Santana, R., Irfan, B., Lagerstedt, E., Skantze, G., & Pereira, A. (2025, October). Speech-to-Joy: Self-Supervised Features for Enjoyment Prediction in Human–Robot Conversation. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 238-248).
Multimodal fusion, dialogue modelling and management
12. Chen, S. (2025, October). What makes you say yes? An investigation of mental state and personality in persuasion during a dyadic conversation. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 16-24).
13. Gryshchuk, V., Maistro, M., Lioma, C., & Ruotsalo, T. (2025, October). Decoding Affective States without Labels: Bimodal Image-brain Supervision. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 25-34).
14. Zhang, H., Marquez Herbuela, V. R. D., & Nagai, Y. (2025, October). Foundation Feature-Guided Hierarchical Fusion of EEG-Physiological for Emotion Estimation. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 44-50).
15. Coca, A., Tseng, B. H., Chen, J., Lin, W., Zhang, W., Anders, T., & Byrne, B. (2023). Grounding Description-Driven Dialogue State Trackers with Knowledge-Seeking Turns. arXiv preprint arXiv:2309.13448.
16. Ramirez, A., Agarwal, K., Juraska, J., Garg, U., & Walker, M. A. (2023). Controllable Generation of Dialogue Acts for Dialogue Systems via Few-Shot Response Generation and Ranking. arXiv preprint arXiv:2307.14440.
Large Language Models for Dialogue Modelling and Analysis
17. Finch, S. E., Paek, E. S., & Choi, J. D. (2023). Leveraging large language models for automated dialogue analysis. arXiv preprint arXiv:2309.06490.
18. Addlesee, A., Sieińska, W., Gunson, N., Garcia, D. H., Dondrup, C., & Lemon, O. (2023). Multi-party goal tracking with LLMs: Comparing pre-training, fine-tuning, and prompt engineering. arXiv preprint arXiv:2308.15231.
19. Ostyakova, L., Smilga, V., Petukhova, K., Molchanova, M., & Kornev, D. (2023, September). ChatGPT vs. Crowdsourcing vs. Experts: Annotating Open-Domain Conversations with Speech Functions. In Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue (pp. 242-254).
Multimodal dialogue systems & applications
20. Alsarrani, R., Esposito, A., & Vinciarelli, A. (2025, October). Punctual or Continuous? Analyzing Depression Traces in Language and Paralanguage with Multiple Instance Learning. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 614-623).
21. Marcoux, A., Tessier, M. H., & Jackson, P. L. (2023, September). Nonverbal Markers of Empathy in Virtual Healthcare Professionals. In Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents (pp. 1-4).
22. Valerio, R., & Mahmoud, M. (2025, October). A Multimodal Framework for Exploring Behavioural Cues for Automatic Stress Detection. In Proceedings of the 27th International Conference on Multimodal Interaction (pp. 535-539).
23. Garcia, J. C., Suglia, A., Eshghi, A., & Hastie, H. (2023, July). 'What are you referring to?' Evaluating the ability of multi-modal dialogue models to process clarificational exchanges. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp. 1-8). Association for Computational Linguistics.
24. Shoa, A., Oliva, R., Slater, M., & Friedman, D. (2023, September). Sushi with Einstein: Enhancing Hybrid Live Events with LLM-Based Virtual Humans. In Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents (pp. 1-6).
For any questions, please send an email to:
v.petukhova@lsv.uni-saarland.de
Use subject tag: [MDS_2026]