Multimodal Dialogue Systems (Fall 2023)

Block course

Time & Location: kick-off meeting in April-May; presentation meetings tentatively in the last 2-3 weeks of September or the first 2-3 weeks of October

Teacher: Dr Volha Petukhova

*** Announcements***

Registration in LSF CLOSED

Join TEAMS

Kick-off: TBA

Kick-off & Introduction slides: see TEAMS Class Material

Suitable for: CoLi, CS and CuK

Organization:

We plan to hold a first planning meeting early in the semester. For the actual seminar (time slots and papers will be decided via Doodle), each participant will give a 30-minute talk followed by a 10-minute discussion (participation in discussions will also be graded). After the talk, the presenter prepares a short report of about 10 pages and hands it in for grading.

Grading: 40% based on the talk, 40% based on the report, 20% based on participation in discussions.

Term paper:

  • LaTeX template for term papers (zip)
  • 11-point checklist for term papers (pdf)

Topics:

Situated interaction;
Understanding and generation of multimodal human dialogue behavior;
Social signals/affective computing;
Multimodal dialogue modelling;
Multimodal dialogue systems & applications

  *Each talk will be based on a research paper.

Cognition: cognitive states, affective states and cognitive agents

1. Zeng, Zhihong, Maja Pantic, Glenn I. Roisman, and Thomas S. Huang. (2007). A survey of affect recognition methods: audio, visual and spontaneous expressions. In Proceedings of the 9th International Conference on Multimodal Interfaces, pp. 126-133.

2. Lechuga Redondo, Maria Elena, Alessandra Sciutti, Francesco Rea, and Radoslaw Niewiadomski. (2022, November). Comfortability Recognition from Visual Non-verbal Cues. In Proceedings of the 2022 International Conference on Multimodal Interaction (ICMI).

3. Sims, S. D., & Conati, C. (2020, October). A neural architecture for detecting user confusion in eye-tracking data. In Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 15-23).

4. El Kaliouby, Rana, and Peter Robinson. (2004). Mind reading machines: Automated inference of cognitive mental states from video. In 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No. 04CH37583), vol. 1, pp. 682-688. IEEE.

5. Balducci, Fabrizio, Donato Impedovo, Nicola Macchiarulo, and Giuseppe Pirlo. (2020). Affective states recognition through touch dynamics. Multimedia Tools and Applications 79, no. 47: 35909-35926.

Multimodality: multimodal behaviour, annotations and tools

6. Barange, Mukesh, Sandratra Rasendrasoa, Maël Bouabdelli, Julien Saunier, and Alexandre Pauchet. (2022, November) Impact of adaptive multimodal empathic behavior on the user interaction. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents.

7. Ekstedt, Erik, and Gabriel Skantze. (2022, September) How Much Does Prosody Help Turn-taking? Investigations using Voice Activity Projection Models. In Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue.

8. Huang, J., Lin, Z., Yang, Z., & Liu, W. (2021, October). Temporal Graph Convolutional Network for Multimodal Sentiment Analysis. In Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 239-247).

9. Wessler, Janet, Tanja Schneeberger, Leon Christidis, and Patrick Gebhard. (2022, September). Virtual Backlash: Nonverbal expression of dominance leads to less liking of dominant female versus male agents. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents (IVA ’22).

10. Mahmood, Amama, and Chien-Ming Huang. (2022, September). Effects of Rhetorical Strategies and Skin Tones on Agent Persuasiveness in Assisted Decision-Making. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents (IVA ’22), Faro, Portugal.

11. Islam, Md Adnanul, Md Saddam Hossain Mukta, Patrick Olivier, and Md Mahbubur Rahman. (2022, September) Comprehensive guidelines for emotion annotation. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents.

12. Hartholt, Arno, Ed Fast, Zongjian Li, Kevin Kim, Andrew Leeds, and Sharon Mozgai. (2022, September) Re-architecting the virtual human toolkit: towards an interoperable platform for embodied conversational agent research and development. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents.

Multimodal fusion, dialogue modelling and management

13. Hirano, Y., Okada, S., & Komatani, K. (2021, October). Recognizing Social Signals with Weakly Supervised Multitask Learning for Multimodal Dialogue Systems. In Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 141-149).

14. Hidey, Christopher, Fei Liu, and Rahul Goel. (2022, September). Reducing Model Churn: Stable Re-training of Conversational Agents. In Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 14–25, Edinburgh, UK. Association for Computational Linguistics.

15. Pecune, F., & Marsella, S. (2020, October). A framework to co-optimize task and social dialogue policies using Reinforcement Learning. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (pp. 1-8).

16. Han, W., Chen, H., Gelbukh, A., Zadeh, A., Morency, L. P., & Poria, S. (2021, October). Bi-bimodal modality fusion for correlation-controlled multimodal sentiment analysis. In Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 6-15).

17. Johnson, E., & Gratch, J. (2020, October). The Impact of Implicit Information Exchange in Human-agent Negotiations. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (pp. 1-8).

Multimodal dialogue systems & applications

18. Kawasaki, M., Yamashita, N., Lee, Y. C., & Nohara, K. (2020, October). Assessing Users’ Mental Status from their Journaling Behavior through Chatbots. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (pp. 1-8).

19. Tavabi, L., Stefanov, K., Nasihati Gilani, S., Traum, D., & Soleymani, M. (2019, October). Multimodal Learning for Identifying Opportunities for Empathetic Responses. In 2019 International Conference on Multimodal Interaction, pp. 95-104.

20. Bouman, Katja, Iulia Lefter, Laurens Rook, Catherine Oertel, Catholijn Jonker, and Frances Brazier. (2022, September) The need for a female perspective in designing agent-based negotiation support. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents.

21. Murali, Prasanth, Farnaz Nouraei, Mina Fallah, Aisling Kearns, Keith Rebello, Teresa O’Leary, Rebecca Perkins et al. (2022, September) Training lay counselors with virtual agents to promote vaccination. In Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents.

22. Speer, S., Hamner, E., Tasota, M., Zito, L., & Byrne-Houser, S. K. (2021, October). MindfulNest: Strengthening Emotion Regulation with Tangible User Interfaces. In Proceedings of the 2021 International Conference on Multimodal Interaction (pp. 103-111).

23. Hedeshy, Ramin, Chandan Kumar, Mike Lauer, and Steffen Staab. (2022). All birds must fly: the experience of multimodal hands-free gaming with gaze and nonverbal voice synchronization. In Proceedings of the 24th ACM International Conference on Multimodal Interaction.

For any questions, please send an email to:

v.petukhova@lsv.uni-saarland.de

Use subject tag: [MDS_2023]