Multimodal Speech Technology Architecture

With the rise of chatbots and voice interfaces, conversations with machines are becoming more human-like. But when people communicate, they use far more than their voice, including gestures, emotions and facial expressions. Supporting such multimodal communication will be an important requirement for human-computer interaction in the future. In this talk we outline core concepts and a basic architecture of multimodal systems. We illustrate the advantages of multimodal interfaces through several technology demonstrators developed in research and industrial projects at DFKI.

Dec 12


20 mins


Stefan Schaffer

Senior Researcher, DFKI GmbH

Stefan Schaffer is a Senior Researcher and Head of the Human-AI Interaction group in the Cognitive Assistants department of the German Research Center for Artificial Intelligence (DFKI).
As project leader of numerous industrial and applied research projects, he has realised human-AI interfaces using conversational, multisensory, and implicit interaction in domains such as mobility, automotive, tax information, and customer service.
Currently, he is working on AI chatbots for value chains and hybrid events. Before joining DFKI, Stefan worked as a product manager at Linon Medien. He studied communication science and computer science and completed his doctorate at the Technical University of Berlin in the field of multimodal human-computer interaction.
