Table Of Contents (4 Articles)
Multimodal emotion recognition with disentangled representations: private-shared multimodal variational autoencoder and long short-term memory framework
Aims: This study proposes a multimodal emotion recognition framework that combines a private-shared disentangled multimodal variational autoencoder (DMMVAE) with a long short-term memory (LSTM) network, herein referred to as DMMVAE-LSTM. The primary objective is to improve the robustness and generalizability of emotion recognition by effectively leveraging the complementary features of electroencephalogram (EEG) signals and facial expression data.
Methods: We first trained a variational autoencoder using a ResNet-101 architecture on a large-scale facial dataset to develop a robust and generalizable facial feature extractor. This pre-trained model was then integrated into the DMMVAE framework, together with a convolutional neural network-based encoder and decoder for EEG data. The DMMVAE model was trained to disentangle shared and modality-specific latent representations across both EEG and facial data. Following this, the outputs of the encoders were concatenated and fed into an LSTM classifier for emotion recognition.
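As a rough illustration of the pipeline outlined in the Methods, the minimal PyTorch sketch below shows how private and shared latents from two modality encoders can be concatenated per time step and passed to an LSTM classifier. It is only a sketch under stated assumptions: it takes precomputed EEG and facial feature vectors in place of the paper's CNN and ResNet-101 encoders, omits the decoders and the VAE reconstruction and KL terms used to train DMMVAE, and all class names, latent sizes, and dimensions are illustrative rather than the authors' configuration.

```python
# Sketch of a private-shared multimodal VAE feeding an LSTM classifier.
# All sizes and names are illustrative assumptions, not the authors' setup.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an input feature vector to private and shared Gaussian latents."""
    def __init__(self, in_dim, private_dim=16, shared_dim=16):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu_p = nn.Linear(128, private_dim)
        self.logvar_p = nn.Linear(128, private_dim)
        self.mu_s = nn.Linear(128, shared_dim)
        self.logvar_s = nn.Linear(128, shared_dim)

    def forward(self, x):
        h = self.backbone(x)
        return (self.mu_p(h), self.logvar_p(h)), (self.mu_s(h), self.logvar_s(h))

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

class DMMVAEClassifier(nn.Module):
    """EEG and face encoders -> concatenated latents per time step -> LSTM -> emotion logits."""
    def __init__(self, eeg_dim, face_dim, n_classes=4):
        super().__init__()
        self.eeg_enc = Encoder(eeg_dim)
        self.face_enc = Encoder(face_dim)
        # 2 private + 2 shared latents of size 16 each -> 64-dim input per time step
        self.lstm = nn.LSTM(input_size=64, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, eeg_seq, face_seq):
        # eeg_seq: (batch, time, eeg_dim), face_seq: (batch, time, face_dim)
        z_steps = []
        for t in range(eeg_seq.size(1)):
            (mu_pe, lv_pe), (mu_se, lv_se) = self.eeg_enc(eeg_seq[:, t])
            (mu_pf, lv_pf), (mu_sf, lv_sf) = self.face_enc(face_seq[:, t])
            z = torch.cat([
                reparameterize(mu_pe, lv_pe), reparameterize(mu_se, lv_se),
                reparameterize(mu_pf, lv_pf), reparameterize(mu_sf, lv_sf),
            ], dim=-1)
            z_steps.append(z)
        out, _ = self.lstm(torch.stack(z_steps, dim=1))
        return self.head(out[:, -1])  # classify from the final time step

model = DMMVAEClassifier(eeg_dim=160, face_dim=512)
logits = model(torch.randn(8, 10, 160), torch.randn(8, 10, 512))
print(logits.shape)  # torch.Size([8, 4])
```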
Results: Two sets of experiments were conducted. First, we trained and evaluated our model on the full dataset, comparing its performance with state-of-the-art methods and a baseline LSTM model employing a late fusion strategy to combine EEG and facial features. Second, to assess robustness, we tested the DMMVAE-LSTM framework under data-limited and modality dropout conditions by training with partial data and simulating missing modalities. The results demonstrate that the DMMVAE-LSTM framework consistently outperforms the baseline, especially in scenarios with limited data, indicating its capacity to learn structured and resilient latent representations.
Conclusion: Our findings underscore the benefits of multimodal generative modeling for emotion recognition, particularly in enhancing classification performance when training data are scarce or partially missing. By effectively learning both shared and private representations, the DMMVAE-LSTM framework facilitates more reliable emotion classification and presents a promising solution for real-world applications where acquiring large labeled datasets is challenging.
Behzad Mahaseni, Naimul Mefraz Khan
DOI:https://doi.org/10.70401/ec.2025.0010 - June 29, 2025
Empathic extended reality in the era of generative AI
Aims: Extended reality (XR) has been widely recognized for its ability to evoke empathetic responses by immersing users in virtual scenarios and promoting perspective-taking. However, to fully realize the empathic potential of XR, it is necessary to move beyond the concept of XR as a unidirectional “empathy machine.” This study proposes a bidirectional “empathy-enabled XR” framework, wherein XR systems not only elicit empathy but also demonstrate empathetic behaviors by sensing, interpreting, and adapting to users’ affective and cognitive states.
Methods: Two complementary frameworks are introduced. The first, the Empathic Large Language Model (EmLLM) framework, integrates multimodal user sensing (e.g., voice, facial expressions, physiological signals, and behavior) with large language models (LLMs) to enable bidirectional empathic communication. The second, the Matrix framework, leverages multimodal user and environmental inputs alongside multimodal LLMs to generate context-aware 3D objects within XR environments. This study presents the design and evaluation of two prototypes based on these frameworks: a physiology-driven EmLLM chatbot for stress management, and a Matrix-based mixed reality (MR) application that dynamically generates everyday 3D objects.
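To make the EmLLM idea more concrete, the short sketch below illustrates one way multimodal sensing could be folded into an LLM query: a coarse stress estimate from physiological signals is written into the prompt so the model can respond empathically. This is not the EmLLM implementation; the thresholds, labels, and function names are hypothetical placeholders, and the resulting prompt would be sent to whatever LLM backend the system uses.

```python
# Illustrative only: condition an LLM reply on a coarse stress estimate derived
# from physiological sensing. Thresholds and labels are hypothetical.
from dataclasses import dataclass

@dataclass
class UserState:
    heart_rate_bpm: float
    skin_conductance_uS: float
    transcript: str  # what the user just said

def estimate_stress(state: UserState) -> str:
    """Very coarse rule-of-thumb label; a real system would use a trained model."""
    if state.heart_rate_bpm > 100 or state.skin_conductance_uS > 12.0:
        return "high"
    if state.heart_rate_bpm > 85:
        return "moderate"
    return "low"

def build_empathic_prompt(state: UserState) -> str:
    """Fold the sensed affective state into the system prompt before querying the LLM."""
    stress = estimate_stress(state)
    return (
        "You are a supportive stress-management assistant.\n"
        f"Sensed stress level: {stress} "
        f"(HR {state.heart_rate_bpm:.0f} bpm, EDA {state.skin_conductance_uS:.1f} uS).\n"
        "Acknowledge how the user seems to feel before offering guidance.\n"
        f"User: {state.transcript}"
    )

prompt = build_empathic_prompt(UserState(104, 13.5, "I can't keep up with my deadlines."))
print(prompt)  # this prompt would then be passed to the chosen LLM
```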
Results: The EmLLM-based chatbot achieved 85% accuracy in stress detection, with participants reporting strong therapeutic alliance scores. In the Matrix framework, the use of a pre-generated 3D model repository significantly reduced graphics processing unit utilization and improved system responsiveness, enabling real-time scene augmentation on resource-constrained XR devices.
Conclusion: By integrating EmLLM and Matrix, this research establishes a foundation for empathy-enabled XR systems that dynamically adapt to users’ needs, affective and cognitive states, and situational contexts through real-time 3D content generation. The findings demonstrate the potential of such systems in diverse applications, including mental health support and collaborative training, thereby opening new avenues for immersive, human-centered XR experiences.
Poorvesh Dongre, ... Denis Gračanin
DOI:https://doi.org/10.70401/ec.2025.0009 - June 29, 2025
Integrating colored lights into multimodal robotic storytelling
Aims: Storytelling has evolved alongside human culture, giving rise to new media such as social robots. While these robots employ modalities similar to those used by humans, they can also utilize non-biomimetic modalities, such as color, which are commonly associated with emotions. As research on the use of colored light in robotic storytelling remains limited, this study investigates its integration through three empirical studies.
Methods: We conducted three studies to explore the impact of colored light in robotic storytelling. The first study examined the effect of emotion-inducing colored lighting in romantic storytelling. The second study employed an online survey to determine appropriate light colors for specific emotions, based on images of the robot’s emotional expressions. The third study compared four lighting conditions in storytelling: emotion-driven colored lights, context-based colored lights, constant white light, and no additional lighting.
Results: The first study found that while colored lighting did not significantly influence storytelling experience or perception of the robot, it made recipients feel more serene. The second study showed improved recognition of amazement, rage, and neutral emotional states when colored light accompanied body language. The third study revealed no significant differences across lighting conditions in terms of storytelling experience, emotions, or robot perception; however, participants generally appreciated the use of colored lights. Emotion-driven lighting received slightly more favorable subjective evaluations.
Conclusion: Colored lighting can enhance the emotional expressiveness of robots. Both emotion-driven and context-based lighting strategies are appropriate for robotic storytelling. Through this series of studies, we contribute to the understanding of how colored lights are perceived in robotic communication, particularly within storytelling contexts.
Sophia C. Steinhaeusser, ... Birgit Lugrin
DOI:https://doi.org/10.70401/ec.2025.0008 - May 10, 2025
Investigating the 'I' in team: development and evaluation of an individual-level IMO model for augmented reality-mediated remote collaboration
Aims: This study aims to enhance the design of augmented reality (AR) technologies for remote collaboration by examining the complex relationships among individual factors (user characteristics), psychological and physiological states during AR-mediated remote collaboration, and outcomes within an Input-Mediator-Output (IMO) model. The goal is to evaluate how individual characteristics influence psychological and physiological experiences, as well as task performance, in AR-mediated collaboration.
Methods: We hypothesize and evaluate an IMO model and use correlation analyses to examine the relationships among person-related input variables (e.g., predispositions, traits, attitudes, states, and contextual factors), psychological and physiological emergent states, and performance-related output variables.
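The sketch below shows the general shape of such a correlation analysis: relating person-related inputs to emergent mediator states and performance outputs. The variable names and synthetic data are illustrative assumptions, not the study's dataset or exact measures.

```python
# Illustrative correlation analysis across IMO-style variables (synthetic data).
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 40  # participants (illustrative)
df = pd.DataFrame({
    "technology_affinity": rng.normal(3.5, 0.8, n),     # input: trait/attitude
    "perceived_workload": rng.normal(50, 12, n),         # mediator: psychological state
    "mean_heart_rate": rng.normal(78, 9, n),              # mediator: physiological state
    "task_completion_time_s": rng.normal(300, 45, n),     # output: performance
})

# Pairwise correlations between inputs, mediators, and outputs
for x, y in [("technology_affinity", "perceived_workload"),
             ("perceived_workload", "task_completion_time_s"),
             ("mean_heart_rate", "task_completion_time_s")]:
    r, p = pearsonr(df[x], df[y])
    print(f"{x} vs {y}: r = {r:.2f}, p = {p:.3f}")
```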
Results: Our results demonstrate that individual characteristics significantly influence subjective experiences, physiological responses, and task performance, emphasizing the critical role of individual differences, alongside task- and technology-related factors, in shaping collaboration experiences and performance. These findings highlight the importance of considering individual characteristics in the design of AR tools to optimize user well-being and performance outcomes.
Conclusion: Our study provides a foundational framework for understanding the interplay between individuals, tasks, and technology, underscoring the need for AR tools that align with user characteristics. It also lays the groundwork for future IMO research in AR-mediated remote collaboration, contributing to the development of more effective and health-promoting AR technologies.
Lisa Thomaschewski, ... Annette Kluge
DOI:https://doi.org/10.70401/ec.2025.0007 - April 16, 2025