Empathic Computing

Robust multimodal emotion recognition under missing and incomplete data with cross-modal regeneration

Aims: Multimodal emotion recognition (MER) can outperform unimodal approaches by integrating complementary information from multiple sources. However, real-world applications often involve incomplete or missing modalities, limiting the reliability of existing MER models. This study aims to develop a framework that remains robust under missing modality conditions while preserving the benefits of multimodal integration.

Methods: To address this challenge, we propose a cross-modal latent regeneration and attention long short-term memory (CMLR-ALSTM) framework. The framework integrates pretrained variational autoencoder (VAE) encoders with residual projection networks trained with an L2 loss to achieve stable, effective latent-space alignment across modalities. The regenerated latent features, along with the original available ones, are then integrated through a cross-modal attention mechanism and passed to a long short-term memory (LSTM) network to capture temporal dependencies and enhance multimodal fusion under incomplete data conditions.
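The regeneration and fusion steps described in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, latent dimensions, the two-layer residual form, and the scaled dot-product attention are all assumptions chosen for illustration, the latent vectors are random stand-ins for VAE encoder outputs, and the LSTM stage is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_projection(z_src, w1, w2):
    """Project a source-modality latent toward a target latent space.

    Assumed residual form: z_src + relu(z_src @ w1) @ w2.
    """
    hidden = np.maximum(z_src @ w1, 0.0)
    return z_src + hidden @ w2

def l2_loss(z_pred, z_target):
    """Mean squared Euclidean distance between predicted and target latents."""
    return float(np.mean(np.sum((z_pred - z_target) ** 2, axis=-1)))

def cross_modal_attention(latents):
    """Scaled dot-product attention across modalities.

    latents has shape (batch, n_modalities, dim).
    """
    dim = latents.shape[-1]
    scores = latents @ latents.transpose(0, 2, 1) / np.sqrt(dim)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ latents, weights

# Hypothetical sizes: 4 samples, 8-dimensional latents, 16 hidden units.
batch, dim, hidden_units = 4, 8, 16
z_available = rng.normal(size=(batch, dim))   # stand-in for an encoder output
w1 = rng.normal(scale=0.1, size=(dim, hidden_units))
w2 = np.zeros((hidden_units, dim))            # zero init: projection starts as identity

# Regenerate the latent of the missing modality from the available one.
z_regenerated = residual_projection(z_available, w1, w2)

# Fuse the available and regenerated latents with cross-modal attention.
stacked = np.stack([z_available, z_regenerated], axis=1)
fused, weights = cross_modal_attention(stacked)
```

In a full pipeline under these assumptions, `w1` and `w2` would be trained by minimizing `l2_loss` between the regenerated latent and the true latent of the target modality, and `fused` would be computed per time step and fed to a recurrent network.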

Results: We conducted two sets of experiments. First, we evaluated the proposed CMLR-ALSTM framework on three benchmark datasets under the complete-modality scenario, where all modalities were available, and compared its performance against state-of-the-art methods and baseline models. Second, to assess model robustness, we evaluated CMLR-ALSTM under missing and partially missing modality scenarios. Experimental results on the DEAP, MAHNOB-HCI, and SEED-IV datasets demonstrate that CMLR-ALSTM achieves up to 17.22% improvement under missing-modality conditions while maintaining competitive performance with state-of-the-art methods, highlighting its ability to preserve cross-modal relationships and maintain robust latent representations.

Conclusion: The experimental results confirm the effectiveness of the proposed CMLR-ALSTM framework for MER, particularly in realistic environments where data availability is inconsistent. By leveraging CMLR to reconstruct missing representations and modelling temporal dependencies through LSTM, the proposed approach provides a more robust and reliable MER framework for practical deployment. In addition, results across different combinations of modalities indicate that the proposed framework generalizes well across heterogeneous multimodal settings.

Behzad Mahaseni, Naimul Mefraz Khan

DOI: https://doi.org/10.70401/ec.2026.0023 - July 02, 2026

Effects of a robotic storytelling intervention integrating music and sound effects on prejudice toward mental illness

Aims: Mental illnesses such as anxiety disorder, obsessive-compulsive disorder and intrusive thoughts affect millions worldwide and have become increasingly visible in recent decades, particularly in young adults. Despite greater public awareness and discussion, significant stigma remains, undermining the self-confidence and well-being of those affected. Anti-stigmatization interventions can help reduce prejudice by promoting education and meaningful contact between people with and without mental illness. Technology-based interventions may further support these efforts by simulating such contact through approaches such as virtual perspective-taking or storytelling. In this context, social robots, as physically embodied agents with social presence and multimodal communication capabilities, may offer additional advantages for storytelling-based interventions. In particular, integrating music and sound effects may improve outcomes. To this end, we evaluated a robotic storyteller as an intervention method, focusing on the influence of additional sound integration.

Methods: In multimodal robotic storytelling, modalities such as voice modulation and bodily expressions have been extensively studied, while non-speech sounds, namely sound effects and background music, remain largely overlooked, despite their importance in related media such as audiobooks and films. To address this gap, we compared a robotic storyteller that used only voice and bodily expression to narrate a story about a person experiencing a panic attack and intrusive thoughts with versions that added sound effects, background music, or both. A laboratory study examined how these modalities affected prejudice, empathy, and narrative transportation, using questionnaires and behavioral observation.

Results: The comparison yielded mixed results. Adding non-speech sounds, in any combination, did not affect story recipients’ prejudice differently from robotic storytelling without additional sounds. However, adding both sound effects and background music increased transportation into the story and improved associative empathy, whereas adding music alone decreased associative empathy. Although mediation was not indicated, the analyses revealed transportation as a predictor of both empathy and prejudice, while transportation itself was only minimally influenced by sound integration.

Conclusion: The effects of integrating music and/or sound effects into a robotic storytelling intervention on recipients’ prejudice were mixed, suggesting either combining both sound types or omitting them entirely. Furthermore, transportation emerged as a key lever for increasing empathy and decreasing prejudice in a robotic storytelling intervention and warrants further investigation. Future work is therefore needed to gain deeper insights into robotic storytelling as an intervention tool for reducing prejudice and stigma, including work on increasing transportation as a modifiable factor and the integration of pre-post measurements.

Sophia C. Steinhaeusser, ... Birgit Lugrin

DOI: https://doi.org/10.70401/ec.2026.0022 - June 26, 2026

Volume 2, Issue 3
