Recent Advances in Multimodal Interfaces for Enhanced Human–Computer Interaction
Author: Sreeshylam Rasula, MCA, TG SET
Faculty of Computer Science, Government Degree College, Ibrahimpatnam,
Hyderabad, Telangana, India
Email: Sree.Rasula.Siddu@Gmail.Com
Abstract
Multimodal interfaces combine two or more input and/or output channels, such as text, speech, vision, touch, gaze, gesture, haptics, and physiological sensing, to create interaction styles that are more natural, accessible, and context-aware than single-modality systems. In the last few years, progress in multimodal machine learning, foundation models, wearable sensing, and spatial computing has reshaped how humans communicate intent to machines and how machines return feedback in real time. This article reviews recent advances in multimodal human–computer interaction (HCI) with an emphasis on: (i) multimodal fusion and alignment methods, (ii) multimodal large language models that connect language with perception, (iii) emerging sensing modalities (e.g., wrist sEMG) and XR interaction patterns (e.g., gaze + pinch), and (iv) evaluation practices that capture accuracy, latency, cognitive load, and user trust. A research methodology is presented for building and assessing multimodal interfaces, including dataset selection, signal preprocessing, fusion design, and user study protocols. Tables summarize modalities, fusion strategies, benchmark datasets, and evaluation metrics. Mathematical formulations cover fusion operators, attention-based alignment, and HCI performance laws. Finally, the paper consolidates findings, practical suggestions, and a forward-looking agenda addressing robustness under distribution shift, privacy, safety, and inclusive design.