Recent Advances in Multimodal Interfaces for Enhanced Human–Computer Interaction
Author: Sreeshylam Rasula, MCA, TG SET
Faculty of Computer Science, Government Degree College, Ibrahimpatnam,
Hyderabad, Telangana, India
Email: Sree.Rasula.Siddu@Gmail.Com
Abstract
Multimodal interfaces combine two or more input and/or output channels, such as text, speech, vision, touch, gaze, gesture, haptics, and physiological sensing, to create interaction styles that are more natural, accessible, and context-aware than single-modality systems. In the last few years, progress in multimodal machine learning, foundation models, wearable sensing, and spatial computing has reshaped how humans communicate intent to machines and how machines return feedback in real time. This article reviews recent advances in multimodal human–computer interaction (HCI) with an emphasis on: (i) multimodal fusion and alignment methods, (ii) multimodal large language models that connect language with perception, (iii) emerging sensing modalities (e.g., wrist sEMG) and XR interaction patterns (e.g., gaze + pinch), and (iv) evaluation practices that capture accuracy, latency, cognitive load, and user trust. A research methodology is presented for building and assessing multimodal interfaces, including dataset selection, signal preprocessing, fusion design, and user study protocols. Tables summarize modalities, fusion strategies, benchmark datasets, and evaluation metrics. Mathematical formulations cover fusion operators, attention-based alignment, and HCI performance laws. Finally, the paper consolidates findings, practical suggestions, and a forward-looking agenda addressing robustness under distribution shift, privacy, safety, and inclusive design.