Adaptive Multi-Modal Driver Drowsiness Detection Using Temporal Convolutional Networks

Published 13 June 2026

Updated 13 June 2026

Authors:

A. Karunamurthy, Associate Professor, Dept. of CSE, Sri Manakula Vinayagar Engineering College (SMVEC), Puducherry, India. karunamurthy26@gmail.com (corresponding author)

P. Gurumoorthy, PG Student, Dept. of MCA, Sri Manakula Vinayagar Engineering College (SMVEC), Puducherry, India. gurumoorthy0809@gmail.com

Abstract

Driver drowsiness remains a leading cause of road accidents, yet existing detection systems often apply rigid thresholds to isolated physiological signals, which leads to high false-positive rates and poor generalization across individuals. We propose a multi-modal adaptive framework that integrates facial landmark detection, eye and mouth aspect ratios, and head pose estimation into a unified temporal model for real-time fatigue scoring. The system first extracts key facial points using a MobileNetV3 backbone with a coordinate regression head, and computes geometric ratios such as the Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) from those points. Head pose angles are estimated in parallel by solving the Perspective-n-Point (PnP) problem. Instead of applying fixed thresholds to these individual metrics, we feed the continuous feature sequence, which includes a blink frequency derived from the temporal derivative of the EAR, into a Temporal Convolutional Network (TCN) with dilated causal convolutions. The TCN captures long-range temporal dependencies that distinguish transient distractions from genuine drowsiness patterns. A fully connected layer with sigmoid activation then outputs a scalar fatigue score between zero and one, representing the probability of a drowsy state. This adaptive score replaces conventional binary alert logic, allowing the system to respond to complex, multivariate fatigue signatures such as yawning that coincides with head nodding. The proposed method reduces false alarms caused by natural facial expressions or environmental factors, and because the framework is learned end to end, it requires no manual threshold tuning. Experimental results demonstrate that the temporal integration of multiple modalities significantly improves detection accuracy compared with single-metric approaches. The framework is computationally efficient and suitable for real-time deployment in embedded driver monitoring systems.
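
The abstract does not give the formula used for the Eye Aspect Ratio or for the blink feature. As an illustration only, the sketch below assumes the commonly used six-landmark definition, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), and counts blink onsets as sharp negative excursions of the frame-to-frame EAR derivative. The function names, the landmark ordering, and the `drop_rate` value are hypothetical and are not taken from the paper.

```python
import math


def eye_aspect_ratio(pts):
    """EAR from six (x, y) eye landmarks ordered p1..p6 (assumed ordering).

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the two vertical landmark pairs.
    """
    p1, p2, p3, p4, p5, p6 = pts
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def ear_derivative(ear_series, fps):
    """Forward difference of the EAR sequence, in EAR units per second."""
    return [(b - a) * fps for a, b in zip(ear_series, ear_series[1:])]


def count_blink_onsets(ear_series, fps, drop_rate=-3.0):
    """Count falling edges where the derivative goes below drop_rate.

    drop_rate is a hypothetical value used only to build the blink
    feature; it is not a drowsiness decision threshold.
    """
    onsets = 0
    closing = False
    for v in ear_derivative(ear_series, fps):
        if v < drop_rate and not closing:
            onsets += 1          # start of a new eye-closing event
            closing = True
        elif v >= drop_rate:
            closing = False      # derivative recovered; allow a new onset
    return onsets


# Usage with made-up landmark coordinates and a made-up EAR trace.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear_trace = [0.30, 0.30, 0.10, 0.05, 0.28, 0.30, 0.30, 0.08, 0.29]
ear_value = eye_aspect_ratio(open_eye)
blinks = count_blink_onsets(ear_trace, fps=30)
```

Dividing the onset count by the window duration gives a blink frequency that could be appended to the per-frame feature vector. A Mouth Aspect Ratio can be computed the same way from mouth landmarks, again assuming an analogous landmark layout.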

Keywords: Driver drowsiness detection · Temporal convolutional network · Facial landmark detection · Eye aspect ratio · Mouth aspect ratio · Head pose estimation · Multimodal learning · Real-time driver monitoring
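
To make the dilated causal convolution named in the abstract concrete, the following is a minimal single-channel sketch in plain Python. It is not the authors' network: a real TCN would use many channels, residual blocks, and learned weights, whereas every weight, dilation, and function name here is a hypothetical placeholder. It shows only that each output depends on the current and past frames, and how stacking dilations widens the temporal context before a sigmoid maps the final activation to a score in (0, 1).

```python
import math


def causal_dilated_conv1d(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], zero-padded on the left."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:          # never look at future frames
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Frames seen by the last output, with one convolution per dilation level."""
    return 1 + (kernel_size - 1) * sum(dilations)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fatigue_score(x, layers, out_weight, out_bias):
    """Stack causal convolutions with ReLU, then squash the last step to (0, 1)."""
    h = x
    for weights, dilation in layers:
        h = [max(0.0, v) for v in causal_dilated_conv1d(h, weights, dilation)]
    return sigmoid(out_weight * h[-1] + out_bias)


# Usage with placeholder values (weights are hand-set, not trained).
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
conv_out = causal_dilated_conv1d(feature, [1.0, 1.0], dilation=2)
layers = [([0.5, 0.5], 1), ([0.5, 0.5], 2), ([0.5, 0.5], 4)]
score = fatigue_score([0.3] * 8, layers, out_weight=2.0, out_bias=-1.0)
span = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

With kernel size 2 and dilations 1, 2, and 4, the last output sees 8 consecutive frames; doubling the dilation at each level grows this span exponentially with depth, which is what lets a TCN cover long windows cheaply.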