Spatio-Temporal Deep Learning for Face Liveness Detection: A Resnet-GRU Approach

Published 24 March 2026

Updated 24 March 2026

Punam Chandrashekhar Wagale

Department: Computer
Organization: Ajeenkya D Y Patil College of Engineering, Lohegaon, Pune
Email: pawar.punam08@gmail.com

Guide Name: Dr. Pankaj Agarkar.

Department: Computer
Organization: Ajeenkya D Y Patil College of Engineering, Lohegaon, Pune
Email: pmagarkar@gmail.com

Abstract—Biometric authentication systems founded on face recognition are now embedded across a broad spectrum of real-world applications, making them high-value targets for spoofing attacks. Adversaries exploit artefacts ranging from two-dimensional printed photographs and replay video sequences to sculpted three-dimensional masks to deceive these systems. To counter such threats, face anti-spoofing (FAS), also termed presentation attack detection (PAD), has emerged as an indispensable safeguard within modern authentication pipelines. This paper presents a spatio-temporal deep learning framework that fuses a ResNet-50 spatial encoder with a Gated Recurrent Unit (GRU) temporal module to simultaneously capture liveness cues at both the texture and motion levels. Beyond the proposed system, a structured review of contemporary deep FAS methodologies is provided, covering pixel-wise supervisory signals, domain-invariant training strategies, open-set evaluation protocols, and sensor-aware multi-modal architectures. Experimental results on a curated dataset of 1,250 samples yield a classification accuracy of 96.2%, a false acceptance rate of 3.4%, and a false rejection rate of 3.8%, outperforming several recently published baseline methods. Key open challenges and prospective research directions are identified to guide further development of robust, deployment-ready FAS systems.
Keywords—Face anti-spoofing; presentation attack detection; spatio-temporal deep learning; liveness detection; multi-modal generalization.
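
The abstract reports accuracy, false acceptance rate (FAR), and false rejection rate (FRR) without stating how they are computed. The sketch below illustrates the commonly used definitions, assuming binary labels with 1 for a live face and 0 for a spoof: FAR is the share of spoof samples accepted as live, and FRR is the share of live samples rejected as spoof. The function name `liveness_metrics`, the `LivenessMetrics` container, and the toy labels are hypothetical and are not taken from the paper; the paper's own evaluation protocol and class split may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LivenessMetrics:
    accuracy: float
    far: float  # false acceptance rate: spoof samples accepted as live
    frr: float  # false rejection rate: live samples rejected as spoof


def liveness_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> LivenessMetrics:
    """Compute accuracy, FAR, and FRR for binary labels (1 = live, 0 = spoof)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    live_total = sum(1 for t, _ in pairs if t == 1)
    spoof_total = sum(1 for t, _ in pairs if t == 0)
    if live_total == 0 or spoof_total == 0:
        raise ValueError("need at least one live and one spoof sample")

    # A false accept is a spoof (0) predicted as live (1);
    # a false reject is a live sample (1) predicted as spoof (0).
    false_accepts = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_rejects = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)

    return LivenessMetrics(
        accuracy=correct / len(pairs),
        far=false_accepts / spoof_total,
        frr=false_rejects / live_total,
    )


# Toy, made-up labels (not data from the paper): 4 live and 4 spoof samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
m = liveness_metrics(y_true, y_pred)
print(f"accuracy={m.accuracy:.3f} FAR={m.far:.3f} FRR={m.frr:.3f}")
# prints: accuracy=0.750 FAR=0.250 FRR=0.250
```

Because FAR and FRR are normalised by the spoof and live counts separately, overall accuracy depends on the class balance of the test set as well as on the two error rates.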