Spatio-Temporal Deep Learning for Face Liveness Detection: A Resnet-GRU Approach
- Create Date 24 March 2026
- Last Updated 24 March 2026
Punam Chandrashekhar Wagale
Department: Computer
Organization: Ajeenkya D Y Patil College of Engineering, Lohegaon, Pune
Email: pawar.punam08@gmail.com
Guide Name: Dr. Pankaj Agarkar.
Department: Computer
Organization: Ajeenkya D Y Patil College of Engineering, Lohegaon, Pune
Email: pmagarkar@gmail.com
Abstract—Biometric authentication systems founded on face recognition are now embedded across a broad spectrum of real-world applications, making them high-value targets for spoofing attacks. Adversaries exploit artefacts ranging from two-dimensional printed photographs and replay video sequences to sculpted three-dimensional masks to deceive these systems. To counter such threats, face anti-spoofing (FAS), also termed presentation attack detection (PAD), has emerged as an indispensable safeguard within modern authentication pipelines. This paper presents a spatio-temporal deep learning framework that fuses a ResNet-50 spatial encoder with a Gated Recurrent Unit (GRU) temporal module to simultaneously capture liveness cues at both the texture and motion levels. Beyond the proposed system, a structured review of contemporary deep FAS methodologies is provided, covering pixel-wise supervisory signals, domain-invariant training strategies, open-set evaluation protocols, and sensor-aware multi-modal architectures. Experimental results on a curated dataset of 1,250 samples yield a classification accuracy of 96.2%, a false acceptance rate of 3.4%, and a false rejection rate of 3.8%, outperforming several recently published baseline methods. Key open challenges and prospective research directions are identified to guide further development of robust, deployment-ready FAS systems.
Keywords—Face anti-spoofing; presentation attack detection; spatio-temporal deep learning; liveness detection; multi-modal generalization.
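The abstract reports three figures: accuracy, false acceptance rate (FAR), and false rejection rate (FRR). As a minimal sketch of how such figures are commonly derived from binary liveness predictions, the snippet below assumes the usual definitions: FAR is the fraction of spoof samples accepted as live, and FRR is the fraction of live samples rejected as spoof. The label convention (1 = live, 0 = spoof), the names `PadMetrics` and `pad_metrics`, and the toy labels are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed label convention (not specified in the abstract).
LIVE, SPOOF = 1, 0


@dataclass
class PadMetrics:
    accuracy: float
    far: float  # spoof samples wrongly accepted as live
    frr: float  # live samples wrongly rejected as spoof


def pad_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> PadMetrics:
    """Compute accuracy, FAR and FRR from ground-truth and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    n_live = sum(1 for t in y_true if t == LIVE)
    n_spoof = sum(1 for t in y_true if t == SPOOF)
    if n_live == 0 or n_spoof == 0:
        raise ValueError("both live and spoof samples are required")

    pairs = list(zip(y_true, y_pred))
    false_accepts = sum(1 for t, p in pairs if t == SPOOF and p == LIVE)
    false_rejects = sum(1 for t, p in pairs if t == LIVE and p == SPOOF)
    correct = sum(1 for t, p in pairs if t == p)

    return PadMetrics(
        accuracy=correct / len(pairs),
        far=false_accepts / n_spoof,
        frr=false_rejects / n_live,
    )


# Toy labels for illustration only: 4 live and 6 spoof samples.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy = pad_metrics(toy_true, toy_pred)
print(f"accuracy={toy.accuracy:.3f} FAR={toy.far:.3f} FRR={toy.frr:.3f}")
```

Because FAR and FRR are normalised by the spoof and live counts separately, they stay meaningful on an imbalanced test set, where accuracy alone can hide a high error rate on the smaller class.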