DoorGuard-AI: A Hybrid Edge–Cloud Multi-Agent Framework for Intelligent Visitor Screening in Resource-Constrained Indian Households
Authors:
Mr Laukik Kande, Mr Yashovardhan Poddar, Mr Ketan Mali, Mr Shaikh Asraf
Under the Guidance of Dr Prajakta Shirke Feb, 2026-27
Department of Computer Science and Engineering, School of Computer Science and Engineering, Sandip University, Nashik, India
ABSTRACT
This paper presents DoorGuard-AI, a browser-native, hybrid edge–cloud intelligent doorbell system engineered for the threat landscape of Indian urban and semi-urban households. Unlike commodity smart doorbells that rely exclusively on cloud inference or implement only motion-triggered recording, DoorGuard-AI embeds a four-stage orchestrated agent pipeline (Perception, Intelligence, Decision, and Action) directly within a lightweight FastAPI server deployable on constrained hardware such as the Raspberry Pi 4B. The perception stage fuses the OpenThreatDetection framework for optimised threat and person detection, cloud-accelerated Whisper large-v3-turbo speech recognition with graceful fallback to offline VOSK, heuristic emotion inference, and a multi-signal anti-spoof scoring mechanism. Downstream, a deterministic intent classifier resolves fourteen visitor intent categories, with dangerous intents lexicographically prioritised so that ambiguous multi-intent phrases cannot downgrade threat severity. A 12-rule ordered decision engine converts composite risk scores into one of three terminal actions (auto-reply, notify-owner, or escalate) using thresholds validated against a curated set of realistic Indian doorstep scenarios, including OTP scams, occupancy probing, authority impersonation, and domestic-staff entry. All conversation turns, sensor outputs, agent decisions, and action records are persisted to a structured SQLite audit trail. An owner-facing React dashboard provides real-time WebSocket event delivery, pseudo-live MJPEG camera streaming, and remote text-or-voice reply capability.
Evaluation across 47 simulated scenario runs demonstrates that the system correctly escalates 94.3% of high-risk interactions, auto-resolves 91.2% of low-risk routine visits, and completes the full end-to-end pipeline within a median latency of 6.4 seconds under local WiFi conditions, with graceful degradation to rule-based canned responses when cloud services are unavailable.
Keywords: smart doorbell; multi-agent system; OpenThreatDetection; threat detection; FastAPI; edge-cloud AI; visitor screening; intent classification; Indian household security; LLM safety guardrails.
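The two deterministic mechanisms named in the abstract, priority-ordered intent resolution and a first-match ordered decision engine, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the intent labels, keyword lists, rule names, the [0, 1] risk scale, and the 0.7 and 0.4 thresholds are hypothetical and are not the system's actual fourteen categories or twelve rules.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical intent labels, listed from most to least dangerous.
# The real system has fourteen categories; only five are sketched here.
INTENT_PRIORITY = [
    "otp_scam",
    "authority_impersonation",
    "occupancy_probe",
    "delivery",
    "domestic_staff",
]
DANGEROUS = {"otp_scam", "authority_impersonation", "occupancy_probe"}

# Hypothetical keyword triggers per intent (illustrative only).
INTENT_KEYWORDS = {
    "otp_scam": ("otp", "verification code"),
    "authority_impersonation": ("police", "electricity board"),
    "occupancy_probe": ("anyone home", "anybody home"),
    "delivery": ("delivery", "parcel", "courier"),
    "domestic_staff": ("maid", "cook"),
}


def classify_intent(transcript: str) -> str:
    """Return the highest-priority intent whose keywords appear in the text."""
    text = transcript.lower()
    for intent in INTENT_PRIORITY:  # dangerous intents are checked first
        if any(keyword in text for keyword in INTENT_KEYWORDS[intent]):
            return intent
    return "unknown"


@dataclass(frozen=True)
class Signals:
    intent: str
    risk: float                    # composite risk score, assumed to lie in [0, 1]
    spoof_suspected: bool = False  # output of an anti-spoof check (assumed boolean)


# Each rule is (name, predicate, action); rules are evaluated top to bottom.
Rule = Tuple[str, Callable[[Signals], bool], str]

RULES: List[Rule] = [
    ("spoof", lambda s: s.spoof_suspected, "escalate"),
    ("dangerous_intent", lambda s: s.intent in DANGEROUS, "escalate"),
    ("high_risk", lambda s: s.risk >= 0.7, "escalate"),         # placeholder threshold
    ("medium_risk", lambda s: s.risk >= 0.4, "notify-owner"),   # placeholder threshold
]


def decide(signals: Signals) -> Tuple[str, str]:
    """Return (action, rule_name) for the first rule that matches."""
    for name, predicate, action in RULES:
        if predicate(signals):
            return action, name
    return "auto-reply", "default"


if __name__ == "__main__":
    phrase = "I have a parcel delivery, please share the OTP"
    intent = classify_intent(phrase)
    print(intent, decide(Signals(intent=intent, risk=0.1)))
```

In this sketch a phrase that mentions both a delivery and an OTP resolves to the dangerous intent because that intent is checked first, and the ordered rule list then escalates regardless of a low composite score. Returning the matching rule name alongside the action is one way to make each decision traceable in an audit log.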