ARTIFY: A Unified Generative AI Platform for Integrated Text-to-Image Synthesis and Guided Neural Style Transfer with Background Isolation
Authors:
Balla Jhansi, Akkajousyla Vinodhini, Mokshayagna Sai Kumar Gompa, Mailapalli Naveen
Department of Information Technology, MVGR College of Engineering (Autonomous), Vizianagaram, India
Guide: Dr. M. Chandra Sekhar, Associate Professor, Department of Information Technology
Abstract—Contemporary generative AI tools typically serve either prompt-driven image creation or artistic transformation, rarely both within a coherent, production-ready framework. This paper introduces ARTIFY, a hybrid generative platform that consolidates text-to-image synthesis, neural style transfer, and foreground isolation into a single service architecture exposed through a lightweight FastAPI microservice. The synthesis pathway employs Stable Diffusion XL (SDXL), conditioned on composite natural-language prompts that combine the user's text with user-supplied artistic style tokens and mandatory quality anchors. The stylization pathway uses ControlNet guided by Canny edge maps to apply a selected artistic style to uploaded photographs while preserving their spatial structure. An auxiliary foreground-isolation component, built on the Rembg library and the U²-Net salient object detection network, enables transparent-background PNG export for compositing workflows. Evaluation across three artistic style categories (Oil Paint, Watercolor, and Cyberpunk) on 120 curated test images yields a mean precision of 1.00, recall of 0.92, and F1-score of 0.96, with Top-1 accuracy of 93% and Top-3 accuracy of 99%. A comparative capability benchmark shows that ARTIFY is the only evaluated platform that simultaneously delivers text-to-image generation, guided neural style transfer, background removal, and a unified programmatic API. These findings support the technical viability and practical advantages of integrated multi-capability generative platforms for creative, educational, and commercial content production.
Index Terms—Generative AI, Stable Diffusion XL, Neural Style Transfer, ControlNet, Canny Edge Detection, FastAPI, Diffusion Models, Background Removal, U²-Net, Deep Learning, Vision-Language Models.
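As an illustration of the composite prompts mentioned in the abstract, the following minimal sketch shows one way a user prompt, a style token, and fixed quality anchors might be joined into a single conditioning string. The names STYLE_TOKENS, QUALITY_ANCHORS, and build_prompt, along with every token string, are hypothetical; the paper's front matter does not specify the actual wording used by ARTIFY.

```python
# Hypothetical style tokens, keyed by the three evaluated style categories.
# The actual token strings used by ARTIFY are not given in the source.
STYLE_TOKENS = {
    "oil_paint": "oil painting, visible brushstrokes",
    "watercolor": "watercolor, soft washes",
    "cyberpunk": "cyberpunk, neon lighting",
}

# Hypothetical "mandatory quality anchors" appended to every prompt.
QUALITY_ANCHORS = ("highly detailed", "sharp focus")


def build_prompt(user_prompt: str, style: str) -> str:
    """Join the user prompt, the chosen style tokens, and the quality anchors."""
    if style not in STYLE_TOKENS:
        raise ValueError(f"unknown style: {style!r}")
    parts = [user_prompt.strip(), STYLE_TOKENS[style], *QUALITY_ANCHORS]
    # Skip empty parts so a blank user prompt does not leave a leading comma.
    return ", ".join(part for part in parts if part)


print(build_prompt("a lighthouse at dusk", "watercolor"))
# a lighthouse at dusk, watercolor, soft washes, highly detailed, sharp focus
```

Under these assumptions, the resulting string would be passed as the text prompt to the synthesis pathway; rejecting unknown styles early keeps malformed requests from reaching the model.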