International Scientific Journal of Engineering and Management

An International Scholarly || Multidisciplinary || Open Access || Indexing in all major Database & Metadata
The journal follows the UGC Guidelines and is evaluated for inclusion in the Web of Science
ISSN: 2583-6129

Impact Factor: 7.839

INBOX BOT: DYNAMIC SCREEN-BASED DATA EXTRACTION AND CLOUD-FREE ARCHIVING TOOL

  • Create Date 12 June 2025
  • Last Updated 12 June 2025

Authors:

1st P. Rajapandian, 2nd A. Abishaya

1st author: Associate Professor, Department of Computer Applications, Sri Manakula Vinayagar Engineering College (Autonomous), Puducherry 605008, India

2nd author: Postgraduate student, Department of Computer Applications, Sri Manakula Vinayagar Engineering College (Autonomous), Puducherry 605008, India

*Corresponding author’s email address: abishaya12a@gmail.com

ABSTRACT: This project presents a semi-intelligent system that automates the retrieval of text-based content from a web-based interface (such as Gmail or similar dashboards), processes the visual data using Optical Character Recognition (OCR), and stores it in a structured database for archiving, analytics, and reporting. The solution simulates human interaction using Python automation tools and does not require API-level access, making it adaptable to use cases such as document inboxes, webmail clients, and internal portals.

The automation pipeline involves opening the web interface in a browser, capturing the screen, and identifying key visual triggers (e.g., organizational codes, transaction keywords, or date ranges). Using PyAutoGUI, Tesseract OCR, and OpenCV, the system analyzes the layout and extracts relevant content. The extracted text is processed, tagged with metadata (such as timestamp and content type), and stored as .txt files. These files are later read by a backend PHP module, parsed based on delimiters, and inserted into a MySQL database hosted on a local XAMPP server.
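The capture, trigger-detection, and text-file stages described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `|` delimiter, the field order (timestamp, content type, text), the trigger patterns, and all function names are assumptions introduced here. The abstract states only that the output is delimited text tagged with a timestamp and content type.

```python
import re
from datetime import datetime
from pathlib import Path

DELIMITER = "|"  # assumed; the abstract does not name the delimiter

# Hypothetical visual triggers: transaction keywords and dd/mm/yyyy dates.
TRIGGER_PATTERNS = {
    "transaction": re.compile(r"\b(credited|debited|transaction)\b", re.IGNORECASE),
    "date": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}


def capture_screen_text():
    """Screenshot the desktop and OCR it.

    Imports are local so the pure-text helpers below can run on a machine
    without a display or a Tesseract installation.
    """
    import pyautogui
    import pytesseract

    screenshot = pyautogui.screenshot()  # returns a PIL image
    return pytesseract.image_to_string(screenshot)


def find_trigger_lines(ocr_text):
    """Return (content_type, line) pairs for OCR lines matching a trigger."""
    hits = []
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        for content_type, pattern in TRIGGER_PATTERNS.items():
            if pattern.search(line):
                hits.append((content_type, line))
                break  # tag each line with the first matching type only
    return hits


def format_record(content_type, text, when):
    """Build one delimited record: timestamp|content_type|text."""
    clean = text.replace(DELIMITER, " ").strip()  # keep the delimiter unambiguous
    return DELIMITER.join([when.strftime("%Y-%m-%d %H:%M:%S"), content_type, clean])


def archive(hits, out_path, when=None):
    """Append one record per hit to a .txt file and return the count."""
    when = when or datetime.now()
    with Path(out_path).open("a", encoding="utf-8") as fh:
        for content_type, text in hits:
            fh.write(format_record(content_type, text, when) + "\n")
    return len(hits)
```

A full run would call `archive(find_trigger_lines(capture_screen_text()), "inbox_capture.txt")`, where the file name is a placeholder. The capture step needs a graphical session and the Tesseract binary on the system path, while the remaining functions operate on plain strings and can be exercised separately.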

This modular approach makes the system reusable across domains where direct content access is restricted. Because it relies on screen-based OCR and visual recognition rather than backend integration, it suits restricted or legacy platforms that do not expose programmatic access, and it leaves the source environment unaltered. Processing and storage remain on the local machine, which reduces dependency on external services and keeps data flow, privacy, and security under the operator's control. The system is designed to run with minimal human intervention, which reduces the chance of manual errors and improves operational efficiency. Timestamping and categorization in the structured output support traceability, real-time monitoring, secure storage of time-sensitive information, historical tracking, and audit trails, which is particularly useful in banking, compliance, and customer support environments. Overall, this semi-intelligent, visually driven automation tool combines simplicity, adaptability, and functional precision, and serves as an alternative to more complex or less secure data integration pipelines.
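To illustrate how timestamped, categorized records support an audit trail, the sketch below parses delimited lines, loads them into a table, and queries one category over a time window. Several things here are assumptions rather than details from the paper: the record layout `timestamp|content_type|text`, the table and column names, and the use of Python's built-in `sqlite3` as a stand-in for the PHP module and MySQL database the authors describe.

```python
import sqlite3

DELIMITER = "|"  # assumed record layout: timestamp|content_type|text


def parse_record(line):
    """Split one delimited line into its three fields, or None if malformed."""
    parts = line.rstrip("\n").split(DELIMITER, 2)
    if len(parts) != 3:
        return None
    timestamp, content_type, text = (p.strip() for p in parts)
    return timestamp, content_type, text


def load_records(conn, lines):
    """Create the table if needed, insert well-formed records, return the count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS archive ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "captured_at TEXT NOT NULL, "
        "content_type TEXT NOT NULL, "
        "body TEXT NOT NULL)"
    )
    rows = [r for r in map(parse_record, lines) if r is not None]
    # Parameterized inserts keep OCR text from being interpreted as SQL.
    conn.executemany(
        "INSERT INTO archive (captured_at, content_type, body) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def audit_trail(conn, content_type, start, end):
    """Return (captured_at, body) rows of one type within [start, end]."""
    cur = conn.execute(
        "SELECT captured_at, body FROM archive "
        "WHERE content_type = ? AND captured_at BETWEEN ? AND ? "
        "ORDER BY captured_at",
        (content_type, start, end),
    )
    return cur.fetchall()
```

Storing timestamps as `YYYY-MM-DD HH:MM:SS` strings means that lexicographic order matches chronological order, so the `BETWEEN` range filter works on plain text columns. In a MySQL deployment a `DATETIME` column would likely serve the same purpose.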

Keywords: Screen Automation, OCR (Optical Character Recognition), Email Content Extraction, Python Automation, Web Interface Parsing, PyAutoGUI, Tesseract OCR, OpenCV, PHP-MySQL Integration, Data Archiving, Non-API Email Processing, Gmail Automation, Visual Data Retrieval, Structured Data Storage, Local Server (XAMPP), Automated Text Extraction, Clipboard Parsing, Backend Scripting, Keyword-Based Detection, Screen-Based Data Mining

