Document Image Layout Retention Techniques

Published 9 April 2026

Updated 9 April 2026

Mrs. B Rupa Devi Singh, M.Tech, (Ph.D)
Associate Professor, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P., rupadevi.aitt@annamacharyagroup.org
ORCID: 0009-0005-1298-737X

D Srilekha
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P., srireddy7789@gmail.com

K Ujwala Sai Sree
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P., Ujwalakamjula789@gmail.com

N Sreelatha
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P., naresreelatha0109@gmail.com

S Venkata Nagarjuna Reddy
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P., snagarjunareddy905@gmail.com

Abstract— Document layout classification is a significant task in intelligent document analysis and digital information processing. It helps computers recognize and understand the structural components of documents such as research papers, reports, and articles. These structural components include text blocks, titles, tables, figures, and lists. Detecting them is critical for applications such as Optical Character Recognition (OCR), automatic document digitization, and extraction of structured data. This research presents a deep learning approach to document layout classification using YOLOv8, a state-of-the-art and efficient object detection model. Unlike conventional approaches that are rule-based or rely on multi-stage detection, YOLOv8 detects objects in a single pass, which improves detection speed and efficiency. The experimental results show that the proposed method achieves a mean Average Precision (mAP@0.5) of 0.916, indicating strong detection capability at an Intersection over Union (IoU) threshold of 0.5. The model performed very well on the dominant classes, such as text and title. For less frequent classes, such as figures and lists, performance was relatively lower because of class imbalance in the dataset. Nevertheless, the results show that YOLOv8 offers a good trade-off between accuracy and speed.

Keywords—YOLOv8, Document Layout Analysis, Object Detection, PubLayNet
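To make the IoU threshold behind mAP@0.5 concrete, the following is a minimal illustrative sketch, not the authors' evaluation code. It assumes boxes are given as (x1, y1, x2, y2) pixel corners; the helper names `iou` and `is_true_positive` and the example coordinates are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """A prediction matches a ground-truth region when the class labels
    agree and the IoU reaches the threshold (0.5 for mAP@0.5)."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


# Hypothetical ground-truth "table" region and a slightly shifted prediction.
gt = (100, 100, 300, 200)
pred = (110, 105, 310, 205)
print(round(iou(pred, gt), 3))
print(is_true_positive(pred, "table", gt, "table"))
```

Under this criterion, a predicted layout region counts as correct only if it has the right class and overlaps the annotated region by at least half of their combined area; average precision is then computed per class and averaged to give mAP@0.5.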