Document Image Layout Retention Techniques
Mrs. B Rupa DeviSingh, M.Tech (Ph.D)
Associate Professor, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P, rupadevi.aitt@annamacharyagroup.org
ORCID: 0009-0005-1298-737X
D Srilekha
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P, srireddy7789@gmail.com
K Ujwala Sai Sree
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P, Ujwalakamjula789@gmail.com
N Sreelatha
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P, naresreelatha0109@gmail.com
S Venkata Nagarjuna Reddy
UG Scholar, Department of AI&DS, Annamacharya Institute of Technology and Sciences
Tirupati-517520, A.P, snagarjunareddy905@gmail.com
Abstract— Document layout classification is a significant task in intelligent document analysis and digital information processing. It enables computers to recognize and understand the structural components of documents such as research papers, reports, and articles. These components include text blocks, titles, tables, figures, and lists. Detecting them is critical for applications such as Optical Character Recognition (OCR), automatic document digitization, and structured data extraction. This research presents a deep learning approach to document layout classification based on YOLOv8, a state-of-the-art and efficient object detection model. Unlike conventional approaches that rely on hand-crafted rules or multi-stage detection pipelines, YOLOv8 detects objects in a single pass, which improves detection speed and efficiency. Experimental results show that the proposed method achieves a mean Average Precision (mAP@0.5) of 0.916, indicating strong detection capability at an Intersection over Union (IoU) threshold of 0.5. The model performed very well on the dominant classes, such as text and title. For less frequent classes, such as figures and lists, performance was relatively lower owing to class imbalance in the dataset. Nevertheless, the results show that YOLOv8 offers a good trade-off between accuracy and speed.
Keywords— YOLOv8, Document Layout Analysis, Object Detection, PubLayNet