A Comparative Study of Database Indexing Techniques for Large Scale Applications

Published 12 April 2026

Updated 12 April 2026

Authors:

V Siri¹, Harshachandra V²

¹Professor, Department of Computer Science and Engineering, St. Martin’s Engineering College, Hyderabad, India vemulasiricse@smec.ac.in

²Student, Department of Computer Science and Engineering, St. Martin’s Engineering College, Hyderabad, India harshachandra.v@gmail.com

Abstract

Data volumes are growing quickly, which makes fast access to the required information harder. Many organizations operate databases holding billions of records and must choose an indexing strategy that keeps response times low, supports many concurrent users, and uses resources efficiently. This study examines common database indexing methods: B+ Trees, Hash Indexes, Log-Structured Merge (LSM) Trees, Bitmap Indexes, Inverted Indexes, and the more recent Learned Index structures. We examine how each method works, how it is structured, and how it performs in settings such as e-commerce websites, healthcare systems, financial databases, and cloud environments. The comparison covers query performance, scalability with data volume, storage overhead, behavior under concurrent access, and handling of different data types. We reviewed 20 studies published between 2021 and 2025. The findings show that the effectiveness of each indexing method depends on data volume, user access patterns, and system configuration, and that no single method is best in all cases. Instead, workload-aware combinations of methods, together with machine learning to enhance the indexes, appear to be the most promising approach for large-scale applications. This review offers database and system designers guidance for making informed indexing decisions.

Keywords: Database indexing, B+ Tree, Hash index, LSM Tree, Learned index, Query optimization, Large-scale databases, Distributed systems, Query performance