Uncovering Content Trends in Netflix’s Global Catalog: A Comprehensive Data-Driven Analysis Using Python
Authors:
Vishwas Jatav, Gaurav Singha, Amitesh Harsh Lal
Abstract:
Netflix has grown into one of the largest content platforms in the world, yet most academic work on it focuses on algorithms or business strategy rather than on the contents of the catalog itself. This paper examines the catalog directly, applying exploratory data analysis (EDA) to a publicly available snapshot of roughly 8,800 Netflix titles from mid-2021. Using Python (pandas, Matplotlib, and Seaborn), we examine five aspects: the split between movies and TV shows, the number of titles added each year, the most frequent genres, the countries that produce the most content, and the distributions of ratings and runtimes. The main findings are as follows: movies still outnumber TV shows by about two to one, although the TV share has been climbing since 2015; catalog additions peaked around 2019 before dropping off; non-English content is the most common genre tag; India and South Korea are among the top contributing countries; and over 60% of all titles carry a mature-audience rating. Taken together, these findings depict a platform that began as a domestic movie library and has been moving steadily toward a broader, more global service.
Index Terms—Netflix, exploratory data analysis, content distribution, genre analysis, data visualization, Python, pandas, streaming platforms, temporal trends, geographic diversity, content strategy
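As a rough illustration of the kind of summaries the abstract describes, the following minimal sketch computes four of them with pandas on a tiny, made-up table. The column names (type, date_added, listed_in, country, rating), the helper explode_counts, and all six rows are assumptions chosen for illustration; they are not the paper's data or code, and the runtime analysis is omitted for brevity.

```python
import pandas as pd

# Hypothetical six-row stand-in for the catalog snapshot.
# Column names are assumed, not taken from the paper.
catalog = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "Movie"],
    "date_added": ["2019-03-01", "2019-07-15", "2020-01-10",
                   "2018-11-05", "2019-09-30", "2021-02-20"],
    "listed_in": ["Dramas, International Movies", "Comedies",
                  "International TV Shows, TV Dramas", "Dramas",
                  "Kids' TV", "Comedies, International Movies"],
    "country": ["India", "United States", "South Korea",
                "United States, India", None, "United States"],
    "rating": ["TV-MA", "PG-13", "TV-14", "R", "TV-Y", "TV-MA"],
})


def explode_counts(series):
    """Count individual values in a comma-separated text column."""
    return (series.dropna()          # skip rows with no value
                  .str.split(",")    # "A, B" -> ["A", " B"]
                  .explode()         # one row per list element
                  .str.strip()       # drop surrounding whitespace
                  .value_counts())


# Share of movies versus TV shows.
type_share = catalog["type"].value_counts(normalize=True)

# Number of titles added per calendar year.
added_per_year = (pd.to_datetime(catalog["date_added"])
                    .dt.year.value_counts().sort_index())

# Genre and country frequencies; a title may carry several of each.
genre_counts = explode_counts(catalog["listed_in"])
country_counts = explode_counts(catalog["country"])

# Distribution of audience ratings.
rating_counts = catalog["rating"].value_counts()

print(type_share, added_per_year, genre_counts.head(), sep="\n\n")
```

Splitting and exploding the multi-valued columns means a co-produced or multi-genre title counts once for each country or genre it lists, which is one reasonable convention among several; a real analysis would need to state which convention it uses.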