ETL stands for Extract, Transform, Load, a process used in data integration, especially in data warehousing and business intelligence. Here’s a simple breakdown:

Extract: Data is taken from various source systems, such as databases, files, and APIs.
Transform: The extracted data is cleaned, formatted, and transformed into a suitable structure.
Load: The transformed data is then loaded into a target system, usually a data warehouse, for analysis and reporting.

This process ensures that data from multiple sources is combined and prepared in a usable format.
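
To make the three stages concrete, here is a minimal sketch using only the Python standard library. The sample rows, the column names (title, release_year), and the table name (titles) are made up for illustration and are not taken from the actual dataset. An in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a source file (not real data).
RAW_CSV = """title,release_year
 Example Show ,2019
Another Movie,not available
Third Title,2021
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, cast the year to int, skip unparseable rows."""
    cleaned = []
    for row in rows:
        year = row["release_year"].strip()
        if not year.isdigit():
            continue  # drop rows whose year cannot be parsed
        cleaned.append((row["title"].strip(), int(year)))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS titles (title TEXT, release_year INTEGER)"
    )
    conn.executemany("INSERT INTO titles VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT title, release_year FROM titles ORDER BY release_year"
).fetchall()
print(loaded)
```

Keeping each stage in its own function means the source or the target can be swapped (for example, a real file path instead of the sample string) without touching the other two stages.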
Dataset from Kaggle - The dataset contains metadata about movies and TV shows, such as the title, director, and cast, along with details such as the release year, rating, and duration. As a first step, let's load the dataset and create some new features. In this kernel, I have analysed the dataset to surface its top insights.
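
As a rough idea of what "creating new features" could look like, the sketch below derives a few extra fields from a single row. The field names (cast, duration), the value formats ("90 min", comma-separated cast names), and the derived column names are assumptions for illustration; the real file may be laid out differently.

```python
def add_features(row):
    """Return a copy of the row with a few derived columns added."""
    enriched = dict(row)

    # Assumed format: "<number> <unit>", e.g. "90 min" or "2 Seasons".
    parts = row.get("duration", "").split()
    if len(parts) == 2 and parts[0].isdigit():
        enriched["duration_value"] = int(parts[0])
        enriched["duration_unit"] = parts[1].lower()
    else:
        enriched["duration_value"] = None
        enriched["duration_unit"] = None

    # Assumed format: comma-separated names in a single "cast" field.
    names = [n.strip() for n in row.get("cast", "").split(",") if n.strip()]
    enriched["cast_count"] = len(names)
    return enriched


# Hypothetical row, not taken from the dataset.
sample = {
    "title": "Example Movie",
    "cast": "Actor A, Actor B, Actor C",
    "duration": "90 min",
}
features = add_features(sample)
print(features["duration_value"], features["duration_unit"], features["cast_count"])
```

Rows with a missing or unexpected duration get None rather than raising an error, so one bad value does not stop the whole load.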
CSV File
Python Code
Database - SQLite