Overview
Key Features
- ETL Pipeline: Single-handedly wrote a MongoDB-to-PostgreSQL ETL script that imports data from the main source database (MongoDB) into the analytics database (PostgreSQL), applying data transformation and schema mapping along the way.
- Metabase Deployment: Deployed Metabase in a Docker container running on an EC2 instance, giving non-technical users a friendly interface for exploring data and creating visualizations.
- AI-Powered Query Generator: Built an application that combines natural-language prompts with Metabase metadata to generate SQL queries, which are then used to produce visualizations automatically. This allowed non-technical team members to get insights without writing SQL.
- Self-Serve Analytics: Enabled product, operations, and business teams to independently access and analyze data, reducing the bottleneck on engineering for ad-hoc queries.
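To illustrate the transformation and schema-mapping step of the ETL pipeline, here is a minimal sketch, not the project's actual script. The field map, column names, and the `orders` table in the usage note are hypothetical, and the code works on plain dictionaries so that it runs without a database. It assumes documents arrive as nested dicts (as a MongoDB driver would return them) and that rows are written with a driver that accepts `%s` placeholders, such as psycopg2.

```python
from __future__ import annotations

from typing import Any

# Hypothetical mapping from dotted MongoDB field paths to PostgreSQL columns.
FIELD_MAP = {
    "_id": "id",
    "user.email": "user_email",
    "total": "total_amount",
    "createdAt": "created_at",
}


def get_path(doc: dict, path: str) -> Any:
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def to_row(doc: dict) -> dict:
    """Flatten one MongoDB-style document into a row keyed by column name."""
    row = {}
    for path, column in FIELD_MAP.items():
        value = get_path(doc, path)
        if path == "_id" and value is not None:
            value = str(value)  # assumption: document ids are stored as text
        row[column] = value
    return row


def insert_statement(table: str, row: dict) -> tuple[str, list]:
    """Build a parameterized INSERT; values are passed separately, never inlined."""
    columns = list(row)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, [row[c] for c in columns]
```

Calling `insert_statement("orders", to_row(doc))` returns a SQL string and a parameter list that can be handed to a database cursor. Missing fields become `None`, which a PostgreSQL driver would typically write as NULL.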
Technologies Used
- MongoDB: Source database
- PostgreSQL: Analytics database
- Docker: Containerized Metabase deployment
- AWS EC2: Infrastructure hosting
- Python: ETL scripts and AI query generator
- Metabase: Business intelligence and visualization platform
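The Python query generator listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's code: the `schema` dictionary is a hypothetical stand-in for the table and column metadata taken from Metabase, and `complete` is a hypothetical callable that wraps whichever language model is used. The read-only check is deliberately coarse and is not a security boundary; running generated queries under a read-only database role would be the stronger safeguard.

```python
from typing import Callable, Dict, List


def build_prompt(question: str, schema: Dict[str, List[str]]) -> str:
    """Combine the user's question with table/column metadata into one prompt."""
    parts = ["You write PostgreSQL queries.", "Available tables:"]
    for table, columns in schema.items():
        parts.append(f"- {table}({', '.join(columns)})")
    parts.append(f"Question: {question}")
    parts.append("Return a single SELECT statement and nothing else.")
    return "\n".join(parts)


def is_read_only(sql: str) -> bool:
    """Coarse check: exactly one statement, and it starts with SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False
    return statement.split(None, 1)[0].lower() == "select"


def generate_sql(
    question: str,
    schema: Dict[str, List[str]],
    complete: Callable[[str], str],  # hypothetical model wrapper: prompt -> text
) -> str:
    """Ask the model for SQL and reject anything that is not a single SELECT."""
    sql = complete(build_prompt(question, schema)).strip()
    if not is_read_only(sql):
        raise ValueError("generated statement is not a single SELECT")
    return sql
```

Passing the model call in as a plain function keeps the sketch independent of any particular provider and makes it easy to test with a stub.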
Impact
- Eliminated the engineering bottleneck for ad-hoc data requests
- Enabled non-technical stakeholders to create their own reports and dashboards
- Reduced time-to-insight from days (waiting for engineering) to minutes (self-serve)