About me (Registered since 28/09/2025)
I bring 5 years of Data Engineering experience with Python, SQL, Spark, Airflow, AWS (Redshift, S3, RDS, EC2), Kafka, and GitHub CI/CD. Recently, I built AI/ML-driven pipelines with LangChain, RAG, and vector databases.
I’m highly interested in the Data Engineering role.
CV: tiny.cc/Darad
Portfolio
Education
- May 2016 - April 2020
Work Experience
-
February 2025 - Present
NEXT Ventures, Bangladesh
Senior Data Engineer
● Architected and implemented a cloud-native Data Warehouse on Amazon Redshift, replacing AWS RDS PostgreSQL, delivering 3× faster analytics and cutting infrastructure costs by ~$585/month.
● Built a scalable, cost-optimized Data Lake on Amazon S3, reducing storage spend by $400–$600/month.
● Introduced Apache Airflow as the orchestration backbone, eliminating 7 legacy servers, unifying ETL operations, and cutting debugging time by ~60%. Developed 120+ DAGs, including custom Python modules for Redshift, Postgres, and S3.
● Integrated GitHub with Airflow for CI/CD of data pipelines, enabling automated DAG deployment and version control.
● Developed “TableTalk”, an internal ChatGPT-style Data Warehouse assistant built on RAG (Retrieval-Augmented Generation), with a FastAPI (Python) backend, React/Node.js frontend, LangChain orchestration, the Nebius LLM API, and pgvector embeddings. It translates natural language to SQL with department-level access control. Adopted by 30+ internal users across 5 departments, reducing the SQL request backlog by ~40%.
● Automated Redshift/PostgreSQL metadata ingestion into a vectorized knowledge base, enabling semantic search and RAG-powered query generation. Reduced ad-hoc reporting turnaround from 1–2 days to under 2 hours.
● Authored 170+ PL/pgSQL stored procedures and queries on Redshift and PostgreSQL to build robust ETL pipelines.
● Prototyped a Parquet-based ingestion system using Amazon S3 and Redshift’s COPY command, reducing bulk load latency by ~80% and providing a scalable alternative to traditional row-by-row inserts.
● Designed and implemented a centralized purging system across EC2, Redshift, PostgreSQL (Airflow metadata), and S3, enabling automatic retention policy enforcement and optimizing compute/storage lifecycle costs.
-
August 2022 - February 2025
Robi Axiata Limited, Bangladesh
Specialist, Data Engineering
● Used shell scripts, Python, and PL/SQL to enhance the ETL data pipeline serving 54.5M users.
● Developed a data import/export framework to transfer terabytes of data to and from multiple servers and databases daily.
● Ensured continuity of the ETL flow processing 50B records per day in Hive on Hadoop during operational incidents.
● Oversaw the migration of 400M records from Oracle to the Cloudera Big Data platform and vice versa using Sqoop jobs.
● Ran regular operations for the Enterprise Data Warehouse & Analytics to ensure data availability in the Data Mart.
● Created a centralized, Python-based alerting and monitoring framework for critical workflows, table freshness, and mount point outages, supporting robust SLAs and increasing system observability.
● Troubleshot daily operational failures in Near Real Time ETL, Live Loading, DB performance, and Data Dictionary Storage for Business Intelligence & Data Warehouse.
-
March 2021 - June 2022
Markopolo.ai, United States
Machine Learning Engineer
● Developed an NLP-based, real-time, ML-powered brand monitoring platform performing sentiment analysis, brand awareness, audience profiling, competitive analysis, and NPS tracking from raw data.
● Developed REST APIs with Django to serve ML models and integrate results into enterprise dashboards and client apps.
● Leveraged NLP models (facebook/bart-large-mnli) for intent recognition and multilingual data processing.
● Implemented large-scale web scraping pipelines using Python modules (Selenium, BeautifulSoup, facebook-scraper), ensuring clean and production-ready datasets.
Certifications & Licences
- 2016