Hello, I'm

Pratik Bhikadiya.

> Data Engineer

I build production-grade lakehouse pipelines — Bronze/Silver/Gold on Databricks, real-time clickstream on Kafka+EMR, and SLA-monitored analytics for Intuit TurboTax. 5+ years turning raw events into trusted data products.

Certified

Databricks Data Engineer · Azure DP-203 · Power BI PL-300

Years

1PB+

Data

Certs

Daily driver stack

PySpark · Databricks · Delta Lake · AWS · Azure · Airflow · Kafka · SQL · Python · dbt

How I work

Reading the driver logs

Most of my day is spent reading Spark driver logs, chasing skewed joins, and tuning shuffle partitions.

This is a simulated feed showing the kind of events I see on a real ingestion job — executor starts, stage completions, shuffle spills, skew warnings, Delta commits. Click pause if it moves too fast.
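To make the idea of a skew warning concrete, here is a minimal sketch in plain Python (not PySpark, so it runs anywhere). It flags partitions whose row count is far above the median. The function name `find_skewed_partitions` and the default `factor` of 5 are assumptions chosen for illustration, not values taken from the jobs described here.

```python
from statistics import median


def find_skewed_partitions(row_counts, factor=5.0):
    """Return the indexes of partitions whose row count exceeds factor x median.

    Hypothetical helper: a real job would read these sizes from Spark's
    stage metrics rather than from a Python list.
    """
    if not row_counts:
        return []
    # Use at least 1 as the baseline so mostly-empty stages do not
    # produce a zero threshold.
    threshold = factor * max(median(row_counts), 1)
    return [i for i, n in enumerate(row_counts) if n > threshold]


# Four balanced partitions and one hot key landing in partition 4.
counts = [1000, 1100, 950, 1050, 48000]
skewed = find_skewed_partitions(counts)
print(skewed)
```

A median baseline is used instead of the mean because a single hot partition inflates the mean and can hide itself.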

Avg job 7.4 min (▼ 35% after tuning)
SLA hit rate 99.5% (last 90 days)
Peak throughput 2.4 GB/s (shuffle read)
Cluster EMR 6.x · Spark 3.4

spark-driver@emr-cluster:~$ tail -f ingest.log

job: streaming-bronze-to-silver

Exec: 8 · Cores: 32

* simulated driver log · events generated client-side for demo

01 · About

A little about me

I'm a Data & Analytics Engineer who enjoys the craft of turning messy source data into reliable, well-modeled data products that stakeholders actually trust.

Currently at Intuit TurboTax, I architect DataMart pipelines for Marketing, Finance, Product and Sales — Bronze/Silver/Gold on Databricks, Spark SQL on EMR, and SLA monitoring on Databricks Workflows & CloudWatch.

Before Intuit: BI Engineering at ArcelorMittal (Azure Databricks / ADF / Synapse) and BI Analytics at Adani Ports & SEZ — 20+ executive dashboards powering port operations, logistics, and finance.

"Good data engineering is invisible. You only notice it when it fails."

At a glance

Role Data / Analytics Engineer
Current Intuit TurboTax
Based in Brampton, ON 🇨🇦
Education M.M. — Business Data Analytics
Domains Finance · Tax · Supply Chain
Stack Databricks · PySpark · AWS · Azure

rows processed · career (est.)

Years

Experience

Dashboards delivered 50+
Pipelines shipped 15+
SLA hit rate 99.5%

02 · Experience

Where I've shipped

A career timeline across three industries — fintech, heavy industry, and logistics. Different domains, same commitment to trustworthy data.

Data Engineer / Analytics Engineer

Intuit TurboTax

Apr 2024 – Present
Brampton, ON
- Architected enterprise DataMart pipelines (Marketing, Finance, Product, Sales) on Databricks + PySpark, reducing end-to-end latency by 35%.
- Designed scalable clickstream ingestion (Kafka → EMR/Databricks) processing millions of daily events.
- Implemented Bronze-Silver-Gold medallion architecture with incremental Delta processing and schema enforcement.
- Built automated SLA monitoring (Databricks Workflows + CloudWatch), cutting mean time to recovery (MTTR) by 40%.
Databricks · PySpark · Delta Lake · Kafka · EMR · AWS
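The medallion pattern named above (incremental processing plus schema enforcement) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the production pipeline: the schema, the field names, and the helpers `conforms`, `to_silver`, and `to_gold` are all hypothetical, and a real implementation would rely on Delta Lake tables rather than Python lists.

```python
# Hypothetical Silver schema: field name -> required Python type.
SILVER_SCHEMA = {"event_id": str, "user_id": str, "event_ts": str, "amount": float}


def conforms(row, schema):
    """Schema enforcement: exact field set and exact types, nothing extra."""
    return set(row) == set(schema) and all(
        isinstance(row[name], typ) for name, typ in schema.items()
    )


def to_silver(bronze_rows, watermark):
    """Bronze -> Silver: reject malformed rows, keep only rows newer than the watermark.

    Timestamps are ISO-8601 strings in one format, so string comparison
    matches chronological order.
    """
    accepted, rejected = [], []
    for row in bronze_rows:
        if not conforms(row, SILVER_SCHEMA):
            rejected.append(row)
        elif row["event_ts"] > watermark:
            accepted.append(row)  # incremental: older rows were handled by earlier runs
    return accepted, rejected


def to_gold(silver_rows):
    """Silver -> Gold: aggregate amount per user."""
    totals = {}
    for row in silver_rows:
        totals[row["user_id"]] = totals.get(row["user_id"], 0.0) + row["amount"]
    return totals


bronze = [
    {"event_id": "e1", "user_id": "u1", "event_ts": "2024-05-01T10:00:00", "amount": 10.0},
    {"event_id": "e2", "user_id": "u1", "event_ts": "2024-05-02T09:00:00", "amount": 5.5},
    {"event_id": "e3", "user_id": "u2", "event_ts": "2024-05-02T11:00:00", "amount": "oops"},
    {"event_id": "e4", "user_id": "u2", "event_ts": "2024-05-02T12:00:00", "amount": 2.0},
]
silver, quarantined = to_silver(bronze, watermark="2024-05-01T23:59:59")
gold = to_gold(silver)
print(gold, len(quarantined))
```

Rejected rows are returned rather than dropped so they can be quarantined and inspected, which keeps the Silver layer strict without losing evidence of upstream problems.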
BI Engineer

ArcelorMittal Nippon Steel India

Apr 2022 – Dec 2022
India
- Developed Spark transformation pipelines on Azure Databricks integrating Azure SQL, Blob Storage & Synapse.
- Orchestrated enterprise ETL workflows in Azure Data Factory, cutting manual intervention by 40%.
- Engineered curated data mart layers with referential integrity controls for monthly executive reporting.
Azure Databricks · ADF · Synapse · Spark SQL
BI Analyst

Adani Ports and SEZ Ltd.

Jun 2016 – Mar 2022
India
- Built large-scale ETL workflows and warehouse schemas supporting port operations, logistics & finance.
- Improved data processing efficiency by 20% via SQL optimization while maintaining 99.5% SLA compliance.
- Delivered 20+ executive dashboards (Power BI / Tableau) tracking operational KPIs, revenue & cargo throughput.
SQL · Power BI · Tableau · ETL

Education

Degrees · Honors

2023 – 2024

Master of Management (Honors) — Business Data Analytics

University of Windsor, Canada

2020 – 2021

PG Diploma (Honors) — Marketing Management

Maharaja Sayajirao University

2012 – 2016

B.Tech (Honors) — Engineering

Maharaja Sayajirao University

03 · Projects

Selected work

Real, end-to-end data products — each with its own architecture, metrics, and GitHub link. Hover a card to see the dataflow pulse.

All repos on GitHub

medallion

Ontario Real Estate Lakehouse

/01

Production-grade Medallion pipeline on 1.2M civic records.

Bronze/Silver/Gold lakehouse pipeline for Ontario real estate analytics, ingesting 1.2M+ authentic records from Toronto Open Data and Statistics Canada into a Streamlit dashboard.

PySpark · Delta Lake · Databricks · Streamlit · Python

Sole Engineer: design, build, deploy · Case study

rag

AI Stock Intelligence — CAN/US Markets

/02

Multi-source ingestion + RAG on 40 stocks across TSX/NYSE/NASDAQ.

End-to-end GenAI-powered analytics platform: ingests 8+ financial and social sources, scores stocks via a 5-factor composite model, and answers natural-language questions through a RAG-powered Streamlit dashboard.

Python · Delta Lake · Airflow · ChromaDB · Google Gemini

Sole Engineer: architecture, pipelines, UI · Case study

azure

F1 Azure Databricks Lakehouse

/03

End-to-end Azure lakehouse with Unity Catalog governance.

Formula 1 analytics pipeline on Azure Databricks: ADF orchestration, ADLS Gen2 storage, Delta Lake lakehouse, Unity Catalog governance, and Power BI reporting.

Azure Databricks · PySpark · Spark SQL · ADLS Gen2 · Azure Data Factory

Sole Engineer · Case study

analytics

Multi-Channel Marketing Analytics

/04

Unified KPI model across Facebook, Google Ads, and TikTok.

Analytics engineering project: a single standardized data model and interactive dashboard unifying Facebook / Google / TikTok ad spend, with built-in data-quality checks and ROI-driven KPIs.

Channels 3 (FB / Google / TikTok)
KPIs modeled CTR · CPC · CPA · ROAS · CPM
Deploy target Streamlit Cloud

Python · Pandas · Streamlit · Plotly

Sole Engineer: model, UI, QC · Case study
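For reference, the five KPIs listed for this project follow standard definitions, shown here as a minimal Python sketch. The function name `ad_kpis`, its parameter names, and the sample figures are assumptions for illustration and are not taken from the project repository.

```python
def ad_kpis(impressions, clicks, conversions, spend, revenue):
    """Compute standard paid-media KPIs for one channel.

    Returns None for any KPI whose denominator is zero instead of raising.
    """

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "CTR": ratio(clicks, impressions),        # click-through rate
        "CPC": ratio(spend, clicks),              # cost per click
        "CPA": ratio(spend, conversions),         # cost per acquisition
        "ROAS": ratio(revenue, spend),            # return on ad spend
        "CPM": ratio(spend * 1000, impressions),  # cost per 1,000 impressions
    }


# Made-up sample figures for a single channel.
kpis = ad_kpis(impressions=10_000, clicks=200, conversions=10, spend=100.0, revenue=400.0)
print(kpis)
```

Computing every channel through one function like this is one simple way to keep Facebook, Google Ads, and TikTok figures comparable, since each KPI is derived from the same five inputs.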

04 · Skills