open to senior & staff roles
milpitas, ca · remote-friendly

Analytics Engineering Leader /
building AI-native
data systems.

15+ years shaping data platforms at Workday, Lyft, and beyond — now building the layer above the dashboard: semantic models, self-healing pipelines, and AI decision systems like Pipeline Sentinel.

15+
years in enterprise analytics
40%
reporting overhead removed at Workday
8
AI/ML systems built
1.35M
payment events modeled (LoanLens)
Previously at
about

From dashboards
to decision systems.

I started in BI when "analytics" meant Excel and Tableau. Fifteen years later, I'm building the layer above the dashboard — semantic models, agentic pipelines, and AI interfaces that turn governed data into executive decisions.

At Workday I lead BI Analytics and shipped an AI Companion that lets execs query marketing performance in natural language, grounded in our semantic layer. At Lyft I built the growth analytics stack behind multi-million-dollar acquisition spend. Before that: supply chain analytics at Intuitive Surgical, CRM transformation at Juniper.

The projects below are how I learn in public — each one is a production-grade system, not a notebook. If you're building the next generation of data/AI platforms, let's talk.

Shrikant Lambe
featured work

Eight systems. All production-ready.

8 projects
07 / sentinel
Multi-Agent · DataOps

Pipeline Sentinel — a self-healing agent for Airflow.

Five specialized agents — Monitor, Diagnosis, Blast Radius, Remediation, Reflection — coordinate through shared state to diagnose and resolve pipeline failures without paging a human. Pattern memory prevents repeat incidents. A confidence gate auto-remediates only low-risk, high-signal cases; everything else escalates with a full audit trail.

⚡ Self-heals pipeline failures in ~11s · 94% median confidence · zero human pages on low-risk incidents
5
Agents in ReAct loop
94%
Median confidence
11s
Time-to-resolution
Claude SonnetLangGraphLangSmith FastAPIAirflowStreamlit
pipeline-sentinel · agent dashboard
⚠ INCIDENT · customer_churn_pipeline · transform_features
Monitor · task failure detectedT+04s
Diagnosis · upstream schema drift on `plan_tier`T+07s
Blast Radius · 3 downstream tasks identifiedT+08s
Remediation · applying `reload_schema` strategyT+09s
🛡 self-healed · confidence 94%
Schema reloaded, 3 tasks re-queued. No human intervention. Similar incident 2026-04-03 — same fix, resolved in 11s.
08 / loanlens
Fintech · Data Platform

LoanLens — portfolio intelligence for a simulated Series C lender.

10,000 loans, 1.35M payment events, 3 SPVs. Modeled in dbt, reconciled end-to-end to <0.1% tolerance, narrated by Claude. SPV covenant monitoring, 36-cohort vintage analysis, anomaly detection, and investor-grade PDF memo export. Designed and shipped as a two-day sprint.

📊 1.35M payment events · 3 SPVs reconciled to <0.1% · investor-grade PDF memo in one click
1.35M
Payment events
<0.1%
Recon tolerance
36
Cohort vintages
Claude SonnetDuckDBdbt Core SnowflakeStreamlitPython
loanlens · spv-b portfolio snapshot
SPV-B · Q4 2025 · AS-OF 2026-04-22
$284M
AUM
+3.1% QoQ
3.2%
Default rate
+40bps
0.04%
Recon delta
within tol.
🤖 investor memo · claude
SPV-B at 87% facility utilization — recommend 60-day drawdown pause. 2023 vintage cohorts outperforming 12-month default curve by 40bps.
06 / growth-agent
Agentic AI · SaaS Analytics

Growth Intelligence Agent — a virtual revenue analyst.

Autonomously monitors SaaS growth metrics across 6 categories (new logo, expansion, retention), detects anomalies with severity scoring, and surfaces RAG-grounded strategic recommendations from company playbooks — all through a natural-language interface. Thinks like an analyst, cites like a playbook.

🤖 Surfaces playbook-grounded recommendations autonomously across 6 metric categories
6
Metric categories
FAISS
Playbook retrieval
NL
Query interface
Claude 3.5 SonnetLangChainFAISS RAGStreamlitPlotly
growth-intelligence-agent · week 17
SAAS PORTFOLIO · WEEK 17 · REAL-TIME
$2.4M
MRR
+6.2% WoW
118%
NRR
above target
↑2.1%
Churn risk
anomaly
ANOMALY · HIGH — enterprise churn spiked +2.1pp WoW · 3 accounts flagged
Why did enterprise churn jump this week?
Per ICP playbook §3.2: accounts with <60% feature adoption + no QBR in 90d = highest signal. 3 flagged accounts match both. Recommend CSM outreach this week.
05 / cortex
Snowflake · Enterprise AI

AI-Native Data Platform on Snowflake Cortex.

A production-grade architecture blueprint for running governed AI natively inside the Snowflake Data Cloud. Cortex LLM functions (COMPLETE, SENTIMENT, CLASSIFY), Cortex Search, and Cortex Analyst — with live interactive demos. Zero data egress, end-to-end policy enforcement, ships with Anthropic API integration.

🏗 Zero data egress · end-to-end governed AI · live interactive demo deployed on GitHub Pages
3
Cortex layers
0
Data egress
E2E
Governed AI
Snowflake CortexCortex Search Cortex AnalystAnthropic APISQL
snowflake-cortex-architecture
GOVERNED AI · IN-WAREHOUSE EXECUTION
-- customer feedback analysis inside Snowflake
SELECT
  customer_id,
  SNOWFLAKE.CORTEX.SENTIMENT(support_notes) AS sentiment,
  SNOWFLAKE.CORTEX.CLASSIFY(
    support_notes, ['billing', 'outage', 'feature']
  ) AS category,
  SNOWFLAKE.CORTEX.COMPLETE(
    'llama3-70b',
    CONCAT('Summarize: ', support_notes)
  ) AS summary
FROM customer_feedback
WHERE region = 'EMEA';
3
Cortex layers
0
Egress
E2E
Governed
03 / retail-pipeline
Data Engineering · Streaming

Real-Time Retail Sales Pipeline — Kafka to warehouse, E2E.

Production streaming platform: Kafka producer → PySpark Structured Streaming → Snowflake → dbt incremental MERGE → Airflow orchestration with automated data quality gates. Fully containerized via Docker Compose. System design documented as an architecture reference.

🔄 Processes 25 events/sec end-to-end · zero DQ failures · fully containerized with Docker Compose
25/s
Kafka throughput
0
DQ failures
E2E
Stream to warehouse
Apache KafkaPySparkSnowflake dbtAirflowDocker
retail-pipeline · airflow dag
PIPELINE STATUS · retail_streaming_pipeline
Kafka Producer · 25 events/seclive
PySpark Streaming · dedup + windowdone
Snowflake write · raw + agg layersdone
dbt run · fact_sales incremental MERGErunning
24.8K
Rows today
$94K
Revenue
0
DQ fails
02 / mktg-ai
GenAI · Enterprise

Marketing AI Intelligence Engine — board-ready in 30 seconds.

Inspired by real AI work at Workday. 4-layer GenAI architecture — Semantic → Context/RAG → Insight Engine → LLM Generation — that transforms raw campaign data into board-ready executive summaries. No analyst in the loop required.

⚡ Reduces executive reporting from hours to 30 seconds · inspired by production AI work at Workday
4
GenAI layers
30s
Time to summary
RAG
Context retrieval
OpenAI GPT-4RAGStreamlitPydantic
marketing-ai · q3 executive summary
Q3 CAMPAIGN PERFORMANCE · AI SUMMARY
4.2x
ROAS
above 3.5x target
$1.8M
Revenue
+23% vs Q2
↓18%
Email CTR
anomaly WoW
Generate Q3 executive summary.
Q3 delivered ROAS 4.2x, exceeding 3.5x benchmark by 20%. Paid search drove 62% of revenue. Email CTR declined 18% WoW — recommend creative refresh before Q4.
01 / churn
ML + AI · Flagship

Customer Churn Analysis & Prediction Copilot.

End-to-end ML system for telecom churn prediction — SHAP explainability per prediction, LLM-generated retention strategies per customer, and a FastAPI microservice. Inspired by real patterns from enterprise data work at Workday. CI/CD via GitHub Actions.

🎯 Per-customer SHAP explainability + LLM retention strategy in a single FastAPI call
SHAP
Explainability
FastAPI
Microservice
CI/CD
GitHub Actions
scikit-learnOpenAISHAP StreamlitFastAPIGitHub Actions
churn-analysis · risk dashboard
CUSTOMER RISK ANALYSIS · LIVE PREDICTION
72%
Churn score
HIGH RISK
7
Tenure (mo)
short tenure
$89
Monthly charges
above avg
Why is #CUS-4821 flagged high risk?
Month-to-month + tenure 7mo + charges above cohort median = 3 top SHAP signals. Recommend targeted retention offer within 48hrs.
04 / chatbot
AI · Deployed

GPT-4o AI Chatbot — multi-turn, deployed on Render.

Production web chatbot with multi-turn memory, animated typing indicators, and a responsive dark UI. Flask + Gunicorn backend deployed on Render. One-line system prompt swap for any domain — built as a reusable baseline for enterprise conversational AI.

🚀 Live on Render · multi-turn context window · one-line system prompt swap for any domain
GPT-4o
Model
Multi-turn
Memory
Render
Live deployed
FlaskOpenAI GPT-4oGunicornRender
ai-chatbot · multi-turn session
GPT-4o · MULTI-TURN CONVERSATION
What are the key drivers of customer churn?
Key drivers: month-to-month contracts, tenure under 12 months, charges above cohort median, lack of bundled services. Want retention strategies?
Yes, focus on retention strategies.
Top 3 levers: (1) proactive offer at 6-month mark, (2) bundle upgrade for high-usage, (3) CSM outreach when charges spike >20% MoM.
writing

The Data Brief — strategy, data, decisions.

The Data Brief.
strategy · data · decisions

A newsletter for data engineers, analytics engineers, and data leaders navigating the shift to AI-native stacks. Grounded in what ships — not what's on the conference slide.

15
essays published
3
platforms
Subscribe on Substack
AI toolsapr 17, 2026 · 10 min
How to Build Your First Claude Skill in 10 Minutes
How to encode your workflow knowledge into a reusable Claude Skill — a markdown file that turns any process you understand into something Claude executes reliably.
semantic layerapr 14, 2026 · 11 min
The Semantic Layer: Honest Tool Guide
A no-fluff comparison of dbt Semantic Layer, Cube, LookML, AtScale, and MetricFlow — which one fits your stack and why.
agentsapr 7, 2026 · 13 min
Agentic Data Infrastructure: I Built a Self-Healing Data Pipeline System
How I wired LangGraph agents to monitor, diagnose, and self-heal a production data pipeline — replacing reactive Slack paging with proactive remediation.
semantic layerapr 1, 2026 · 9 min
The Semantic Layer: Define It Once. Mean It Everywhere.
The case for treating metric definitions like code: version-controlled, tested, and governed once so every downstream consumer — including LLMs — agrees.
semantic layermar 26, 2026 · 8 min
The Semantic Layer: Which Flavor Fits?
Centralized vs. embedded vs. push-down — the three semantic layer architectures, what each trades off, and when to pick each.
semantic layermar 24, 2026 · 7 min
The Semantic Layer: Why Now?
Why the semantic layer matters more in an AI-native stack than it ever did in the BI-only era — and why skipping it in 2026 is a compounding mistake.
semantic layermar 17, 2026 · 8 min
The Semantic Layer: Your Data's Translator for the Real World
What the semantic layer actually does: translating business concepts into governed, consistent data that any tool — including LLMs — can trust.
agentsmar 12, 2026 · 12 min
I Built a Growth Intelligence Agent That Acts as a Virtual Revenue Analyst
What it took to build a multi-agent system that monitors SaaS metrics, detects anomalies, and surfaces playbook-grounded recommendations — without a human in the loop.
enterprise AImar 10, 2026 · 10 min
What It Actually Takes to Run AI Natively Inside a Data Warehouse
Zero-egress governed AI on Snowflake Cortex: LLM functions, Cortex Search, Cortex Analyst, and the Anthropic API — all inside the data cloud.
data engineeringmar 9, 2026 · 11 min
How I Built a Production-Grade Real-Time Retail Sales Data Pipeline
Kafka, PySpark Structured Streaming, Snowflake MERGE, dbt incrementals, Airflow DQ gates. Containerized top to bottom.
ML + LLMsmar 3, 2026 · 14 min
How I Turned a Traditional Churn Prediction Model into an AI-Powered Decision Copilot
Adding an LLM layer to scikit-learn + SHAP — what changed, what got harder, and why the "copilot" metaphor actually holds.
GenAIfeb 16, 2026 · 9 min
From Dashboards to Dialogues: What Building a Simple Chatbot Taught Me About the Future of BI
Lessons from building a GPT-4o chatbot: what the AI gets right, what it still can't do, and why the interface shift is more profound than it looks.
strategyjan 24, 2026 · 8 min
From Data-Driven to AI-Driven Decision Making
The shift from querying data to trusting agents — what it means for data teams, and which mental models need to change first.
strategyjan 13, 2026 · 7 min
Why AI Won't Fix Your Messy Metrics (and Might Make Them Worse)
Garbage in, AI-amplified garbage out. How undefined metrics become a compounding liability in AI pipelines, not just an analytics oversight.
GenAIdec 2025 · 9 min
How I Built a GenAI-Powered Executive Summary Generator for Marketing Organizations
4-layer architecture: Semantic → Context/RAG → Insight Engine → LLM Generation. Campaign data to board-ready summaries in 30 seconds.
experience

15 years · 6 companies · one arc.

2020 — now

Workday

Manager, BI Analytics (promoted from Sr. BI Analyst, 2023)

Shipped an AI Companion enabling natural language querying of enterprise marketing metrics — grounded in our semantic layer, with governed, context-aware responses. Built the AI-enabled executive summary engine synthesizing campaign performance into board-ready narratives (direct inspiration for the Marketing AI project). Led the Analytics Modernization Program — migrated Tableau to Sigma on Snowflake/dbt, established governed self-service for global Sales & Marketing. Institutionalized forecasting that reduced manual reporting ~40%. Led 8+ analysts & engineers.

Remote
2018 — 2020

Lyft

Senior BI / Data Engineer — Growth Analytics

Built the data infra behind multi-million-dollar marketing spend across acquisition channels. Designed campaign attribution, funnel analytics, and churn frameworks enabling data-driven budget reallocation. Owned CAC, ROI, retention, LTV metric definitions.

San Francisco
2016 — 2018

Intuitive Surgical

Senior BI / Data Engineer — Supply Chain

Modernized analytics in a regulated medical-device environment — migrated Excel processes to production pipelines and governed dashboards for efficiency, lead times, defect rates, and capacity planning.

Sunnyvale
2015 — 2016

ServiceNow

Lead BI Analyst — cross-functional analytics across Finance, Sales, HR, Product; standardized metrics and eliminated recurring manual reports.
Santa Clara
2014 — 2015

Juniper Networks

Lead BI Developer — reporting strategy during SAP CRM migration; executive dashboards, SIT/UAT coordination, user enablement.
Sunnyvale
2006 — 2014

Tech Mahindra

Project Lead — cross-functional teams of 8–20 across global delivery programs; client-facing technical lead on enterprise engagements.
India
stack

Full lifecycle — warehouse to LLM.

Core stack · daily use
Snowflake dbt Python SQL Claude API LangChain / LangGraph Airflow Streamlit Sigma
Proficient
PySpark Apache Kafka FastAPI SHAP FAISS / Vector DBs GitHub Actions Docker scikit-learn OpenAI GPT-4 Tableau DuckDB LangSmith
◆ get in touch

Let's build something that ships.

Actively talking to teams building the next layer of the data stack — AI-native analytics, agentic DataOps, fintech data platforms, enterprise LLM infrastructure.

analytics engineering lead AI engineering manager senior analytics engineer