Enterprise Reference Architecture

AI-Native Data Platform on Snowflake Cortex

A production-grade enterprise architecture blueprint demonstrating how Fortune 500 organizations can deploy governed, scalable AI natively on the Snowflake Data Cloud — from ingestion to inference, without data leaving the platform.

5 architecture layers
100% native Snowflake, no data egress
4 enterprise AI patterns
<30s LLM insight generation
End-to-End Data Flow
🏢 Source Systems → 🔄 Kafka / Fivetran → ❄️ Snowflake Raw → 🔁 dbt Transform → 🧠 Cortex AI → 📊 Sigma / BI → 👤 Executive / Ops
02 — Architecture

5-Layer Reference Architecture


Ingestion
🔄 Kafka → Snowpipe [Native]: Streaming ingestion, auto-ingest triggers
📡 Fivetran / ADF [External]: 300+ connectors, CDC replication
🔁 dbt Core / Cloud [Native]: Modular transforms, lineage, tests
Dynamic Tables [Cortex]: Declarative streaming materialization
Storage
🥇 Medallion Architecture [Native]: Bronze → Silver → Gold zones
🧊 Iceberg Tables [Native]: Open format, multi-engine access
📄 Unstructured Data [Cortex]: Documents, PDFs, images via Stages
🤝 Data Sharing [Native]: Live shares, Marketplace, Clean Rooms
AI Engine
🧠 Cortex LLM Functions [Cortex]: COMPLETE, SENTIMENT, SUMMARIZE, CLASSIFY_TEXT
🔍 Cortex Search [Cortex]: Hybrid vector + keyword retrieval (RAG)
💬 Cortex Analyst [Cortex]: Natural language to SQL, semantic model
📈 ML Functions [Cortex]: FORECAST, ANOMALY_DETECTION, CLASSIFICATION
Governance
Data Quality Gates [Governed]: dbt tests + DMF + Observability
🕸️ Lineage & Catalog [Governed]: Column-level lineage, Horizon catalog
🔐 RBAC + Row Security [Governed]: Role hierarchy, row-access policies
💰 Cost Governance [Governed]: Resource monitors, warehouse sizing
Consumption
📊 Sigma Computing [BI Layer]: Self-service, governed analytics
🖥️ Streamlit in Snowflake [Cortex]: AI-powered apps, no data movement
🔌 API Integration [Native]: Snowpark, FastAPI, webhook consumers
🔔 Alerts & Notifications [Native]: Snowflake Alerts → Slack / Email
03 — Live Demo

Cortex AI in Action

Powered by the Anthropic API, these demos simulate how Snowflake Cortex LLM functions would operate natively in SQL — the same patterns I've architected and deployed at enterprise scale.

SNOWFLAKE.CORTEX.COMPLETE: Executive Summary
Output: AI executive summary. Simulates SNOWFLAKE.CORTEX.COMPLETE(model, prompt).

CORTEX.SENTIMENT + CLASSIFY_TEXT: Unstructured Analysis
Output: structured results. Simulates the SQL: SELECT SNOWFLAKE.CORTEX.SENTIMENT(feedback), SNOWFLAKE.CORTEX.CLASSIFY_TEXT(feedback, categories) FROM customer_feedback

Cortex Analyst Pattern: Natural Language → Snowflake SQL
Output: generated SQL.
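
The sentiment demo can be sketched in Python as a function that renders the SQL text plus a helper that buckets the returned score. This is a minimal sketch, not the demo's actual code: the table `customer_feedback` and column `feedback` come from the page, while the function names and the ±0.3 cut-offs are assumptions chosen for illustration. It relies on SENTIMENT returning a score between -1 and 1.

```python
def sentiment_sql(table: str = "customer_feedback", column: str = "feedback") -> str:
    """Render the SQL text that scores one text column with Cortex SENTIMENT."""
    return (
        f"SELECT {column}, SNOWFLAKE.CORTEX.SENTIMENT({column}) AS sentiment_score "
        f"FROM {table}"
    )


def bucket(score: float, threshold: float = 0.3) -> str:
    """Map a sentiment score in [-1, 1] to a coarse label (threshold is hypothetical)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(sentiment_sql())
    print(bucket(0.8), bucket(-0.5), bucket(0.0))
```

The rendered statement would be run through whatever Snowflake client is in use; only the string building and the bucketing run locally here.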
04 — Governance

Enterprise Data Governance Framework

Governance is not a layer — it's woven through every tier. This framework ensures AI outputs remain trustworthy, auditable, and compliant at enterprise scale.

🔐

Identity & Access

Hierarchical RBAC with functional roles mapped to personas. Row-access policies are attached to tables and masking policies to columns, so enforcement is transparent to consumers.

RBAC + RLS
Zero-trust data access model
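
The combination of a role hierarchy and row-level filtering can be illustrated outside Snowflake with a small simulation. This is a sketch under stated assumptions: the role names, the region mapping, and both function names are hypothetical, and the real enforcement would live in Snowflake role grants and row-access policies rather than in application code.

```python
from typing import Dict, List, Set

# Hypothetical hierarchy: each role inherits from the roles granted to it.
ROLE_GRANTS: Dict[str, List[str]] = {
    "EXEC_VIEWER": ["FINANCE_ANALYST", "OPS_ANALYST"],
    "FINANCE_ANALYST": [],
    "OPS_ANALYST": [],
}

# Hypothetical mapping table a row-access policy might consult.
ROLE_REGIONS: Dict[str, Set[str]] = {
    "FINANCE_ANALYST": {"EMEA"},
    "OPS_ANALYST": {"AMER"},
}


def effective_roles(role: str) -> Set[str]:
    """Return the role plus every role it inherits, walking the hierarchy."""
    seen: Set[str] = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ROLE_GRANTS.get(current, []))
    return seen


def visible_rows(role: str, rows: List[dict]) -> List[dict]:
    """Keep only rows whose region is allowed for the role or its inherited roles."""
    allowed: Set[str] = set()
    for r in effective_roles(role):
        allowed |= ROLE_REGIONS.get(r, set())
    return [row for row in rows if row["region"] in allowed]
```

A consumer querying as `EXEC_VIEWER` would see both the EMEA and AMER rows without changing the query, which is the sense in which the policy is transparent.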
🕸️

Lineage & Cataloging

Automated column-level lineage via Snowflake Horizon. Every AI-generated output traces back to source tables, transformation logic, and LLM model version.

100%
Column-level lineage coverage

Data Quality

dbt schema tests + Snowflake Data Metric Functions run on every pipeline execution. Quality scores surface in Sigma dashboards; failures trigger Slack alerts.

<0.1%
Target null / anomaly rate in Gold layer
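
The null-rate target can be expressed as a simple gate. The sketch below is illustrative only: it computes the rate over an in-memory column, whereas the framework described here would measure it with dbt tests and Data Metric Functions. The names `null_rate` and `passes_gate` are assumptions.

```python
from typing import Iterable, Optional

NULL_RATE_TARGET = 0.001  # the <0.1% Gold-layer target stated above


def null_rate(values: Iterable[Optional[object]]) -> float:
    """Fraction of values that are None; an empty column counts as 0.0."""
    items = list(values)
    if not items:
        return 0.0
    return sum(v is None for v in items) / len(items)


def passes_gate(values: Iterable[Optional[object]], target: float = NULL_RATE_TARGET) -> bool:
    """True when the observed null rate is strictly below the target."""
    return null_rate(values) < target
```

A failing gate is the point at which a pipeline could stop promotion to Gold and raise the alert.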
🤖

AI Output Trust

LLM responses are stored with model version, temperature, and prompt hash. SHAP-style attribution links AI insights to underlying data signals — explainability by default.

Auditable
Every AI response is traceable
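
One way to picture the stored record is a small immutable structure keyed by a prompt hash. This is a sketch, assuming SHA-256 as the hash and a hypothetical `LlmAuditRecord` shape; the page does not specify either, and attribution is not covered here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LlmAuditRecord:
    """Hypothetical audit row: one per LLM response."""
    model_version: str
    temperature: float
    prompt_hash: str
    response: str


def prompt_hash(prompt: str) -> str:
    """Stable hex digest of the prompt text (SHA-256 is an assumption)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def make_record(model_version: str, temperature: float, prompt: str, response: str) -> LlmAuditRecord:
    """Bundle the response with the metadata needed to trace it later."""
    return LlmAuditRecord(model_version, temperature, prompt_hash(prompt), response)
```

Storing the hash rather than the full prompt keeps the audit table compact while still letting identical prompts be grouped and compared across model versions.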
💰

Cost Governance

Resource monitors at the account and warehouse level. Auto-suspend rules, query result caching, and materialization strategies minimize credit consumption.

35–60%
Typical credit reduction via optimization
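
The threshold behavior of a resource monitor can be sketched as a lookup from percent-of-quota to actions. The action names mirror Snowflake's resource monitor triggers (NOTIFY, SUSPEND, SUSPEND_IMMEDIATE); the 75/100/110 percent thresholds and the function name are assumptions for illustration, not recommended values.

```python
from typing import List, Tuple

# Hypothetical thresholds (percent of credit quota) paired with trigger actions.
TRIGGERS: List[Tuple[int, str]] = [
    (75, "NOTIFY"),
    (100, "SUSPEND"),
    (110, "SUSPEND_IMMEDIATE"),
]


def fired_actions(credits_used: float, credit_quota: float) -> List[str]:
    """Return every action whose threshold has been reached for the period."""
    if credit_quota <= 0:
        raise ValueError("credit_quota must be positive")
    pct_used = 100 * credits_used / credit_quota
    return [action for threshold, action in TRIGGERS if pct_used >= threshold]
```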
📜

Compliance & Privacy

Dynamic data masking for PII fields. Differential privacy patterns for sensitive aggregations. SOC2, HIPAA, and GDPR-ready architecture with audit logging enabled.

SOC2
HIPAA · GDPR · CCPA ready
05 — Patterns

Four Enterprise AI Patterns

Reusable architectural patterns I've designed and deployed — each addressing a distinct enterprise AI use case on Snowflake.

01 · GenAI

Executive Narrative Engine

Structured data (KPIs, metrics) flows from Gold layer through a prompt engineering layer to Cortex COMPLETE. Output is governed, stored, and surfaced in executive dashboards. Inspired by production work at Workday.

Cortex COMPLETE · Prompt Templates · Sigma Output · Hallucination Guards
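
A prompt template and a basic hallucination guard for this pattern might look like the following. It is a deliberately naive sketch: the template wording, the KPI names in the usage note, and both function names are hypothetical, and the guard only checks that every number in the generated summary appears verbatim among the supplied KPI values.

```python
import re
from typing import Dict, List, Union

Number = Union[int, float]

PROMPT_TEMPLATE = (
    "You are writing an executive summary. Use only the KPI values listed below.\n"
    "{kpi_lines}\n"
    "Write a short narrative for senior leadership."
)


def build_prompt(kpis: Dict[str, Number]) -> str:
    """Fill the template with one line per KPI from the Gold layer."""
    kpi_lines = "\n".join(f"- {name}: {value}" for name, value in kpis.items())
    return PROMPT_TEMPLATE.format(kpi_lines=kpi_lines)


def unsupported_numbers(summary: str, kpis: Dict[str, Number]) -> List[str]:
    """Return numbers in the summary that do not match any supplied KPI value."""
    allowed = {str(value) for value in kpis.values()}
    found = re.findall(r"\d+(?:\.\d+)?", summary)
    return [n for n in found if n not in allowed]
```

With `kpis = {"revenue_musd": 12.5, "churn_pct": 3}`, a summary that mentions 14.2 would be returned by `unsupported_numbers` and could be held back from the dashboard for review. Exact string matching means a reformatted value such as 12.50 would also be flagged.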
02 · RAG

Enterprise Knowledge Search

Internal documents, contracts, and tickets are chunked and embedded via Cortex Search. Users query in natural language; relevant chunks are retrieved and passed to Cortex COMPLETE — all within Snowflake's security perimeter.

Cortex Search · Vector Embeddings · Hybrid Retrieval · Streamlit UI
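
The chunk, retrieve, and prompt steps can be sketched locally. This is not Cortex Search: plain keyword overlap stands in for its hybrid vector and keyword retrieval, and the chunk size, overlap, and all function names are assumptions chosen to keep the example small.

```python
from typing import List


def chunk_words(text: str, size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into word windows of `size` that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def keyword_score(query: str, chunk: str) -> int:
    """Count distinct query words that also occur in the chunk (case-insensitive)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks with the highest keyword overlap."""
    return sorted(chunks, key=lambda c: keyword_score(query, c), reverse=True)[:k]


def build_rag_prompt(query: str, context: List[str]) -> str:
    """Assemble the prompt that would be passed to the completion step."""
    joined = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

The overlap between neighboring chunks reduces the chance that a sentence split across a boundary is lost to retrieval.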
03 · NL2SQL

Governed Conversational Analytics

Cortex Analyst exposes a semantic model layer — business users ask questions in plain English and receive verified SQL + results. Role-based filtering ensures each persona sees only their authorized data.

Cortex Analyst · Semantic Model · RBAC Passthrough · Verified Queries
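
The verified-query idea can be pictured as a lookup from a normalized question to reviewed SQL. This is a simplified sketch: exact matching after normalization is an assumption and does not reproduce how Cortex Analyst selects verified queries, and the question text, SQL, and `gold.sales` table are hypothetical. No role predicate appears in the SQL because filtering is left to the row-access policies applied when the query runs under the user's role.

```python
import re
from typing import Dict, Optional

# Hypothetical repository of reviewed question-to-SQL pairs.
VERIFIED_QUERIES: Dict[str, str] = {
    "what was total revenue by region": (
        "SELECT region, SUM(revenue) AS total_revenue FROM gold.sales GROUP BY region"
    ),
}


def normalize(question: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", question.lower())
    return " ".join(cleaned.split())


def lookup_verified_sql(question: str) -> Optional[str]:
    """Return reviewed SQL for a known question, or None if nothing matches."""
    return VERIFIED_QUERIES.get(normalize(question))
```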
04 · ML

Predictive Intelligence Layer

Snowflake ML Functions (FORECAST, ANOMALY_DETECTION) run directly on warehouse data — no model export, no MLOps overhead. Predictions land back in governed Gold tables for downstream consumption.

ML FORECAST · Anomaly Detection · Feature Engineering · Zero Egress
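
The fit-on-history, score-new-rows shape of this pattern can be shown with a z-score check. This is a stand-in technique, not the algorithm behind ANOMALY_DETECTION; the function name and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_anomalies(history: Sequence[float], new_values: Sequence[float],
                   z_threshold: float = 3.0) -> List[bool]:
    """Flag each new value whose z-score against the history exceeds the threshold."""
    if len(history) < 2:
        raise ValueError("history needs at least two points")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Constant history: anything different from it is flagged.
        return [v != mu for v in new_values]
    return [abs(v - mu) / sigma > z_threshold for v in new_values]
```

In the architecture described here, the boolean flags would be written back to a governed Gold table alongside the scored rows.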