Top 20 Snowflake Interview Questions & Answers (2025)
Below are the 20 most commonly asked Snowflake interview topics — explained clearly and practically.
1. What is Snowflake’s architecture and why is it unique?
Snowflake uses a multi-cluster shared data architecture, separating compute, storage, and cloud services.
- Storage sits in cloud object storage (S3/GCS/Azure Blob).
- Compute uses virtual warehouses.
- Cloud services handle security, metadata, optimization, etc.
This separation allows independent scaling, which is Snowflake’s biggest advantage.
2. Explain Virtual Warehouses and how scaling works.
A Virtual Warehouse is compute used for query processing.
Supports:
- Scaling Up → increase size (S → M → L)
- Scaling Out → multi-cluster (auto-scale for concurrency)
Warehouses can auto-suspend and auto-resume to save credits.
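The sizing and auto-suspend behavior above can be sketched in DDL (the warehouse name `etl_wh` and all values are illustrative):

```sql
-- Illustrative warehouse with auto-suspend/resume and multi-cluster scale-out
CREATE WAREHOUSE IF NOT EXISTS etl_wh
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 60          -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE
  MIN_CLUSTER_COUNT = 1      -- scaling out: up to 3 clusters under concurrency
  MAX_CLUSTER_COUNT = 3;

-- Scaling up: resize the same warehouse in place
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'MEDIUM';
```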
3. Difference between Cloning vs Zero-Copy Cloning.
Snowflake always uses zero-copy cloning — meaning it creates a metadata pointer, not a physical copy of data.
You can clone tables, schemas, databases instantly without extra storage.
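A minimal sketch of cloning (object names are illustrative):

```sql
-- Zero-copy clones: metadata-only, created instantly, no extra storage at creation
CREATE TABLE orders_dev CLONE orders;
CREATE DATABASE analytics_dev CLONE analytics;

-- Clones can also be taken as of a Time Travel point (offset in seconds)
CREATE TABLE orders_yesterday CLONE orders AT (OFFSET => -86400);
```

Storage is only consumed later, as the clone and the original diverge.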
4. What is Time Travel?
Allows accessing historical data (deleted, updated, or dropped objects) within a retention window of 0–90 days (default is 1 day; up to 90 days on Enterprise edition and above).
Syntax:
SELECT * FROM my_table AT (OFFSET => -3600); -- state as of 1 hour ago (offset in seconds)
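Besides an offset, Time Travel also accepts a timestamp or a statement ID, and dropped tables can be restored (table name is illustrative):

```sql
-- At a specific timestamp
SELECT * FROM my_table AT (TIMESTAMP => '2025-01-15 08:00:00'::TIMESTAMP_LTZ);

-- Just before a given statement ran (take the query ID from query history)
SELECT * FROM my_table BEFORE (STATEMENT => '<query_id>');

-- Recover a dropped table within the retention window
UNDROP TABLE my_table;
```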
5. How does Fail-safe work?
Fail-safe provides a 7-day recovery window for permanent tables after Time Travel expires.
It’s managed by Snowflake internally and cannot be queried by users.
6. What is Micro-Partitioning?
Snowflake stores data in immutable micro-partitions (50–500 MB of uncompressed data), automatically compressed and stored in columnar format, with min/max metadata tracked per partition.
This enables efficient pruning → better performance and lower cost.
7. Explain Caching Layers in Snowflake.
Snowflake has 3 major caches:
- Result Cache → returns results instantly (if query repeated)
- Metadata Cache → micro-partition boundary metadata
- Local Disk Cache → cached data on warehouse nodes
The result and metadata caches add no compute cost; the local disk cache speeds up repeat scans while a warehouse is running.
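One practical knob worth knowing here: the result cache can be toggled per session, which is useful when benchmarking warehouse performance:

```sql
-- Result cache returns repeated identical queries without using compute;
-- disable it per session so benchmarks measure actual warehouse work
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```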
8. External vs Internal Stages.
- Internal Stage → inside Snowflake (named stage, table stage)
- External Stage → cloud storage (S3, GCS, Azure)
Used for ingesting or unloading files.
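A sketch of both stage types (bucket URL, file path, and the storage integration `my_s3_int` are assumptions for illustration):

```sql
-- Internal named stage (lives inside Snowflake)
CREATE STAGE my_int_stage;
PUT file:///tmp/data.csv @my_int_stage;   -- upload from a local machine (SnowSQL)

-- External stage over cloud storage
CREATE STAGE my_ext_stage
  URL = 's3://my-bucket/events/'
  STORAGE_INTEGRATION = my_s3_int;        -- assumes a preconfigured integration

-- Ingest from the stage into a table
COPY INTO events FROM @my_ext_stage FILE_FORMAT = (TYPE = 'CSV');
```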
9. What is a Stream?
Streams track changes (CDC) on tables — inserts, updates, deletes.
Used with Tasks for incremental ingestion.
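A minimal CDC sketch (table names are illustrative):

```sql
CREATE STREAM orders_stream ON TABLE orders;

-- SELECT * on a stream returns the changed rows plus METADATA$ACTION,
-- METADATA$ISUPDATE, and METADATA$ROW_ID; consuming it in a DML statement
-- advances the stream's offset so the same changes are not read twice
INSERT INTO orders_changes SELECT * FROM orders_stream;
```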
10. What is a Task?
Tasks schedule SQL execution (like cron jobs).
Used to automate pipelines, e.g. incremental loads with Streams.
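The Streams + Tasks pattern from the previous two questions can be sketched like this (names are illustrative):

```sql
-- Runs every 5 minutes, but only does work when the stream has new rows
CREATE TASK load_orders
  WAREHOUSE = etl_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
AS
  INSERT INTO orders_changes SELECT * FROM orders_stream;

ALTER TASK load_orders RESUME;  -- tasks are created in a suspended state
```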
11. Explain Snowpipe and how it works.
Snowpipe enables continuous data ingestion using:
- Event notifications (S3, GCS, Azure)
- COPY statements triggered automatically
It loads data in micro-batches.
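A minimal pipe definition (stage and table names are illustrative):

```sql
-- AUTO_INGEST = TRUE wires the pipe to cloud event notifications (e.g., S3 events),
-- so new files landing in the stage are loaded automatically in micro-batches
CREATE PIPE events_pipe AUTO_INGEST = TRUE AS
  COPY INTO events
  FROM @my_ext_stage
  FILE_FORMAT = (TYPE = 'JSON');
```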
12. Role-Based Access Control (RBAC)
Snowflake follows hierarchical RBAC:
- Users
- Roles
- Privileges
- Objects
Best practice: grant privileges to roles, then grant roles to other roles or to users; never grant privileges directly to users.
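The privileges → role → user chain looks like this (role, database, and user names are illustrative):

```sql
CREATE ROLE analyst;

-- Privileges go to the role, not the user
GRANT USAGE ON DATABASE sales TO ROLE analyst;
GRANT USAGE ON SCHEMA sales.public TO ROLE analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA sales.public TO ROLE analyst;

-- The role (not individual privileges) goes to the user
GRANT ROLE analyst TO USER alice;
```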
13. Transient Tables vs Temporary Tables.
- Temporary Table → session-level, auto-deleted
- Transient Table → persists across sessions, but has no Fail-safe (and at most 1 day of Time Travel)
Used for staging tables to reduce cost.
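Both are one keyword away from a normal table (table and column names are illustrative):

```sql
-- Session-scoped: dropped automatically when the session ends
CREATE TEMPORARY TABLE stg_session_load (id INT, payload VARIANT);

-- Persists across sessions, but skips Fail-safe, reducing storage cost
CREATE TRANSIENT TABLE stg_daily_load (id INT, payload VARIANT);
```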
14. Secure View vs Normal View.
- A normal View's definition is visible to users (e.g., via GET_DDL) and benefits from all optimizer rewrites
- A Secure View hides the underlying logic and blocks optimizations that could indirectly expose filtered-out rows
Used for compliance and data sharing.
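Creating one is a single keyword (view and column names are illustrative):

```sql
-- Non-owners cannot see the view definition, and the optimizer avoids
-- rewrites that could leak rows the view is supposed to hide
CREATE SECURE VIEW customer_summary AS
SELECT customer_id, region, total_spend
FROM customers;
```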
15. What is Query Profile?
A visualization tool that shows:
- data scanned
- partitions pruned
- execution steps
- bottlenecks
Used for performance tuning.
16. What are Clustering Keys? When should you use them?
Snowflake micro-partitions data automatically, but it does not maintain any sort order unless you define a clustering key. Explicit clustering keys help when:
- Large tables (>1B rows)
- Poor pruning
- Uneven access patterns
Don’t overuse — clustering costs credits.
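A sketch of defining and inspecting a clustering key (table and column names are illustrative):

```sql
-- Cluster on columns that appear in selective filters
ALTER TABLE events CLUSTER BY (event_date, customer_id);

-- Inspect clustering quality before paying for background reclustering
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date)');
```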
17. How does Snowflake charge customers?
Snowflake pricing includes:
- Storage (TB/month)
- Compute (credits per hour)
- Cloud Services (small overhead)
- Data Transfer (for cross-region)
Compute is the major cost.
18. External Tables — where to use them?
Used when querying data directly from cloud storage without loading into Snowflake.
Useful for:
- Data lakes
- Raw logs
- Large unstructured datasets
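A minimal external table sketch (stage path and names are illustrative; `VALUE` is the raw row exposed as a VARIANT):

```sql
-- Queries files in place over a stage; data is never loaded into Snowflake
CREATE EXTERNAL TABLE raw_logs (
  line VARCHAR AS (VALUE:c1::VARCHAR)   -- first CSV column, cast from the VARIANT
)
LOCATION = @my_ext_stage/logs/
FILE_FORMAT = (TYPE = 'CSV')
AUTO_REFRESH = TRUE;                    -- pick up new files via event notifications
```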
19. What is a Materialized View?
A Materialized View stores precomputed results.
Best for frequently-run queries.
Limitations:
- Costly to maintain
- Does not support all SQL (single source table only; no joins, and only a subset of aggregate functions)
- Not ideal for high-churn tables
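A typical single-table aggregate, which is the shape materialized views handle well (names are illustrative):

```sql
-- Precomputed and kept up to date by Snowflake's background maintenance
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT order_date, SUM(amount) AS revenue
FROM orders
GROUP BY order_date;
```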
20. Best practices for Snowflake performance tuning
- Use proper warehouse size
- Avoid SELECT *
- Load semi-structured data using VARIANT
- Use result cache wisely
- Apply clustering only where needed
- Write filters that enable micro-partition pruning (e.g., on date or clustering columns)
- Prefer streams + tasks over manual ETL logic