When working with Snowflake as a Data Engineer, you’ll often need to move data between your local machine, cloud storage, and Snowflake tables. This is where Stages come into play.
Stages in Snowflake are storage locations used for loading/unloading data. They allow you to:
Upload local files.
Connect to cloud storage (AWS S3, GCS, Azure).
Manage data movement efficiently without exposing credentials.
Let’s dive into the types of stages and their commands.
User Stage
Every Snowflake user automatically gets a private stage, referenced as @~.
Use Case: Quick testing or uploading files directly from your local machine.
Command:
PUT file://C:/data/employees.csv @~;
LIST @~;
COPY INTO my_table FROM @~ FILE_FORMAT=(TYPE=CSV);
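PUT also accepts a few useful flags when uploading. A quick sketch (the file path and table name are placeholders from the example above):

```sql
-- AUTO_COMPRESS gzips the file during upload (the default);
-- OVERWRITE replaces a previously staged copy of the same file.
PUT file://C:/data/employees.csv @~ AUTO_COMPRESS=TRUE OVERWRITE=TRUE;
```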
Table Stage
Each table automatically gets a dedicated stage, referenced as @%&lt;table_name&gt;.
Use Case: Loading data tied directly to a specific table.
Command:
PUT file://C:/data/employees.csv @%employees;
LIST @%employees;
COPY INTO employees FROM @%employees FILE_FORMAT=(TYPE=CSV);
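COPY INTO supports options that matter in real loads. A hedged variation on the command above, assuming the employees.csv file has a header row:

```sql
COPY INTO employees
FROM @%employees
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
ON_ERROR = 'CONTINUE'   -- skip bad rows instead of aborting the whole load
PURGE = TRUE;           -- delete staged files after a successful load
```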
Named Stage
Named stages are explicitly created objects that can point to internal Snowflake storage or external cloud storage.
Use Case: Production pipelines, cloud integration.
Command:
CREATE STAGE my_stage URL='s3://mybucket/raw_data/'
CREDENTIALS=(AWS_KEY_ID='xxx' AWS_SECRET_KEY='yyy');
LIST @my_stage;
COPY INTO sales FROM @my_stage FILE_FORMAT=(TYPE=CSV);
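Hardcoding keys in CREDENTIALS works for demos, but production pipelines typically use a storage integration so no secrets appear in SQL at all. A minimal sketch, where the integration name, role ARN, and bucket are placeholders:

```sql
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake_access'
  STORAGE_ALLOWED_LOCATIONS = ('s3://mybucket/raw_data/');

-- The stage then references the integration instead of raw keys.
CREATE STAGE my_stage
  URL = 's3://mybucket/raw_data/'
  STORAGE_INTEGRATION = s3_int;
```

This is also what makes the "without exposing credentials" benefit above possible: access is delegated through an IAM role rather than embedded keys.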
Why Use Stages?
✔️ Simplify ingestion from cloud storage.
✔️ Enable reusability across multiple pipelines.
✔️ Secure data access with role-based policies.
✔️ Handle bulk data efficiently.
Advanced Tips
Use REMOVE to clear staged files.
Use DESC STAGE to inspect a stage's properties.
External stages integrate well with Snowpipe for auto-ingestion.
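The tips above in one sketch, reusing the my_stage and sales names from earlier (the pipe name is a placeholder, and AUTO_INGEST additionally requires event notifications configured on the bucket):

```sql
-- Clear staged files; PATTERN is optional and takes a regex.
REMOVE @my_stage PATTERN = '.*\.csv';

-- Inspect stage properties (URL, file format defaults, etc.).
DESC STAGE my_stage;

-- Minimal Snowpipe: auto-ingest new files landing in the stage.
CREATE PIPE sales_pipe
  AUTO_INGEST = TRUE
  AS COPY INTO sales FROM @my_stage FILE_FORMAT = (TYPE = CSV);
```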
✅ Conclusion
Stages are the backbone of Snowflake’s data movement. By mastering user, table, and named stages, you can design robust pipelines, speed up ingestion, and ace interviews.