Snowflake has become the go-to data cloud for enterprises due to its scalability, flexibility, and support for diverse data formats. Its cloud-native architecture eliminates much of the heavy lifting that traditional data warehouses required, such as index management or vacuuming. However, this does not mean Snowflake is automatically optimized in every situation. As workloads grow, performance tuning becomes critical for keeping query response times fast and ensuring costs remain predictable.
In this blog, we’ll explore 8 performance tuning techniques every Snowflake data engineer should know. These strategies cover everything from warehouse management and caching to query optimization and storage best practices. By the end of this blog, you’ll have a solid framework for tuning Snowflake environments without overcomplicating operations.
Snowflake uses virtual warehouses for compute, and choosing the right warehouse size can make or break performance. Unlike on-premise systems, you don’t need to overprovision: you can scale up (increase warehouse size) for large, complex queries, or scale out (add clusters) for high concurrency. Each step up in warehouse size roughly doubles both compute capacity and credit consumption per hour, so right-sizing matters for cost as well as speed.
Best Practice: Enable auto-suspend and auto-resume. This prevents warehouses from consuming credits while idle and restarts them seamlessly when queries arrive.
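As a minimal sketch, auto-suspend and auto-resume can be set with a single `ALTER WAREHOUSE` statement (the warehouse name `etl_wh` and the 60-second timeout are illustrative values, not from this post):

```sql
-- Suspend the warehouse after 60 seconds of inactivity and resume it
-- automatically when the next query arrives. Name and timeout are examples.
ALTER WAREHOUSE etl_wh SET
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```

A short timeout suits bursty ad-hoc workloads; steady ETL pipelines may warrant a longer one to avoid losing the warehouse’s local cache between closely spaced jobs.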
Snowflake caches query results for 24 hours. If the same query text is rerun and the underlying data hasn’t changed, Snowflake returns the result instantly from the result cache without consuming any warehouse compute.
Use Case: Dashboards and scheduled reports that rerun identical queries throughout the day can be served straight from the cache instead of re-executing on the warehouse.
Pro Tip: Educate business users to rely on cached results instead of refreshing dashboards unnecessarily. This dramatically reduces both costs and latency.
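As a sketch of how to see this in practice (the `orders` table comes from the examples later in this post; `USE_CACHED_RESULT` is Snowflake’s real session parameter for controlling result-cache reuse):

```sql
-- First run executes on the warehouse; an identical rerun within
-- 24 hours is served from the result cache at no compute cost.
SELECT region, SUM(total_amount) AS revenue
FROM orders
GROUP BY region;

-- When benchmarking true query performance, disable result-cache
-- reuse for the current session so every run actually executes:
ALTER SESSION SET USE_CACHED_RESULT = FALSE;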
Snowflake organizes data into micro-partitions, each holding roughly 50–500 MB of uncompressed data (on the order of 16 MB compressed). Query performance depends heavily on how efficiently partitions are pruned.
If your queries often filter on columns such as DATE, CUSTOMER_ID, or REGION, defining a clustering key can reduce the number of partitions scanned.
Example:
ALTER TABLE sales CLUSTER BY (order_date);
Best Practice: Run SYSTEM$CLUSTERING_INFORMATION to evaluate clustering quality. Periodically recluster large tables if pruning efficiency drops.
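Both system functions below are standard Snowflake; here they evaluate the `sales` table clustered on `order_date` from the example above:

```sql
-- Returns a JSON summary of clustering quality (average depth,
-- partition overlap, histogram) for the table and key.
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(order_date)');

-- A quicker single metric: average clustering depth.
-- Lower is better; a rising value suggests it is time to recluster.
SELECT SYSTEM$CLUSTERING_DEPTH('sales', '(order_date)');
```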
Snowflake stores data in a columnar format, which means scanning fewer columns reduces I/O and speeds up queries.
Inefficient:
SELECT * FROM orders;
Efficient:
SELECT customer_id, order_date, total_amount
FROM orders
WHERE order_date >= '2025-01-01';
Best Practice: Avoid SELECT * in production queries, especially on wide tables. Not only does it increase query costs, but it also makes pipelines brittle when new columns are added.
Snowflake automatically prunes micro-partitions when queries contain selective filters. However, the efficiency of pruning depends on how you write conditions.
Less Efficient:
WHERE YEAR(order_date) = 2025;
Wrapping the column in a function prevents Snowflake from using micro-partition min/max metadata, so it must evaluate the expression across all partitions instead of pruning them up front.
More Efficient:
WHERE order_date BETWEEN '2025-01-01' AND '2025-12-31';
Best Practice: Avoid wrapping filter columns with functions (YEAR(), CAST(), etc.). Instead, rewrite queries using direct comparisons or ranges.
If your queries repeatedly perform complex joins or aggregations, consider using materialized views. These views precompute and store query results, drastically reducing runtime for recurring workloads.
Example:
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT order_date, SUM(total_amount) AS daily_sales
FROM orders
GROUP BY order_date;
Use Case: Perfect for BI dashboards and real-time reporting where performance is critical.
Best Practice: Monitor refresh costs. Materialized views update automatically as source tables change, so reserve them for queries where the performance benefit outweighs the cost.
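One way to monitor those refresh costs is through the `ACCOUNT_USAGE` share (this is a sketch; the view and columns are from Snowflake’s standard `MATERIALIZED_VIEW_REFRESH_HISTORY` view, and data there lags real time by a few hours):

```sql
-- Credits consumed by automatic materialized view maintenance
-- over the last 30 days, most expensive views first.
SELECT table_name,
       SUM(credits_used) AS refresh_credits
FROM snowflake.account_usage.materialized_view_refresh_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY table_name
ORDER BY refresh_credits DESC;
```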
Concurrency spikes can cause queries to queue. Instead of increasing warehouse size, configure a multi-cluster warehouse that automatically adds clusters when demand increases.
Best Practice: Run the warehouse in auto-scale mode with a minimum cluster count of 1 and a maximum sized for your peak concurrency, so additional clusters spin up only while queries are actually queuing.
This approach prevents unnecessary spending for oversized warehouses while ensuring smooth performance for multiple users.
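A minimal sketch of that configuration (the warehouse name `bi_wh` and the cluster limits are illustrative):

```sql
-- Auto-scale mode: Snowflake adds clusters (up to MAX_CLUSTER_COUNT)
-- when queries queue, and retires them as demand subsides.
ALTER WAREHOUSE bi_wh SET
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD';
```

The `STANDARD` policy favors starting clusters quickly to minimize queuing; `ECONOMY` waits longer before scaling out to conserve credits.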
Snowflake’s Query Profile tool provides a detailed, visual execution plan for each query, highlighting the most expensive operators, the volume of data scanned versus pruned, and any spilling to local or remote storage.
Best Practice: Make it routine for engineers to analyse Query Profiles. This helps uncover whether queries are scanning too many partitions, performing unnecessary joins, or moving large amounts of data between nodes.
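To decide which queries deserve a look in Query Profile, engineers can start from the standard `ACCOUNT_USAGE.QUERY_HISTORY` view (a sketch; `total_elapsed_time` is in milliseconds):

```sql
-- Recent queries most worth profiling: longest-running statements,
-- with scan volume and local spilling as supporting signals.
SELECT query_id,
       total_elapsed_time / 1000 AS elapsed_seconds,
       bytes_scanned,
       bytes_spilled_to_local_storage
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20;
```

Each `query_id` returned here can be opened directly in the Query Profile view in Snowsight.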