Here's an article which begins by stating the obvious - Snowflake's Search Optimization Service (SOS, no pun intended - from me anyway, I can't speak for Snowflake!) can improve the speed of your searches. Be honest now, you'd never have guessed that from its name!
The SOS works on a table-by-table basis and has to be enabled in different ways, depending upon the data types you want to optimize. It's only available in Snowflake Enterprise Edition, and you'll need to perform maintenance at certain times, e.g. if you make changes to certain columns in a search-optimized table.
How does the SOS Work?
The SOS runs in the background (this is known as the maintenance service) and creates what Snowflake calls a search access path. This tracks the values held within table columns. This tracking allows the SOS to determine which micro-partitions can be ignored when retrieving data for a query.
When you enable the SOS on a table, the maintenance service has to build a search access path. How long this takes depends upon the table, but happily access to the table is not affected whilst the search access path is being built. The SOS doesn't become active until the search access path has been completely built. Some queries may run slower whilst the search access path is being built.
What Queries does the SOS Affect?
The SOS doesn't improve the performance of every query. It improves the performance of certain lookups and analytical queries.
- Queries with columns other than the primary cluster key
- Queries that usually run for a few seconds or longer before optimization is applied
- Queries with a filter operation on 100,000 distinct values or more
What Columns can I use with the SOS?
The Search Optimization Service works with columns utilising these data types:
- Fixed-point numerics, e.g. INT and NUMERIC
- DATE
- TIME
- TIMESTAMP
- VARCHAR
- BINARY
- VARIANT
- OBJECT
- ARRAY
- GEOGRAPHY
You can use the Search Optimization Service to improve the performance of views and queries that utilise joins. Search Optimization is applied to individual tables, so these operations pick up the query improvements by default.
Supported Predicate Types
Snowflake has some great documentation on the types of predicates the SOS supports.
- Lookup queries using equality and IN, which return a small number of rows
- Text search queries using the SEARCH and SEARCH_IP functions
- Substring and regular expression queries
- Semi-structured data queries
- Geospatial queries using GEOGRAPHY objects
- Queries using AND and OR
Unsupported Queries
As great as Search Optimization is, it doesn't support every scenario. External tables, materialized tables, and concatenated columns are a few of the query types that don't benefit from SO. Any query that uses Time Travel also fails to benefit from SO. SO only operates on active data.
Enabling the SOS
There are a number of different optimizations you can enable for the SOS. However, all of these are enabled or disabled using the command:
ALTER TABLE ... ADD SEARCH OPTIMIZATION
You will need OWNERSHIP privileges on the table or the ADD SEARCH OPTIMIZATION privilege on the schema that contains the table to use this command. You can enable the SOS for certain columns, or for the entire table.
Costs
As you might imagine, the SOS does not come for free! The creation of the search access path leads to more storage being used. According to Snowflake, this works out at roughly a quarter of the table's original size. More compute is also needed, initially for creating the search access path, but also for maintaining the search access path. Tables which have both the SOS and automatic clustering enabled can also increase compute costs.
Snowflake will bill in 1-second increments, and they provide a handy interface for viewing SOS costs.
There are some ways of reducing SOS costs. Deleting data less frequently can help, as it reduces the amount of compute needed to update the search access path. Similarly, batching DML statements like INSERT, UPDATE and MERGE can reduce compute cost too.
Summary
Snowflake's Search Optimization Service can help you improve the performance of key queries within your Snowflake warehouses. If you're keen on trying this technology out for yourself, make sure you have access to Snowflake Enterprise Edition, then take a look at Snowflake's Getting Started article. Hopefully you won't need to send a SOS to speed up your queries!