Articles

A Simple and Scalable Role Setup in Snowflake

As Snowflake grows in popularity, many people are looking to try it but are unsure where to start. Here is a starter setup that is simple enough to get you up and running quickly yet can easily be adapted to any situation if you decide to adopt Snowflake.

Role-based Access Control (RBAC) is a now common methodology for implementing access controls and security in many modern systems and Snowflake does an excellent job of that. In fact, role is a key entity in how Snowflake implements controls and security.

In short, roles are granted privileges and users are granted roles based on need. The user, while logging in to a session, assumes a role that they want to have in effect for that session. The user can also switch roles within a session.

Roles in Snowflake not only have access to data but also to other resources, mainly virtual warehouses (compute). This makes administering user access a pleasure in Snowflake as you need only assign the right role to the user and everything else should fall in place. This is especially handy when onboarding new users to the project. But this can also cause confusion as things can become nuanced very quickly. For example, a power user may have the same access to the data as a regular user but may be granted additional privileges for scaling the warehouse.

This setup should help you get started without getting weighed down by those nuances.

Let’s look at the setup in detail. Consider a project with a database at the core, with diverse types of users accessing the data. ETL/ELT jobs are run periodically to refresh/update data in the database, and all of this is accomplished via the many compute warehouses.

A diagram showing different role setups in Snowflake.

Role

Read-Only: For end users and downstream systems and applications.

This role is designed to have the fewest privileges and is expected to be granted to a wider audience. This role can be further separated based on user group/function and downstream application type to scale things up. On the user warehouse, it has USAGE, OPERATE and MONITOR privileges. The users are not granted the ability to scale the warehouse to keep costs under control. On the database, USAGE and MONITOR privileges are granted.

Read-Write: For applications (such as web applications), ETL systems.

Similarly, this role can be further split up based on usage. In addition to the USAGE, OPERATE and MONITOR, this role also has the MODIFY privilege on the warehouse. This is to enable the ETL job or the application to scale up or down accordingly to process the workload, helping to achieve an optimal cost-performance balance. The database privileges are similar to the read-only role.

DB Owner: For schema and infrastructure management systems (such as Flyway, Terraform).

This role should be granted to the fewest users as it has the most privileges. The ideal use case would be to grant this role to the deployment account only.

Database

We recommend setting the owner to SYSADMIN.

Warehouse

This approach suggests the use of separate warehouses for each workload as opposed to shared warehouses to ensure better performance and avoid bottlenecks. With Snowflake, there is no limit on the number of warehouses that can be created, and the billing is based only on usage.

Resource Monitor

Resource monitors are currently the only way to monitor costs at the warehouse level automatically. Resource monitors are quite useful albeit with a few minor annoyances. Resource monitors can be configured at the Account or a virtual warehouse level to monitor credit usage and send alerts when certain thresholds are reached. Multiple thresholds can be set on a single monitor and, if needed, a compute warehouse can also be suspended if a threshold is reached.

In this approach, a low threshold is set for the user warehouse with the option to suspend it when the threshold is reached. Note that Notify & Suspend is preferred over Notify & Suspend Immediately to allow for user queries to be completed.

A similarly low threshold is set for the admin warehouse, as it is not expected to get heavy usage. However, it is set to Notify when the threshold is reached. The app warehouse is expected to get the most usage, so the thresholds are set higher with the Notify option as above.

Additionally, an account level monitor is recommended to help stay ahead of any upward trends in usage, avoiding surprises at the end of the invoicing period.

In a future article, we will cover the implementation of this setup using Terraform.