Snowflake recommends that roles are used for authorizing access to database objects instead of allowing direct access to objects by individual users. Roles may be granted to other roles, and this enables the Snowflake administrator to create access control hierarchies that act as building blocks for creating an overall access control strategy.
There is some excellent information out there about Snowflake Role Based Access Control or RBAC for short that can be used as a starting point to learn the basics, such as this document on Snowflake Community and a series of posts written by John Ryan here, here and here.
In this post I want to summarize the concepts of role based access control and then point out some additional considerations when implementing it.
Here is a summary of best practices when setting up role based access control (RBAC):
- Define a set of functional roles that will be granted to users according to how they will be using the database, for example data analyst, data engineer, etc.
- Define a set of access roles with a common set of privileges, for example full access to a given schema, or read-only access to a given schema
- Always grant access roles to functional roles and then grant functional roles to users (do not grant access roles to other access roles or access roles to users)
- Use managed access schemas to prevent object owners from granting access to other roles at their discretion and to centralize grant management, then define future grants on all object types in the schema to simplify grant management (meaning that all newly created objects in the schema will receive the same grants to roles as currently defined)
- Additional roles are needed for users who will do cross-schema and cross-database administration tasks, such as cloning all tables from the development environment to the test environment and similar
To illustrate better, consider the following example as depicted in the diagram below:
We have defined four functional roles
- Data loader, loads data that is extracted from source systems
- Data engineer, performs data transformations
- Data analyst, analyzes data in all schemas
- BI user, reads data from the data warehouse and data marts
In the development database, designated by DEV, we have defined some schemas:
- EX, data that is extracted from source schemas
- STG, data that is staged to be loaded into the data warehouse
- DWH, data warehouse
- BI, data marts
- WRK, schema that can be used as a sandbox for temporary work
Each schema has two corresponding access roles, one that allows read only access to all objects in the schema and one that allows full access to all objects in the schema.
Each functional role is assigned the corresponding access role based on their requirements:
- Data loader has full access to the EX schema because they write data that has been extracted from source systems to this schema
- Data engineer has full access to all schemas because they perform data transformations between the different schemas
- Data analyst has read only access to all schemas because they analyze data at all stages of the ETL process
- BI user has read only access to the DWH and BI schemas because they create reports from the data warehouse and the data marts
- Full access to the WRK schema has been granted to the data engineer and data analyst functional roles because they are the most intensive users and typically require a sandbox to create temporary objects
Cross-database access
The previous example shows the roles that are defined in the development database. Similar roles would be created in other databases such as test, UAT, production and so on.
To allow cross-database access, for example to clone tables from the development database to the test database, an additional role is needed that has access to both databases. A feasible solution is to create a role named DATA_ENGINEER (without the database prefix) and grant functional roles from each of the databases to this new role as depicted on the diagram below:
While it may be tempting to simply grant roles DEV_DATA_ENGINEER and TST_DATA_ENGINEER to the new role DATA_ENGINEER to save time, it is recommended to rather grant the individual schema access roles to the new role to avoid a role hierarchy, which may, over time, when additional roles are added, become increasingly more difficult to manage.
Grants on object types
When granting future privileges on object types in managed access schemas, be sure to check which object types are available and include those that will be used in the schema. As Snowflake is constantly evolving, new object types may appear, and they may be missed when defining future grants. At the time of writing this post (November 2021), the following object types are available as per Snowflake documentation:
TABLE | EXTERNAL TABLE | VIEW | MATERIALIZED VIEW | MASKING POLICY | ROW ACCESS POLICY | SEQUENCE | FUNCTION | PROCEDURE | FILE FORMAT | STAGE | PIPE | STREAM | TASK
Views that read from tables in other schemas
Snowflake documentation states that “A user who has SELECT privilege on a view does not also need SELECT privilege on the tables that the view uses.” Technically, this means that users who have access to a given schema are able to view data from views that select data from a different schema to which the user doesn’t have access.
In RBAC, an additional step is needed to allow users to select data from views that use tables from a different schema. Because all schemas are created with managed access, all objects in each schema are owned by the corresponding access role. Additional access must be granted to the role that owns the view to allow users to select data from tables in a different schema. This is better illustrated with an example:
- In the development environment DEV, in schema STG we have a staging table named STG.CUSTOMER_STAGE. This table is owned by the access role DEV_STG_FULL.
- In schema DWH we have a view named DWH.V_CUSTOMER which selects from STG.CUSTOMER_STAGE. This view is owned by the access role DEV_DWH_FULL.
- A user with the role DEV_BI_USER has been granted access to the DWH schema but not to the STG schema. When the user selects data from the view DWH.V_CUSTOMER, they will get an error stating that the view cannot be expanded because schema STG does not exist or not authorized.
- In RBAC, the role that owns the view, in our case this is DEV_DWH_FULL must be granted select privilege on schema STG where the table that is used in the view resides:
grant role DEV_STG_READ to role DEV_DWH_FULL;
After this grant is applied, the user can successfully select data from any view in the schema DWH that selects data from any table in the schema STG. Additional grants are needed when views select data from tables in other schemas.
Summary
Setting up RBAC usually requires careful up-front planning and design of the roles and database objects that will be used. Scripts can then be written to create the objects and roles in each environment, paying attention to include all the steps as outlined at the beginning. Although the scripts can become quite elaborate, after successfully setting up all the objects and roles, it becomes straightforward to maintain. When new users are added, they are assigned the respective functional role. When a new functional role is needed, it is created and the respective access roles granted. When done correctly, RBAC is a well organized approach to manage user access and privileges in Snowflake.