Integrating with Lake Formation (Preview)

AWS Lake Formation provides access controls for datasets in AWS Glue and lets you define security policies in one central location that can be shared across multiple tools. Dremio can be configured to consult this service to verify a user's access to the datasets in a Glue source.

The following sections are available:

Requirements

Current Limitations

The following functionality is currently supported with Dremio:

  • Database- and table-level access permissions (row- and column-level permissions are not yet supported)

Lake Formation Workflow

When Lake Formation is properly configured, Dremio adheres to the following workflow each time an end user attempts to access, edit, or query datasets with managed privileges:

  1. In all cases, Dremio access controls are enforced. See Configuring Sources for Lake Formation below for access control recommendations.
  2. Dremio checks each physical dataset (PDS) stored in the Glue source to determine whether it is configured to use Lake Formation for security.
  3. If one or more datasets use Lake Formation, Dremio determines the user and group ARNs to use when checking against Lake Formation.
  4. Dremio queries Lake Formation with those user/group ARNs to determine the user’s access level to the datasets.
  5. If the user has access to every dataset within the query’s scope, the query proceeds. If the user lacks access, the query fails with a permission error.
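The workflow above can be pictured with a small sketch. This is an illustration only, not Dremio's implementation: the `Dataset` class, the `can_query` function, and the two callback parameters (`dremio_allows`, `lf_allows`) are hypothetical names introduced here, and the sketch assumes that a grant to any one of the user's ARNs is enough to pass the Lake Formation check.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a physical dataset (PDS) in a Glue source."""
    name: str
    uses_lake_formation: bool


def can_query(
    datasets: Iterable[Dataset],
    user: str,
    groups: Iterable[str],
    user_prefix: str,
    group_prefix: str,
    dremio_allows: Callable[[str, Dataset], bool],
    lf_allows: Callable[[str, Dataset], bool],
) -> bool:
    datasets = list(datasets)

    # Step 1: Dremio access controls are always enforced.
    if not all(dremio_allows(user, d) for d in datasets):
        return False

    # Step 2: find the datasets that are managed by Lake Formation.
    managed: List[Dataset] = [d for d in datasets if d.uses_lake_formation]
    if not managed:
        return True

    # Step 3: derive the user and group ARNs (assumed here: prefix + name).
    arns = [user_prefix + user] + [group_prefix + g for g in groups]

    # Steps 4 and 5: every managed dataset must be accessible to at least
    # one of the ARNs (an assumption of this sketch); otherwise the query
    # is rejected.
    return all(any(lf_allows(arn, d) for arn in arns) for d in managed)
```

In a real deployment the `lf_allows` callback would correspond to a lookup against the Lake Formation service; here it is left as a plain function so the control flow can be read and tested in isolation.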

Demoing Lake Formation

Demo files and a walkthrough are available to help you test Lake Formation functionality. This demo is intended for customers who do not have all of the requirements listed above preconfigured.

Configuring Sources for Lake Formation

Lake Formation integration depends on mapping user and group names in Dremio to the IAM user and group ARNs used by AWS.

To configure an existing or new Glue source, you must set the following options:

  1. In the settings for an existing source, or while creating a new Amazon Glue Catalog source, navigate to the Advanced Options tab.
  2. Enable Enforce AWS Lake Formation access permissions on datasets.
  3. Fill in the user and group prefix settings as described in the Lake Formation Permissions Reference. For example, if you are using a SAML provider in AWS:
    • User prefix with SAML: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:user/
    • Group prefix with SAML: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:group/

Best Practice:

From the Privileges tab, we recommend enabling the Select privilege for All Users, as this will allow non-admin users to access this source from Dremio.

  4. Click Save.
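To make the user and group prefix settings from step 3 more concrete, the following minimal sketch shows how a Dremio user or group name could be combined with a prefix to form the principal ARN checked against Lake Formation. It assumes plain concatenation of prefix and name. The `build_principal_arn` helper, the account ID `123456789012`, the provider name `ExampleProvider`, and the names `alice` and `analysts` are all placeholders introduced for illustration.

```python
def build_principal_arn(prefix: str, name: str) -> str:
    """Join a configured prefix and a Dremio user/group name (assumed mapping)."""
    return f"{prefix}{name}"


# Placeholder values following the SAML prefix patterns shown in step 3.
USER_PREFIX = "arn:aws:iam::123456789012:saml-provider/ExampleProvider:user/"
GROUP_PREFIX = "arn:aws:iam::123456789012:saml-provider/ExampleProvider:group/"

user_arn = build_principal_arn(USER_PREFIX, "alice")
group_arn = build_principal_arn(GROUP_PREFIX, "analysts")

print(user_arn)
# arn:aws:iam::123456789012:saml-provider/ExampleProvider:user/alice
print(group_arn)
# arn:aws:iam::123456789012:saml-provider/ExampleProvider:group/analysts
```

Because the name is appended directly, each prefix needs to end with its trailing `user/` or `group/` segment, as in the patterns above.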