Version: current [25.0.x]

Lake Formation Demo

This page is intended for users who wish to test Lake Formation functionality but have not configured the requirements for the Lake Formation integration.

The following steps outline the process of configuring Dremio to work with an identity provider (IdP), setting up permissions in Lake Formation, and connecting to an AWS Glue source.

  1. Bootstrapping a Basic IdP
    1. Creating a Virtual Machine (VM) for the IdP
    2. Starting Docker Images
    3. Bootstrapping OpenLDAP
    4. Synchronizing Keycloak Users with OpenLDAP
    5. Synchronizing Keycloak Groups with OpenLDAP
  2. Configuring Dremio's LDAP Connection
    1. Stopping Dremio
    2. Editing dremio.conf
    3. Creating the ad.json File
    4. Verifying Dremio Logins
  3. Connecting the IdP to AWS with SAML
  4. Setting Permissions in Lake Formation
    1. Creating a Table
    2. Adding Permissions
  5. Connecting Dremio to the Source
  6. Testing in Dremio
  7. Wrapping Up
note

This guide uses placeholders such as <password> that must be replaced with values that are unique to your organization. Make sure to choose secure passwords.

1.0 Bootstrapping a Basic IdP

note

This section is intended for users who are not already using an IdP service that supports LDAP and SAML, such as Azure Active Directory (Azure AD). If you already have an IdP service in place, proceed to Configuring Dremio's LDAP Connection.

1.1 Creating a Virtual Machine (VM) for the IdP

Both Dremio and AWS need to communicate with the IdP server, so we recommend performing these steps on an externally accessible machine. A small cloud VM is a good option, such as an e2-medium (2 vCPU, 4 GB RAM) Compute Engine instance from Google Cloud Platform (GCP).

Clone the Dremio Lake Formation Demo repository onto the VM.

1.2 Starting Docker Images

Start Docker using the provided docker-compose.yaml file:

docker compose up -d
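
Before moving on, it can help to confirm that the containers are up and that Keycloak is answering. The following is a minimal sketch, assuming curl is installed and that Keycloak is published on port 8080 under /auth, as in the URL used later in this guide. The function names keycloak_url and wait_for_keycloak are hypothetical helpers and are not part of the demo repository.

```shell
# Hypothetical helpers; not part of the demo repository.

# Print the Keycloak base URL for a host (port 8080 and /auth as used in this guide).
keycloak_url() {
  printf 'http://%s:8080/auth\n' "$1"
}

# Poll Keycloak until it responds or the attempts run out.
# $1 = host, $2 = maximum attempts (default 30), 5 seconds apart.
wait_for_keycloak() {
  url=$(keycloak_url "$1")
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "Keycloak is reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "Keycloak did not respond at $url" >&2
  return 1
}

# Usage (run from the cloned repository directory):
#   docker compose ps            # list the demo containers and their state
#   wait_for_keycloak localhost
```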

1.3 Bootstrapping OpenLDAP

Run the following commands:

docker exec -it dremio-lake-formation-demo_openldap_1 bash
# Inside the container:
cd /bootstrap
./bootstrap.sh
exit
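
To check that the bootstrap script populated the directory, you can query OpenLDAP directly. This is a sketch under stated assumptions: the OpenLDAP client tools (ldapsearch) are installed on the machine you run it from, port 1389 of the container is reachable from there, and the bind DN and user base DN match the values used later in this guide. The helper names ldap_url and list_ldap_users are hypothetical.

```shell
# Hypothetical helpers; assume ldapsearch is installed and port 1389 is reachable.

# Print the LDAP URL for a host, using port 1389 as in this guide.
ldap_url() {
  printf 'ldap://%s:1389\n' "$1"
}

# List the cn of every inetOrgPerson entry under ou=users.
# $1 = LDAP host, $2 = bind password for cn=admin,dc=example,dc=org
list_ldap_users() {
  # -x: simple authentication, -LLL: plain LDIF output without comments
  ldapsearch -x -LLL \
    -H "$(ldap_url "$1")" \
    -D "cn=admin,dc=example,dc=org" \
    -w "$2" \
    -b "ou=users,dc=example,dc=org" \
    "(objectClass=inetOrgPerson)" cn
}

# Usage: count the user entries that were created
#   list_ldap_users localhost '<password>-ldap' | grep -c '^cn:'
```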

1.4 Synchronizing Keycloak Users with OpenLDAP

  1. Open your browser and enter the following URL for Keycloak: http://<KEYCLOAK_IP_OR_HOSTNAME>:8080/auth.
  2. Click Administration Console.
  3. Log in using the Username admin and Password <password>-keycloak.
  4. In the Realm dropdown menu in the top left of the sidebar, select Master.
  5. Hover over the Realm dropdown menu again and click Add realm.
  6. Name the realm dremio.
  7. Click Create.
  8. Click User Federation in the sidebar.
  9. In the Add provider... dropdown menu, select ldap.
  10. Apply the following settings:
    • Vendor: Other
    • Username LDAP attribute: cn
    • RDN LDAP attribute: cn
    • UUID LDAP attribute: uid
    • User Object Classes: inetOrgPerson, organizationalPerson
    • Connection URL: ldap://openldap:1389
    • Users DN: ou=users,dc=example,dc=org
    • Bind DN: cn=admin,dc=example,dc=org
    • Bind Credential: <password>-ldap
  11. Click Test connection and Test authentication to validate the settings.
  12. Click Save.
  13. Click Synchronize all users.
  14. Set periodic Sync Settings, if desired.

1.5 Synchronizing Keycloak Groups with OpenLDAP

  1. From the LDAP User Federation page, click the Mappers tab.
  2. Click Create.
  3. Apply the following settings:
    • Name: group-ldap-mapper
    • Mapper Type: group-ldap-mapper
    • LDAP Groups DN: ou=groups,dc=example,dc=org
    • Group Name LDAP Attribute: cn
    • Group Object Classes: groupOfNames
  4. Click Save.
  5. Click Sync LDAP Groups to Keycloak.

2.0 Configuring Dremio's LDAP Connection

2.1 Stopping Dremio

Use the following command to stop the Dremio service:

bin/dremio stop

2.2 Editing dremio.conf

Add the following settings to dremio.conf:

services.coordinator.web.auth.type: "ldap"
services.coordinator.web.auth.config: "ad.json"
note

The services.coordinator.web.auth.config configuration property replaces services.coordinator.web.auth.ldap_config, which is deprecated.

2.3 Creating the ad.json File

Create a file named ad.json and put it in the same directory as dremio.conf. Copy and paste the following into the ad.json file:

note

In Dremio 24+, bindPassword can be encrypted using the dremio-admin encrypt CLI command.

{
  "connectionMode": "PLAIN",
  "servers": [
    {
      "hostname": "<LDAP_IP_OR_HOSTNAME>",
      "port": 1389
    }
  ],
  "names": {
    "baseDN": "dc=example,dc=org",
    "bindDN": "cn=admin,dc=example,dc=org",
    "bindPassword": "<password>-ldap",
    "userFilter": "(&(objectClass=inetOrgPerson))",
    "userAttributes": {
      "baseDNs": [
        "ou=users,dc=example,dc=org"
      ],
      "searchScope": "SUB_TREE",
      "firstname": "cn",
      "id": "cn",
      "lastname": "sn",
      "email": "cn"
    },
    "userGroupRelationship": "GROUP_ENTRY_LISTS_USERS",
    "groupEntryListsUsers": {
      "userEntryUserIdAttribute": "dn",
      "groupEntryUserIdAttribute": "member"
    },
    "groupDNs": [
      "CN={0},ou=groups,dc=example,dc=org"
    ],
    "groupFilter": "(objectClass=groupOfNames)",
    "autoAdminFirstUser": true
  }
}
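
Because this guide relies on placeholders in angle brackets, a quick scan of ad.json before restarting Dremio can catch values that were not replaced. The sketch below uses only grep; find_placeholders is a hypothetical helper name, and the pattern assumes placeholders consist of letters and underscores between angle brackets.

```shell
# Hypothetical helper: print lines (with line numbers) that still contain
# <PLACEHOLDER>-style tokens. Returns 0 if any were found, 1 if none.
find_placeholders() {
  grep -n -E '<[A-Za-z_]+>' "$1"
}

# Usage:
#   if find_placeholders ad.json; then
#     echo "Replace the placeholders listed above before starting Dremio." >&2
#   fi
#
# If python3 is available, the file can also be checked for well-formed JSON:
#   python3 -m json.tool ad.json > /dev/null && echo "ad.json parses as JSON"
```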

2.4 Verifying Dremio Logins

  1. Start the Dremio service:
    bin/dremio start
  2. Open your browser and navigate to http://<DREMIO_IP_OR_HOSTNAME>:8080.
  3. Log in as the admin with the Username admin and Password <password>-ldap.
  4. Log in as one or more of the user accounts (user00 through user99), for example with the Username user00 and Password <password>.

The admin user has universal privileges in Dremio, whereas user accounts have only basic access.
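
Logins can also be spot-checked from the command line. The sketch below assumes Dremio's REST login endpoint (/apiv2/login, which accepts a JSON body with userName and password fields) is reachable at the same base URL as the web UI, and that curl is installed. The helper names login_payload and dremio_login_status are hypothetical, and the payload builder performs no JSON escaping.

```shell
# Hypothetical helpers; assume curl is installed.

# Build the JSON body for a login request. Assumes the values contain no
# double quotes or backslashes (no JSON escaping is performed).
login_payload() {
  printf '{"userName":"%s","password":"%s"}\n' "$1" "$2"
}

# Attempt a login and print only the HTTP status code.
# $1 = Dremio base URL, $2 = username, $3 = password
dremio_login_status() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X POST "$1/apiv2/login" \
    -H 'Content-Type: application/json' \
    -d "$(login_payload "$2" "$3")"
}

# Usage:
#   dremio_login_status 'http://<DREMIO_IP_OR_HOSTNAME>:8080' user00 '<password>'
```

A 200 status would suggest that the LDAP credentials were accepted; any other status is a reason to review dremio.conf and ad.json.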

3.0 Connecting the IdP to AWS with SAML

  1. Download the descriptor.xml metadata file from http://<KEYCLOAK_IP_OR_HOSTNAME>:8080/auth/realms/dremio/protocol/saml/descriptor (or from your existing IdP).
  2. Log in to the AWS Console.
  3. Open the IAM service.
  4. Click Identity Providers.
  5. Click Add provider.
  6. Use the default SAML type.
  7. Give the provider a name. Remember this value; it is used later in place of <PROVIDER_NAME_IN_AWS>.
  8. Upload the descriptor.xml file.
  9. Click Add provider.
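
The console steps above can also be sketched with the AWS CLI. This assumes the AWS CLI is installed and configured with credentials that are allowed to create IAM identity providers, and that Keycloak serves the dremio realm on port 8080. The helper names saml_descriptor_url and register_saml_provider are hypothetical.

```shell
# Hypothetical helpers; assume curl and a configured AWS CLI.

# Print the SAML descriptor URL for the dremio realm on a Keycloak host.
saml_descriptor_url() {
  printf 'http://%s:8080/auth/realms/dremio/protocol/saml/descriptor\n' "$1"
}

# Download the descriptor and register it as a SAML identity provider in IAM.
# $1 = Keycloak host, $2 = provider name (used later as <PROVIDER_NAME_IN_AWS>)
register_saml_provider() {
  # -f: fail on HTTP errors, -sS: silent but still show errors
  curl -fsS -o descriptor.xml "$(saml_descriptor_url "$1")" || return 1
  aws iam create-saml-provider \
    --saml-metadata-document file://descriptor.xml \
    --name "$2"
}

# Usage:
#   register_saml_provider '<KEYCLOAK_IP_OR_HOSTNAME>' '<PROVIDER_NAME_IN_AWS>'
```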

4.0 Setting Permissions in Lake Formation

4.1 Creating a Table

If you don't already have tables set up in AWS Glue or Lake Formation, create one or more as follows:

  1. In the AWS Console, open the Lake Formation service.
  2. Click Tables.
  3. Click Create table.
  4. Fill in settings as desired. If needed, create a database and S3 bucket.

4.2 Adding Permissions

  1. In the AWS Console, open the Lake Formation service.
  2. Click Data permissions.
  3. Click Grant.
  4. Apply the following settings:
    • Principals: SAML users and groups
    • SAML and Amazon QuickSight users and groups: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:user/user00 OR arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:group/group0 (Note: userX0 through userX9 are members of groupX for X = [0,9])
    • LF-Tags or catalog resources: Named data catalog resources
    • Databases: <DATABASE_NAME>
    • Tables: <TABLE_NAME>
    • Table and column permissions: Select and/or Super (all)
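
The principal ARNs follow a fixed pattern, so they can be generated rather than typed by hand. The sketch below builds the user and group ARNs used in this guide and derives a demo user's group from its name (userXY belongs to groupX). It also wraps the AWS CLI grant-permissions call as an alternative to the console, assuming the AWS CLI is configured with a Lake Formation administrator identity. All function names are hypothetical.

```shell
# Hypothetical helpers; grant_select assumes a configured AWS CLI.

# Print the group a demo user belongs to: userXY is a member of groupX.
group_for_user() {
  digits=${1#user}                   # strip the "user" prefix, e.g. user37 -> 37
  printf 'group%s\n' "${digits%?}"   # drop the last digit,     e.g. 37 -> 3
}

# Print the Lake Formation principal ARN for a SAML user or group.
# $1 = AWS account ID, $2 = provider name, $3 = "user" or "group", $4 = name
saml_principal_arn() {
  printf 'arn:aws:iam::%s:saml-provider/%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

# Grant SELECT on one table to a principal.
# $1 = principal ARN, $2 = database name, $3 = table name
grant_select() {
  aws lakeformation grant-permissions \
    --principal "DataLakePrincipalIdentifier=$1" \
    --permissions SELECT \
    --resource "{\"Table\":{\"DatabaseName\":\"$2\",\"Name\":\"$3\"}}"
}

# Usage:
#   arn=$(saml_principal_arn '<AWS_ACCOUNT_ID>' '<PROVIDER_NAME_IN_AWS>' group "$(group_for_user user37)")
#   grant_select "$arn" '<DATABASE_NAME>' '<TABLE_NAME>'
```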

5.0 Connecting Dremio to the Source

  1. Open your browser and navigate to Dremio: http://<DREMIO_IP_OR_HOSTNAME>:8080.
  2. Click the + button next to Data Lakes.
  3. Click Amazon Glue Catalog.
  4. Fill out the General tab, including Name and Authentication.
  5. Select the Advanced Options tab and complete the following:
    1. Enable Enforce AWS Lake Formation access permissions on datasets.
    2. Fill in the user and group prefix settings per Lake Formation Permissions Reference. For this demo, use SAML:
      • User prefix: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:user/
      • Group prefix: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:group/
  6. Under the Privileges tab, you may optionally enable Select privileges for All Users to allow other users (not just the admin account) to access the AWS Glue source.
  7. Click Save.

6.0 Testing in Dremio

  1. Open your browser and navigate to Dremio: http://<DREMIO_IP_OR_HOSTNAME>:8080.
  2. Log in as admin or one of the user accounts (user00 through user99).
    note

    userX0 through userX9 are members of groupX for X = [0,9]

  3. Click the AWS Glue source that you added previously.
  4. Explore the available tables and confirm that Lake Formation permissions are enforced when tables are accessed or queried.
  5. Log in as other users to test permissions.

7.0 Wrapping Up

After you have completed your tests of Dremio's functionality with Lake Formation, shut down or delete the VM that is running your IdP service to avoid additional uptime charges.