Lake Formation Demo
This page is intended for users who wish to test Lake Formation functionality but have not configured the requirements for the Lake Formation integration.
The following steps outline the process of configuring Dremio to work with an identity provider (IdP), setting up permissions in Lake Formation, and connecting to an AWS Glue source.
- Bootstrapping a Basic IdP
- Configuring Dremio's LDAP Connection
- Connecting the IdP to AWS with SAML
- Setting Permissions in Lake Formation
- Connecting Dremio to the Source
- Testing in Dremio
- Wrapping Up
This guide uses placeholders such as <password> that must be replaced with values that are unique to your organization. Make sure to choose secure passwords.
1.0 Bootstrapping a Basic IdP
This section is intended for users who are not already using an IdP service that supports LDAP and SAML, such as Microsoft Entra ID. If you already have an IdP service in place, proceed to Configuring Dremio's LDAP Connection.
1.1 Creating a Virtual Machine (VM) for the IdP
Both Dremio and AWS need to communicate with the IdP server, so we recommend performing these steps on an externally accessible machine. A small cloud VM is a good option, such as an e2-medium (2 vCPU, 4 GB RAM) Compute Engine instance from Google Cloud Platform (GCP).
Clone the Dremio Lake Formation Demo repository onto the VM.
1.2 Starting Docker Images
From the root of the cloned repository, start the Docker containers using the provided docker-compose.yaml file:
docker compose up -d
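Before moving on, it can help to confirm that the containers are accepting connections. The following is a minimal sketch, not part of the demo repository: it polls a TCP port until a connection succeeds. The ports 8080 (Keycloak) and 1389 (OpenLDAP) are taken from the URLs used later in this guide; whether they are published on the VM depends on docker-compose.yaml, which is an assumption here. The function name wait_for_port is hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows the port is open,
            # not that the service behind it has finished starting.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(1.0)  # not reachable yet; retry until the deadline
    return False


# Example usage (replace the placeholder host with your VM's address):
# wait_for_port("<KEYCLOAK_IP_OR_HOSTNAME>", 8080)   # Keycloak
# wait_for_port("<LDAP_IP_OR_HOSTNAME>", 1389)       # OpenLDAP
```

An open port is only a coarse signal; the Keycloak login page may still take some time to load after the port starts accepting connections.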
1.3 Bootstrapping OpenLDAP
Open a shell in the OpenLDAP container and run the bootstrap script:
docker exec -it dremio-lake-formation-demo_openldap_1 bash
# Inside the container:
cd /bootstrap
./bootstrap.sh
exit
1.4 Synchronizing Keycloak Users with OpenLDAP
- Open your browser and enter the following URL for Keycloak: http://<KEYCLOAK_IP_OR_HOSTNAME>:8080/auth.
- Click Administration Console.
- Log in using the Username admin and Password <password>-keycloak.
- In the Realm dropdown menu in the top left of the sidebar, select Master.
- Hover over the Realm dropdown menu again and click Add realm.
- Name the realm dremio.
- Click Create.
- Click User Federation in the sidebar.
- In the Add provider... dropdown menu, select ldap.
- Apply the following settings:
  - Vendor: Other
  - Username LDAP attribute: cn
  - RDN LDAP attribute: cn
  - UUID LDAP attribute: uid
  - User Object Classes: inetOrgPerson, organizationalPerson
  - Connection URL: ldap://openldap:1389
  - Users DN: ou=users,dc=example,dc=org
  - Bind DN: cn=admin,dc=example,dc=org
  - Bind Credential: <password>-ldap
- Click Test connection and Test authentication to validate the settings.
- Click Save.
- Click Synchronize all users.
- Set periodic Sync Settings, if desired.
1.5 Synchronizing Keycloak Groups with OpenLDAP
- From the LDAP User Federation page, click the Mappers tab.
- Click Create.
- Apply the following settings:
  - Name: group-ldap-mapper
  - Mapper Type: group-ldap-mapper
  - LDAP Groups DN: ou=groups,dc=example,dc=org
  - Group Name LDAP Attribute: cn
  - Group Object Classes: groupOfNames
- Click Save.
- Click Sync LDAP Groups to Keycloak.
2.0 Configuring Dremio's LDAP Connection
2.1 Stopping Dremio
Use the following command to stop the Dremio service:
bin/dremio stop
2.2 Editing dremio.conf
Add the following settings to dremio.conf:
services.coordinator.web.auth.type: "ldap"
services.coordinator.web.auth.config: "ad.json"
The services.coordinator.web.auth.config configuration property replaces services.coordinator.web.auth.ldap_config, which is deprecated.
2.3 Creating the ad.json File
Create a file named ad.json and put it in the same directory as dremio.conf.
In Dremio 24+, bindPassword can be encrypted using the dremio-admin encrypt CLI command.
Copy and paste the following into the ad.json file:
{
  "connectionMode": "PLAIN",
  "servers": [
    {
      "hostname": "<LDAP_IP_OR_HOSTNAME>",
      "port": 1389
    }
  ],
  "names": {
    "baseDN": "dc=example,dc=org",
    "bindDN": "cn=admin,dc=example,dc=org",
    "bindPassword": "changeme-ldap",
    "userFilter": "(&(objectClass=inetOrgPerson))",
    "userAttributes": {
      "baseDNs": [
        "ou=users,dc=example,dc=org"
      ],
      "searchScope": "SUB_TREE",
      "firstname": "cn",
      "id": "cn",
      "lastname": "sn",
      "email": "cn"
    },
    "userGroupRelationship": "GROUP_ENTRY_LISTS_USERS",
    "groupEntryListsUsers": {
      "userEntryUserIdAttribute": "dn",
      "groupEntryUserIdAttribute": "member"
    },
    "groupDNs": [
      "CN={0},ou=groups,dc=example,dc=org"
    ],
    "groupFilter": "(objectClass=groupOfNames)",
    "autoAdminFirstUser": true
  }
}
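A syntax error or a forgotten placeholder in ad.json can prevent logins after Dremio restarts, so a quick check before starting the service may save time. The sketch below is an illustration only, not a Dremio tool: it parses the file contents as JSON and reports any string value that still contains a <PLACEHOLDER>-style token. The helper names find_placeholders and check_ad_json are hypothetical, and the check does not validate the file against Dremio's actual schema.

```python
import json
import re

# Matches tokens such as <LDAP_IP_OR_HOSTNAME> or <password>.
PLACEHOLDER = re.compile(r"<[A-Za-z_]+>")


def find_placeholders(value, path=""):
    """Yield the paths of string values that still contain a placeholder."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_placeholders(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_placeholders(child, f"{path}[{index}]")
    elif isinstance(value, str) and PLACEHOLDER.search(value):
        yield path


def check_ad_json(text: str) -> list:
    """Return a list of problems; an empty list means these basic checks passed."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    return [f"unreplaced placeholder at {p}" for p in find_placeholders(config)]


# Example usage (the path is an assumption; use the directory that holds dremio.conf):
# with open("conf/ad.json", encoding="utf-8") as handle:
#     print(check_ad_json(handle.read()))
```

An empty result only means the file is well-formed JSON with no leftover placeholders; the Test connection and login steps remain the real verification.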
2.4 Verifying Dremio Logins
- Start the Dremio service:
  bin/dremio start
- Open your browser and navigate to http://<DREMIO_IP_OR_HOSTNAME>:8080.
- Log in as the admin with the Username admin and Password <password>-ldap.
- Log in as one or more users (user00 through user99), for example with the Username user00 and Password <password>.
The admin user has universal privileges in Dremio, whereas user accounts have only basic access.
3.0 Connecting the IdP to AWS with SAML
- Download the descriptor.xml metadata file from http://<KEYCLOAK_IP_OR_HOSTNAME>:8080/auth/realms/dremio/protocol/saml/descriptor (or from your existing IdP).
- Log in to the AWS Console.
- Open the IAM service.
- Click Identity Providers.
- Click Add provider.
- Use the default SAML type.
- Give the provider a name. Remember this value; it is used later in place of <PROVIDER_NAME_IN_AWS>.
- Upload the descriptor.xml file.
- Click Add provider.
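If the upload is rejected, it can be useful to confirm that the downloaded file really is SAML metadata rather than, for example, an HTML error page. The following is a minimal sketch under stated assumptions: the URL pattern is the one shown in the steps above, the namespace is the standard SAML 2.0 metadata namespace, and the helper names descriptor_url and entity_ids are hypothetical. It does not reproduce the validation that AWS performs.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata namespace.
SAML_MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def descriptor_url(keycloak_host: str, realm: str = "dremio", port: int = 8080) -> str:
    """Build the metadata URL used in this guide for the given Keycloak host."""
    return f"http://{keycloak_host}:{port}/auth/realms/{realm}/protocol/saml/descriptor"


def entity_ids(xml_text: str) -> list:
    """Return the entityID of every EntityDescriptor element in the metadata."""
    root = ET.fromstring(xml_text)  # raises ParseError if the text is not XML
    tag = f"{{{SAML_MD_NS}}}EntityDescriptor"
    # iter() also visits the root, so this works whether or not the
    # descriptor is wrapped in another element.
    return [element.get("entityID") for element in root.iter(tag)]


# Example usage, assuming descriptor.xml was saved in the current directory:
# with open("descriptor.xml", encoding="utf-8") as handle:
#     print(entity_ids(handle.read()))
```

An empty list suggests the file does not contain an identity provider description and is worth downloading again.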
4.0 Setting Permissions in Lake Formation
4.1 Creating a Table
If you don't already have tables set up in AWS Glue or Lake Formation, you may create one or more:
- In the AWS Console, open the Lake Formation service.
- Click Tables.
- Click Create table.
- Fill in settings as desired. If needed, create a database and S3 bucket.
4.2 Adding Permissions
- In the AWS Console, open the Lake Formation service.
- Click Data permissions.
- Click Grant.
- Apply the following settings:
  - Principals: SAML users and groups
  - SAML and Amazon QuickSight users and groups: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:user/user00 OR arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:group/group0 (Note: userX0 through userX9 are members of groupX for X = [0,9])
  - LF-Tags or catalog resources: Named data catalog resources
  - Databases: <DATABASE_NAME>
  - Tables: <TABLE_NAME>
  - Table and column permissions: Select and/or Super (all)
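The principal strings in the settings above follow a fixed pattern, and a typo in the account ID or provider name is easy to make when granting to several users or groups. As a small illustration, the sketch below assembles the strings from their parts. The function name saml_principal and the sample account ID and provider name are hypothetical; only the ARN pattern itself comes from this guide.

```python
def saml_principal(account_id: str, provider_name: str, kind: str, name: str) -> str:
    """Build a SAML user or group principal in the format used in this guide."""
    if kind not in ("user", "group"):
        raise ValueError("kind must be 'user' or 'group'")
    return f"arn:aws:iam::{account_id}:saml-provider/{provider_name}:{kind}/{name}"


# Example with made-up values; substitute your own account ID and provider name.
for principal in (
    saml_principal("123456789012", "keycloak-demo", "user", "user00"),
    saml_principal("123456789012", "keycloak-demo", "group", "group0"),
):
    print(principal)
```

The printed values can be pasted into the SAML and Amazon QuickSight users and groups field one at a time.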
5.0 Connecting Dremio to the Source
- Open your browser and navigate to Dremio: http://<DREMIO_IP_OR_HOSTNAME>:8080.
- Click the + button next to Data Lakes.
- Click Amazon Glue Catalog.
- Fill out the General tab, including Name and Authentication.
- Select the Advanced Options tab and complete the following:
  - Enable Enforce AWS Lake Formation access permissions on datasets.
  - Fill in the user and group prefix settings per Lake Formation Permissions Reference. For this demo, use SAML:
    - User prefix: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:user/
    - Group prefix: arn:aws:iam::<AWS_ACCOUNT_ID>:saml-provider/<PROVIDER_NAME_IN_AWS>:group/
- Under the Privileges tab, you may optionally enable Select privileges for All Users to allow other users (not just the admin account) to access the AWS Glue source.
- Click Save.
6.0 Testing in Dremio
- Open your browser and navigate to Dremio: http://<DREMIO_IP_OR_HOSTNAME>:8080.
- Log in as admin or one of the user accounts (user00 through user99). Note that userX0 through userX9 are members of groupX for X = [0,9].
- Click the AWS Glue source that you added previously.
- Explore the available tables, noting that Lake Formation permissions are enforced when tables are accessed or queried.
- Log in as other users to test permissions.
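When testing with several accounts, it helps to write down in advance which users should be able to query a table. The sketch below is a simplified model for planning tests, not a description of how Lake Formation or Dremio evaluates permissions: it assumes a demo user can read a table if the grant names that user directly or names the user's group, using the userX0 through userX9 to groupX mapping from this guide. The function names group_for_user and expected_access are hypothetical, and the admin account is not covered.

```python
import re

# Demo usernames are "user" followed by exactly two ASCII digits.
DEMO_USER = re.compile(r"user([0-9])[0-9]")


def group_for_user(username: str) -> str:
    """Map a demo username to its group, e.g. user37 belongs to group3."""
    match = DEMO_USER.fullmatch(username)
    if match is None:
        raise ValueError(f"not a demo username: {username!r}")
    return f"group{match.group(1)}"


def expected_access(username: str, granted_users: set, granted_groups: set) -> bool:
    """Simplified expectation: access via a direct grant or via the user's group."""
    return username in granted_users or group_for_user(username) in granted_groups


# Example: the table was granted to user00 and to group0.
for candidate in ("user00", "user05", "user15"):
    print(candidate, expected_access(candidate, {"user00"}, {"group0"}))
```

If what you observe in Dremio differs from the expectation, the grants in Lake Formation and the prefix settings on the source are the first things to recheck.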
7.0 Wrapping Up
After you have completed your tests of Dremio's functionality with Lake Formation, shut down or delete the VM that is running your IdP service to avoid additional uptime charges.