Configuring Your Values to Deploy Dremio to Kubernetes
Helm is a standard for managing Kubernetes applications, and a Helm chart defines how an application is deployed to Kubernetes. Dremio's Helm chart contains the default deployment configuration, which is specified in the values.yaml file.
Dremio recommends configuring your deployment values in a separate .yaml file: keeping your configuration in its own file makes it simpler to update to the latest version of the Helm chart, because you only need to copy that one file across Helm chart updates.
Configuring Your Values
To configure your deployment values, do the following:
- Download the values-overrides.yaml file and save it locally.
The values-overrides.yaml configuration file:
# A Dremio License is required
dremio:
  license: "<your license key>"
  tag: 26.0.0
# To pull images from Dremio's Quay, you must create an image pull secret. For more info, see:
# https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
# All of the images are pulled using this same secret.
imagePullSecrets:
  - <your-pull-secret-name>
coordinator:
  auth:
    type: "internal"
  client:
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  flight:
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  web:
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  volumeSize: 512Gi
  resources:
    limits:
      memory: 64Gi
    requests:
      cpu: 16
      memory: 60Gi
# Where Dremio stores metadata, reflections, and uploaded files.
# For more information, see https://docs.dremio.com/current/what-is-dremio/architecture#distributed-storage
distStorage:
  # The supported distributed storage types are: aws, gcp, or azureStorage. For S3-compatible storage, use aws.
  type: <your-distributed-storage-type> # Add your distributed storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-distributed-storage
catalog:
  externalAccess:
    enabled: true
    tls:
      enabled: false
      secret: "<your-catalog-tls-secret-name>"
  # This is where Iceberg tables created in your catalog will reside.
  storage:
    # The supported catalog storage types are: S3 or azure. For S3-compatible storage, use S3.
    type: <your-catalog-storage-type> # Add your catalog storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-catalog-storage
service:
  type: LoadBalancer
- Edit the values-overrides.yaml file to configure your values. See the following sections for details on each configuration option: License, Pull Secret, Coordinator, Coordinator's Distributed Storage, and Dremio Catalog.
- Save the values-overrides.yaml file.
Once done with the configuration, deploy Dremio to Kubernetes. See how in Deploying Dremio to Kubernetes.
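For orientation, a typical deployment command that applies the overrides file on top of the chart defaults might look like the following sketch (the release name and namespace are placeholders; see Deploying Dremio to Kubernetes for the authoritative steps):

```bash
# Install (or upgrade) Dremio from Dremio's Helm chart, layering values-overrides.yaml
# on top of the chart's default values.yaml.
helm upgrade --install dremio oci://quay.io/dremio/dremio-helm \
  --namespace <your-namespace> \
  -f values-overrides.yaml
```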
License
Provide your license key. To obtain a license, see Licensing.
Perform this configuration in this section of the file:
dremio:
  license: ...
Pull Secret
Provide the secret used to pull the images from Quay.io. To create the Kubernetes secret, use this example:
Properties for Kubernetes secret:
kubectl create secret docker-registry dremio-docker-secret \
  --docker-username=your_username \
  --docker-password=your_password_for_username \
  --docker-email=DOCKER_EMAIL
For more information, see Create a Secret by providing credentials on the command line (the Docker registry is quay.io). All of the images are pulled using this same secret.
Pods can only reference image pull secrets in their own namespace, so the secret must be created in the namespace where Dremio is being deployed.
Perform this configuration in this section of the file:
imagePullSecrets:
  - ...
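To confirm the pull secret exists in the namespace where Dremio will be deployed, you can list it with kubectl (the secret name below matches the example above; the namespace is a placeholder):

```bash
# Verify the image pull secret is present in the target namespace.
kubectl get secret dremio-docker-secret --namespace <your-namespace>
```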
Coordinator
- Configure the volume size, resource limits, and resource requests. To configure these values, see the section Recommended Resources Configuration.
Perform this configuration in this section of the file:
coordinator:
  resources:
    requests:
      cpu: ...
      memory: ...
  volumeSize: ...
- (Optional) Configure authentication via an identity provider (including LDAP, Microsoft Entra ID, or generic OpenID providers). This requires an additional config file provided during Dremio's deployment. See our Identity Provider documentation for instructions on how to create the required config file for your auth type. Possible types include: azuread, ldap, oauth, or oauth+ldap.
Perform this configuration in this section of the file:
coordinator:
  auth:
    type: ...
- (Optional) Enable TLS (set enabled: true) and provide the TLS secret. See the section Creating a TLS Secret.
Perform this configuration in this section of the file:
coordinator:
  client:
    tls:
      enabled: ...
      secret: ...
  flight:
    tls:
      enabled: ...
      secret: ...
  web:
    tls:
      enabled: ...
      secret: ...
Note: If Web TLS is enabled, see the section Configuring Dremio Catalog when Coordinator Web is Using TLS.
Coordinator's Distributed Storage
This is where Dremio stores metadata, reflections, and uploaded files. To configure these values, see the section Configuring the Distributed Storage.
Perform this configuration in this section of the file:
distStorage:
  type: ...
Dremio Catalog
- Configuring storage for Dremio Catalog is mandatory since this is the location where Iceberg tables created in the Catalog will be written. For configuring the storage, see the section Configuring Storage for Dremio Catalog.
Perform this configuration in this section of the file:
catalog:
  externalAccess:
    enabled: ...
- (Optional) Use TLS for external access: clients connecting to Dremio Catalog from outside the namespace will be required to use TLS. To configure it, see the section Configuring TLS for Dremio Catalog External Access.
Perform this configuration in this section of the file:
catalog:
  externalAccess:
    enabled: ...
    tls:
      enabled: ...
      secret: ...
- (Optional) If Dremio coordinator Web access is using TLS, additional configuration is necessary. To configure it, see the section Configuring Dremio Catalog When Coordinator Web Is Using TLS.
Perform this configuration in this section of the file:
catalog:
  externalAccess:
    enabled: ...
    authentication:
      authServerHostname: ...
Configuring Your Values - Advanced
Dremio Platform Images
The Dremio platform requires 17 images when running fully featured. All images are published by Dremio to our Quay and are listed below. If you want to use a private mirror of our repository, add the snippet below to values-overrides.yaml to repoint to your own.
Note: If creating a private mirror, use the same repository names and tags from Dremio's Quay.io. This is important for supportability.
dremio:
  image:
    repository: .../dremio-ee
    tag: <The image tag from Quay.io>
busyBox:
  image:
    repository: .../busybox
    tag: <The image tag from Quay.io>
k8s:
  image:
    repository: .../alpine/k8s
    tag: <The image tag from Quay.io>
engine:
  operator:
    image:
      repository: .../dremio-engine-operator
      tag: <The image tag from Quay.io>
zookeeper:
  image:
    repository: .../zookeeper
    tag: <The image tag from Quay.io>
opensearch:
  image:
    repository: .../dremio-search-opensearch
    tag: <The image tag from Quay.io> # The tag version must be a valid OpenSearch version as listed here: https://opensearch.org/docs/latest/version-history/
  preInstallJob:
    image:
      repository: .../dremio-search-init
      tag: <The image tag from Quay.io>
opensearchOperator:
  manager:
    image:
      repository: .../dremio-opensearch-operator
      tag: <The image tag from Quay.io>
  kubeRbacProxy:
    image:
      repository: .../kube-rbac-proxy
      tag: <The image tag from Quay.io>
mongodbOperator:
  image:
    repository: .../dremio-mongodb-operator
    tag: <The image tag from Quay.io>
mongodb:
  image:
    repository: .../percona-server-mongodb
    tag: <The image tag from Quay.io>
  metrics:
    image:
      repository: .../mongodb_exporter
      tag: <The image tag from Quay.io>
catalogservices:
  image:
    repository: .../dremio-ee-catalog-services-server
    tag: <The image tag from Quay.io>
catalog:
  image:
    repository: .../dremio-catalog-server
    tag: <The image tag from Quay.io>
  externalAccess:
    image:
      repository: .../dremio-catalog-server-external
      tag: <The image tag from Quay.io>
nats:
  container:
    image:
      repository: .../nats
      tag: <The image tag from Quay.io>
telemetry:
  image:
    repository: .../opentelemetry-collector-contrib
    tag: <The image tag from Quay.io>
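As an illustration of creating a private mirror, an image can be copied from Dremio's Quay with standard tooling while keeping the same repository name and tag. The registry hostname and the exact Quay path below are assumptions, so substitute the values for your environment:

```bash
# Pull the image from Dremio's Quay, retag it for the private registry, and push it,
# keeping the repository name and tag unchanged for supportability.
docker pull quay.io/dremio/dremio-ee:26.0.0
docker tag quay.io/dremio/dremio-ee:26.0.0 registry.example.com/dremio/dremio-ee:26.0.0
docker push registry.example.com/dremio/dremio-ee:26.0.0
```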
Scale-out Coordinators
Dremio can scale to support high concurrency use cases through scaling coordinators. Multiple stateless coordinators rely on the primary coordinator to manage Dremio's state, enabling Dremio to support many more concurrent users. These scale-out coordinators are intended for high query throughput and are not applicable for standby or disaster recovery. While scale-out coordinators generally reduce the load on the primary coordinator, the primary coordinator's vCPU request should be increased for every two scale-outs added to avoid negatively impacting performance.
Perform this configuration in this section of the file, where count refers to the number of scale-outs. A count of 0 will provision only the primary coordinator:
coordinator:
  count: ...
When using scale-out coordinators, the load balancer session affinity should be enhanced. See: Advanced Load Balancer Configuration.
Configuring Kubernetes Pod Metadata (Including Node Selector)
It's also possible to add metadata to each of the statefulsets (coordinators, classic engines, ZooKeeper, etc.). This includes configuring a node selector, which configures pods to use specific node pools (AKS node pools and EKS node groups). The following metadata can be added:
annotations: {}
podAnnotations: {}
labels: {}
podLabels: {}
nodeSelector: {}
tolerations: []
Example of a coordinator node selector, where the node pool is named coordinatorpool:
coordinator:
  nodeSelector:
    agentpool: coordinatorpool
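To confirm that the selector will match the intended nodes, you can list the nodes with the relevant label (this assumes the agentpool label from the example above):

```bash
# Show each node together with its agentpool label so the selector can be verified.
kubectl get nodes -L agentpool
```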
Advanced Load Balancer Configuration
Dremio will create a public load balancer by default, and the Dremio Client service will provide an external IP to connect to Dremio. For more information, see Connecting to the Dremio Console.
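Once Dremio is deployed, the external IP assigned by the load balancer can be checked with kubectl; the exact service name depends on your deployment, so listing all services in the namespace is the simplest approach:

```bash
# List the services in the Dremio namespace and their external IPs.
kubectl get services --namespace <your-namespace>
```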
- Private Cluster - For private Kubernetes clusters (no public endpoint), set internalLoadBalancer: true.
Perform this configuration in this section of the file:
service:
  type: ...
  internalLoadBalancer: ...
- Static IP - To define a static IP for your load balancer, set loadBalancerIP: <your-static-IP>. If unset, an available IP will be assigned upon creation of the load balancer.
Perform this configuration in this section of the file:
service:
  type: ...
  loadBalancerIP: ...
Note: This can be helpful if DNS is configured to expect Dremio to have a specific IP.
- Session Affinity - If leveraging Scale-out Coordinators, set sessionAffinity: true.
Perform this configuration in this section of the file:
service:
  type: ...
  sessionAffinity: ...
Advanced TLS Configuration for OpenSearch
Dremio generates TLS certificates for OpenSearch by default, and they are rotated monthly. However, if you want to use your own, you need to create two secrets containing the relevant certificates. The format of these secrets differs from the other TLS secrets shown on this page: the tls.crt, tls.key, and ca.crt files must be in PEM format. Use the example below as reference to create your secrets:
kubectl create secret generic opensearch-tls-certs \
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt
kubectl create secret generic opensearch-tls-certs-admin \
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt
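If you do not already have tls.crt, tls.key, and ca.crt files to reference in the commands above, a self-signed set suitable only for testing can be generated with openssl; the subject names below are placeholders:

```bash
# Create a CA certificate, then a server key and certificate signed by that CA (testing only).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout ca.key -out ca.crt -subj "/CN=opensearch-ca"
openssl req -newkey rsa:2048 -nodes \
  -keyout tls.key -out tls.csr -subj "/CN=opensearch"
openssl x509 -req -in tls.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out tls.crt -days 365
```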
Add the snippet below to the values-overrides.yaml file before deploying Dremio. Note that the second secret (opensearch-tls-certs-admin) is not referenced in the configuration but is still required for OpenSearch to work.
opensearch:
  tlsCertsSecretName: opensearch-tls-certs
  disableTlsCertGeneration: true
Advanced Configuration of Engines
Dremio's default resource offset is reserve-2-8, where the first value represents 2 vCPUs and the second represents 8 GB of RAM. If you need to change this default for your created engines, add the following snippet to values-overrides.yaml and set defaultOffset to one of the configurable offsets listed below, which are available out of the box:
- reserve-0-0
- reserve-2-4
- reserve-2-8
- reserve-2-16
The listed values are keys and must be provided in this exact format in the snippet below.
engine:
  options:
    resourceAllocationOffsets:
      defaultOffset: <key from list above>
Configuration of Classic Engines
- You should only use classic engines if the new engines (as of Dremio 26.0) are not appropriate for your use case.
- Classic engines will not auto-start/auto-stop, which is only possible with the new engines.
The classic way of configuring engines is still supported, and you can add this snippet to values-overrides.yaml as part of the deployment. Note that this snippet is a configuration example, and you should adjust the values to your own case.
executor:
  resources:
    requests:
      cpu: "16"
      memory: "120Gi"
    limits:
      memory: "120Gi"
  engines: ["default"]
  count: 3
  volumeSize: 128Gi
  cloudCache:
    enabled: true
    volumes:
      - size: 128Gi
References
Recommended Resources Configuration
The table in this section contains the recommended values for resource requests and volume size to configure Dremio components. In the values-overrides.yaml file, set the following values:
resources:
  requests:
    memory: # Put here the first value in the table column.
    cpu: # Put here the second value in the table column.
volumeSize: # Put here the third value in the table column, if any.
Dremio recommends using the Basic Configuration values for evaluation or testing purposes and adjusting them as you go towards the values in Production Configuration, which are the values Dremio recommends to operate in a production environment.
| Dremio Component | Basic Configuration (memory, CPU, volume size) | Production Configuration (memory, CPU, volume size) | Pod Count |
|---|---|---|---|
| Coordinator | 8Gi, 4, 50Gi | 64Gi, 32, 512Gi | 1 |
| Catalog Server | 8Gi, 4 | 8Gi, 4 | 1 |
| Catalog Server (External) | 8Gi, 4 | 8Gi, 4 | 1 |
| Catalog Service Server | 8Gi, 4 | 8Gi, 4 | 1 |
| Engine Operator | 1Gi, 1 | 1Gi, 1 | 1 |
| OpenSearch | 8Gi, 1500m, 10Gi | 16Gi, 2, 100Gi | 3 |
| MongoDB | 2Gi, 4, 50Gi | 4Gi, 8, 512Gi | 3 |
| NATS | 1Gi, 700m | 1Gi, 700m | 3 |
| ZooKeeper | 1Gi, 500m | 1Gi, 500m | 3 |
| Open Telemetry | 1Gi, 1 | 1Gi, 1 | 1 |
The following YAML shows the resource configuration snippets for the Dremio platform components:
coordinator:
  resources:
    requests:
      cpu: "32"
      memory: "64Gi"
    limits:
      memory: "64Gi"
  volumeSize: "512Gi"
zookeeper:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      memory: "1Gi"
  volumeSize: "10Gi"
catalog:
  requests:
    cpu: "4"
    memory: "8Gi"
  limits:
    cpu: "4"
    memory: "8Gi"
catalogservices:
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
mongodb:
  resources:
    requests:
      cpu: "2"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "2Gi"
  storage:
    resources:
      requests:
        storage: "512Gi"
opensearch:
  resources:
    requests:
      memory: "16Gi"
      cpu: "2"
    limits:
      memory: "16Gi"
      cpu: "2"
nats:
  resources:
    requests:
      cpu: "500m"
      memory: "1024Mi"
    limits:
      cpu: "750m"
      memory: "1536Mi"
telemetry:
  resources:
    requests:
      cpu: "1"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "2Gi"
Creating a TLS Secret
If you have enabled TLS in your values-overrides.yaml file, the corresponding secrets must be created before deploying Dremio. To create a secret, run the following command:
kubectl create secret tls <your-tls-secret-name> --key privkey.pem --cert cert.pem
For more information, see kubectl create secret tls.
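If you do not yet have a certificate and key, a self-signed pair suitable only for testing can be generated first; the subject CN below is a placeholder:

```bash
# Generate a self-signed certificate and private key valid for one year (testing only),
# producing the privkey.pem and cert.pem files used in the command above.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout privkey.pem -out cert.pem \
  -subj "/CN=dremio.example.com"
```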
TLS for OpenSearch requires a secret of a different makeup. See Advanced TLS Configuration for OpenSearch.
Configuring the Distributed Storage
Dremio’s distributed store uses scalable and fault-tolerant storage and it is configured as follows:
- In the values-overrides.yaml file, find the section with distStorage: and type:
distStorage:
  type: ...
- In type:, configure your storage provider with one of the following values:
  - "gcp" - For GCP Cloud Storage.
  - "aws" - For AWS S3 or S3-compatible storage.
  - "azureStorage" - For Azure Storage.
- Select the tab below for the storage provider you configured in step 2, copy the template, paste it below the line with type:, and configure your distributed storage values.
- Google Cloud Platform (GCP)
- AWS S3
- Azure Storage
# Google Cloud Storage
#
# bucketName: The name of the GCS bucket for distributed storage.
# path: The path, relative to the bucket, to create Dremio's directories.
# authentication: Valid types are: serviceAccountKeys or auto.
#   - When using "auto" authentication, Dremio uses Google Application Default Credentials to
#     authenticate. This is platform-dependent and may not be available in all Kubernetes clusters.
#   - Note: When using a GCS bucket on GKE, we recommend enabling Workload Identity and configuring
#     a Kubernetes Service Account for Dremio with an associated workload identity that
#     has access to the GCS bucket.
# credentials: If using serviceAccountKeys authentication, uncomment the credentials section below.
gcp:
  bucketName: "GCS Bucket Name"
  path: "/"
  authentication: "auto"
  # If using serviceAccountKeys, uncomment the section below, referencing the values from
  # the service account credentials JSON file that you generated:
  #
  #credentials:
  #  projectId: GCP Project ID that the Google Cloud Storage bucket belongs to.
  #  clientId: Client ID for the service account that has access to the Google Cloud Storage bucket.
  #  clientEmail: Email for the service account that has access to the Google Cloud Storage bucket.
  #  privateKeyId: Private key ID for the service account that has access to the Google Cloud Storage bucket.
  #  privateKey: |-
  #    -----BEGIN PRIVATE KEY-----\n Replace me with full private key value. \n-----END PRIVATE KEY-----\n
  # Extra Properties
  # Use the extra properties block to provide additional parameters to configure the distributed
  # storage in the generated core-site.xml file.
  #
  #extraProperties: |
  #  <property>
  #    <name></name>
  #    <value></value>
  #  </property>
# AWS S3
# For more details of S3 configuration, see https://docs.dremio.com/deployment/dist-store-config.html#amazon-s3
#
# bucketName: The name of the S3 bucket for distributed storage.
# path: The path, relative to the bucket, to create Dremio's directories.
# authentication: Valid types are: accessKeySecret, instanceMetadata, or awsProfile.
#   - Note: Instance metadata is only supported in AWS EKS and requires that the
#     EKS worker node IAM role is configured with sufficient access rights. At this time,
#     Dremio does not support using a Kubernetes service account-based IAM role.
# credentials: If using accessKeySecret authentication, uncomment the credentials section below.
aws:
  bucketName: "AWS Bucket Name"
  path: "/"
  authentication: "metadata"
  # If using accessKeySecret for authentication against S3, uncomment the lines below and use the values
  # to configure the appropriate credentials.
  #
  #credentials:
  #  accessKey: "AWS Access Key"
  #  secret: "AWS Secret"
  #
  # If using awsProfile for authentication against S3, uncomment the lines below and use the values
  # to choose the appropriate profile.
  #
  #credentials:
  #  awsProfileName: "default"
  #
  # Extra Properties
  # Use the extra properties block to provide additional parameters to configure the distributed
  # storage in the generated core-site.xml file.
  #
  #extraProperties: |
  #  <property>
  #    <name></name>
  #    <value></value>
  #  </property>
# Azure Storage Gen2
# For more details of Azure Storage Gen2 storage configuration, see
# https://docs.dremio.com/deployment/dist-store-config.html#azure-storage
#
# accountName: The name of the storage account.
# authentication: Valid types are: accessKey or entraID.
# filesystem: The name of the blob container to use within the storage account.
# path: The path, relative to the filesystem, to create Dremio's directories.
# credentials:
azureStorage:
  accountName: "Azure Storage Account Name"
  authentication: "accessKey"
  filesystem: "Azure Storage Account Blob Container"
  path: "/"
  credentials:
    # If using accessKey for authentication against Azure Storage, uncomment the lines below and use the values
    # to configure the appropriate credentials.
    #accessKey: "Azure Storage Account Access Key"
    # If using entraID for authentication against Azure Storage, uncomment the lines below and use the values
    # to configure the appropriate credentials.
    #clientId: "Azure Application Client ID"
    #tokenEndpoint: "Azure Entra ID Token Endpoint"
    #clientSecret: "Azure Application Client Secret"
  # Extra Properties
  # Use the extra properties block to provide additional parameters to configure the distributed
  # storage in the generated core-site.xml file.
  #
  #extraProperties: |
  #  <property>
  #    <name></name>
  #    <value></value>
  #  </property>
Configuring Storage for Dremio Catalog
To use Dremio Catalog, configure the storage settings based on your storage provider (such as Amazon S3 or Azure Storage). This configuration is required to enable support for vended credentials and to allow access to the table metadata necessary for Iceberg table operations.
- In the values-overrides.yaml file, find the section to configure your storage provider:
catalog:
  ...
  storage:
    location: ...
    type: ...
    ...
- To configure it, select the tab for your storage provider and follow the steps:
- Amazon S3
- S3-compatible
- Azure Storage
To use Dremio Catalog with Amazon S3, do the following:
- Create an IAM user or use an existing IAM user for Dremio Catalog.
- Create an IAM policy that grants access to your S3 location. For example:
Example of a policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::<my_bucket>/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::<my_bucket>",
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "*"
          ]
        }
      }
    }
  ]
}
- Create an IAM role to grant privileges to the S3 location (an equivalent AWS CLI sketch is shown after these steps):
  - In your AWS console, select Create Role.
  - Enter an externalId. For example, my_catalog_external_id.
  - Attach the policy created in the previous step and create the role.
- Create IAM user permissions to access the bucket via STS:
  - Select the IAM role created in the previous step.
  - Edit the trust policy and add the following:
Trust policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<dremio_catalog_user_arn>"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<dremio_catalog_external_id>"
        }
      }
    }
  ]
}
Replace the following values with the ones obtained in the previous steps:
- <dremio_catalog_user_arn> - The IAM user that was created in the first step.
- <dremio_catalog_external_id> - The external ID that was created in the third step.
Note: The sts:AssumeRole permission is required for Dremio Catalog to function with vended credentials, as it relies on the STS temporary token to perform these validations.
- Configure Dremio Catalog in the values-overrides.yaml file as follows:
catalog:
  ...
  storage:
    location: s3://<your_bucket>/<your_folder>
    type: S3
    s3:
      region: <bucket_region>
      roleArn: <dremio_catalog_iam_role> # The role that was created in step 3
      userArn: <dremio_catalog_user_arn> # The IAM user that was created in step 1
      externalId: <dremio_catalog_external_id> # The external ID that was created in step 3
      useAccessKeys: false # Set to true if you intend to use access keys. See the note below.
Note: If your role requires AWS secret keys to access the bucket and STS, you must create a Kubernetes secret named catalog-server-s3-storage-creds to access the configured location. Below is a simple example for Amazon S3 using an access key and a secret key:
export AWS_ACCESS_KEY_ID=<access-key>
export AWS_SECRET_ACCESS_KEY=<secret-key>
kubectl create secret generic catalog-server-s3-storage-creds \
  --namespace $NAMESPACE \
  --from-literal awsAccessKeyId=$AWS_ACCESS_KEY_ID \
  --from-literal awsSecretAccessKey=$AWS_SECRET_ACCESS_KEY
Prerequisites:
- The access keys must have permissions to access the bucket and the STS server.
- In the Dremio console, select Master Credentials when adding Dremio Catalog.
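As a reference for the Amazon S3 steps above, the IAM role and its policies can also be created with the AWS CLI; the role, policy, and file names below are illustrative, and the JSON documents are the ones from steps 2 and 4:

```bash
# Create the role using the trust policy from step 4, saved locally as trust-policy.json.
aws iam create-role \
  --role-name dremio-catalog-access-role \
  --assume-role-policy-document file://trust-policy.json

# Attach the S3 access policy from step 2, saved locally as catalog-s3-policy.json.
aws iam put-role-policy \
  --role-name dremio-catalog-access-role \
  --policy-name dremio-catalog-s3-access \
  --policy-document file://catalog-s3-policy.json
```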
To use Dremio Catalog with S3-compatible storage, do the following:
- Create a Kubernetes secret named catalog-server-s3-storage-creds to access the configured location. Here is an example for S3 using an access key and a secret key:
export AWS_ACCESS_KEY_ID=<username>
export AWS_SECRET_ACCESS_KEY=<password>
kubectl create secret generic catalog-server-s3-storage-creds \
  --namespace $NAMESPACE \
  --from-literal awsAccessKeyId=$AWS_ACCESS_KEY_ID \
  --from-literal awsSecretAccessKey=$AWS_SECRET_ACCESS_KEY
For S3-compatible storage providers (e.g., MinIO), the access keys should be the username and password.
- For this step, follow the instructions below depending on whether or not your S3-compatible storage has STS support:
Has STS support
Dremio Catalog uses STS as a mechanism to perform credential vending, so configure Dremio Catalog in the values-overrides.yaml file as follows:
catalog:
  ...
  storage:
    location: s3://<your_bucket>/<your_folder>
    type: S3
    s3:
      region: <bucket_region>
      roleArn: arn:aws:iam::000000000000:role/catalog-access-role # This value doesn't matter; it is a dummy role.
      endpoint: <s3-compatible-server-url> # The S3 server URL, for example for MinIO: http://<minio-host>:<minio-port>
      stsEndpoint: <s3-compatible-sts-server-url> # The STS server URL, for example for MinIO: http://<minio-host>:<minio-port>
      pathStyleAccess: true # Must be true
      useAccessKeys: true # Must be true
No STS support
Vended credentials will not work and, in such cases, you must use "Master Credentials" in Dremio and provide explicit access keys for external engines where they are required.
Once the Kubernetes secrets for the access keys have been created, configure Dremio Catalog in the values-overrides.yaml file as follows:
catalog:
  ...
  storage:
    location: s3://<your_bucket>/<your_folder>
    type: S3
    s3:
      region: <bucket_region>
      roleArn: arn:aws:iam::000000000000:role/catalog-access-role # This value doesn't matter; it is a dummy role.
      endpoint: <s3-compatible-server-url> # The S3 server URL, for example for MinIO: http://<minio-host>:<minio-port>
      pathStyleAccess: true # Must be true
      skipSts: true # Must be true
      useAccessKeys: true # Must be true
To use Dremio Catalog with Azure Storage, do the following:
- Register an application and create secrets:
  - Go to Azure Active Directory > App Registrations.
  - Register your app, and take note of the Client ID and Tenant ID. For more information on these steps, refer to Register an application with Microsoft Entra ID and create a service principal.
  - Go to Certificates & Secrets > New Client Secret.
  - Create a secret, and take note of the Secret Value.
  - Create a Kubernetes secret named catalog-server-azure-storage-creds using the following command:
export AZURE_CLIENT_ID=<Azure App client id>
export AZURE_CLIENT_SECRET=<App secret value>
kubectl create secret generic catalog-server-azure-storage-creds \
  --namespace $NAMESPACE \
  --from-literal azureClientId=$AZURE_CLIENT_ID \
  --from-literal azureClientSecret=$AZURE_CLIENT_SECRET
- Create an IAM role in your Storage Account and set up the permission for your new application to access the storage account by following these steps (an equivalent Azure CLI command is shown after these steps):
  - In the Azure console, go to your Storage Account and navigate to Access Control (IAM) > Role assignments > Add role assignment.
  - Select the Storage Blob Data Contributor role and click Next.
  - In the Members section, click Select members, search for your app registration from step 1, and click Select.
  - Review and assign the roles.
- Configure Dremio Catalog in the values-overrides.yaml file as follows:
catalog:
  ...
  storage:
    location: abfss://<container_name>@<storage_account>.dfs.core.windows.net/<path>
    type: azure
    azure:
      tenantId: <Your Azure directory (tenant) ID>
      multiTenantAppName: ~ # Optional: used only if you register a multi-tenant app.
      useClientSecrets: true # Must be true
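As a reference for step 2 above, the role assignment can also be performed with the Azure CLI; the IDs and names below are placeholders:

```bash
# Grant the registered application the Storage Blob Data Contributor role on the storage account.
az role assignment create \
  --assignee <app-client-id> \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account>"
```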
Configuring TLS for Dremio Catalog External Access
For clients connecting to Dremio Catalog from outside the namespace, TLS can be enabled for Dremio Catalog external access as follows:
- Enable external access with TLS and provide the TLS secret. See the section Creating a TLS Secret.
- In the values-overrides.yaml file, find the Dremio Catalog configuration section:
catalog:
  ...
- Configure TLS for Dremio Catalog as follows:
catalog:
  externalAccess:
    enabled: true
    tls:
      enabled: true
      secret: dremio-tls-secret-catalog
Configuring Dremio Catalog when Coordinator Web is Using TLS
When the Dremio coordinator is using TLS for Web access (i.e., when coordinator.web.tls.enabled is set to true), Dremio Catalog external access must be configured appropriately, or client authentication will fail. To do so, configure Dremio Catalog as follows:
- In the values-overrides.yaml file, find the Dremio Catalog configuration section:
catalog:
  ...
- Configure Dremio Catalog as follows:
catalog:
  externalAccess:
    enabled: true
    authentication:
      authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local
The authServerHostname must match the CN (or the SAN) field of the (master) coordinator Web TLS certificate. In case it does not match the CN or SAN fields of the TLS certificate, as a last resort, it is possible to disable hostname verification (disableHostnameVerification: true):
catalog:
  externalAccess:
    enabled: true
    authentication:
      authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local
      disableHostnameVerification: true
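To check which CN and SAN entries the coordinator's Web TLS certificate actually carries, you can inspect it with openssl (assuming cert.pem is the certificate used for the Web TLS secret):

```bash
# Print the certificate subject (CN) and its Subject Alternative Name entries.
openssl x509 -in cert.pem -noout -subject -ext subjectAltName
```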
Accessing Dremio's Helm Chart
You can perform more advanced configurations beyond those described in this topic. However, proceed with caution: making changes without a clear understanding may lead to unexpected or undesired behavior. To do an advanced configuration, you must pull Dremio's Helm charts.
Pull the Helm charts using the following command:
helm pull oci://quay.io/dremio/dremio-helm --untar
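If you only want to inspect the chart's default values without unpacking it, Helm can print them directly from the OCI registry (assuming a Helm version with OCI support):

```bash
# Print the chart's default values.yaml to stdout.
helm show values oci://quay.io/dremio/dremio-helm
```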
Overriding Additional Values
After completing the helm pull:
- Find the values.yaml file, open it, and check the configurations you want to override.
- Copy what you want to override from values.yaml to values-overrides.yaml and configure the file with your values.
- Save the values-overrides.yaml file.
Once done with the configuration, deploy Dremio to Kubernetes via the OCI Repo. See how in Deploying Dremio to Kubernetes.
Additions and Modifications to Dremio's Configuration Files (including Hive)
The modifications described in this section require installing Dremio using a local version of the Helm charts. Thus, the helm install command must reference a local folder, not the OCI repo like Quay. For more information and sample commands, see Helm install.
After completing the helm pull, locate the /config directory, which contains:
| File | Description |
|---|---|
| dremio.conf | Used to specify various options related to node roles, metadata storage, distributed cache storage, and more. If you want to customize your Dremio services, see Dremio Services Configuration. |
| dremio-env | Used for setting Java options and log directories. If you want to customize your Dremio environment, see Dremio Environment Configuration. |
| logback-access.xml | Used to control access logging. |
| logback.xml | Used to control the log levels. |
Add deployment-specific files, e.g., core-site.xml (required for Hive), by copying your file(s) to this directory. Any customizations to your Dremio environment are propagated to all the pods when installing or upgrading the deployment.
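For illustration, a local-chart workflow might look like the following sketch, assuming the chart unpacks into ./dremio-helm and using placeholder release and namespace names:

```bash
# Pull and unpack the chart, add the deployment-specific file, then install from the local folder.
helm pull oci://quay.io/dremio/dremio-helm --untar
cp core-site.xml dremio-helm/config/
helm upgrade --install dremio ./dremio-helm \
  --namespace <your-namespace> \
  -f values-overrides.yaml
```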