
Configuring Your Values to Deploy Dremio to Kubernetes

Helm is a standard for managing Kubernetes applications, and a Helm chart defines how an application is deployed to Kubernetes. Dremio's Helm chart contains the default deployment configuration, which is specified in values.yaml.

Dremio recommends configuring your deployment values in a separate .yaml file: keeping your overrides separate makes it simpler to move to the latest version of the Helm chart, because you only need to carry the configuration file across chart updates.
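The overrides file is passed to Helm at install or upgrade time. As a minimal sketch (assuming Helm 3.8 or later with OCI support; the release name dremio and the namespace are placeholder assumptions, and the chart reference matches the OCI repository used elsewhere in this guide):

helm install dremio oci://quay.io/dremio/dremio-helm -f values-overrides.yaml --namespace <your-namespace>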

Configuring Your Values

To configure your deployment values, do the following:

  1. Download the file values-overrides.yaml and save it locally.

    The values-overrides.yaml configuration file

    # A Dremio License is required
    dremio:
      license: "<your license key>"
      tag: 26.0.0

    # To pull images from Dremio's Quay you must create an image pull secret. For more info, see:
    # https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
    # All of the images are pulled using this same secret.
    imagePullSecrets:
      - <your-pull-secret-name>

    coordinator:
      auth:
        type: "internal"
      client:
        tls:
          enabled: false
          secret: "<your-tls-secret-name>"
      flight:
        tls:
          enabled: false
          secret: "<your-tls-secret-name>"
      web:
        tls:
          enabled: false
          secret: "<your-tls-secret-name>"
      volumeSize: 512Gi
      resources:
        limits:
          memory: 64Gi
        requests:
          cpu: 16
          memory: 60Gi

    # Where Dremio stores metadata, reflections, and uploaded files.
    # For more information, see https://docs.dremio.com/current/what-is-dremio/architecture#distributed-storage
    distStorage:
      # The supported distributed storage types are: aws, gcp, or azureStorage. For S3-compatible storage, use aws.
      type: <your-distributed-storage-type> # Add here your distributed storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-distributed-storage

    catalog:
      externalAccess:
        enabled: true
        tls:
          enabled: false
          secret: "<your-catalog-tls-secret-name>"
      # This is where Iceberg tables created in your catalog will reside.
      storage:
        # The supported catalog storage types are: S3 or azure. For S3-compatible storage, use S3.
        type: <your-catalog-storage-type> # Add here your catalog storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-catalog-storage

    service:
      type: LoadBalancer

  2. Edit the values-overrides.yaml file to configure your values. See the sections below for details on each configuration option.

  3. Save the values-overrides.yaml file.

Once done with the configuration, deploy Dremio to Kubernetes. See how in Deploying Dremio to Kubernetes.

License

Provide your license key. To obtain a license, see Licensing.
Perform this configuration in this section of the file:

dremio:
  license: ...

Pull Secret

Provide the secret used to pull the images from Quay.io. To create the Kubernetes secret, use this example:

Example of creating the Kubernetes secret
kubectl create secret docker-registry dremio-docker-secret --docker-server=quay.io --docker-username=<your-username> --docker-password=<your-password> --docker-email=<your-email>

For more information, see Create a Secret by providing credentials on the command line (the Docker registry is quay.io). All of the images are pulled using this same secret.

note

Pods can only reference image pull secrets in their own namespace, so the secret must be created in the namespace where Dremio is being deployed.

Perform this configuration in this section of the file:

imagePullSecrets:
  - ...
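For example, if you created the secret with the kubectl command shown above, the value is simply the secret's name:

imagePullSecrets:
  - dremio-docker-secret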

Coordinator

  • Configure the volume size, resource limits, and resource requests. To configure these values, see the section Recommended Resources Configuration.
    Perform this configuration in this section of the file:

    coordinator:
      resources:
        requests:
          cpu: ...
          memory: ...
      volumeSize: ...
  • (Optional) Configure authentication via an identity provider (including LDAP, Microsoft Entra ID, or generic OpenID providers). This requires an additional configuration file provided during Dremio's deployment. See our Identity Provider documentation for instructions on how to create the required configuration file for your auth type. Possible types include: azuread, ldap, oauth, or oauth+ldap. Perform this configuration in this section of the file:

    coordinator:
      auth:
        type: ...
  • (Optional) Enable TLS (set enabled: true) and provide the TLS secret. See the section Creating a TLS Secret.
    Perform this configuration in this section of the file:

    coordinator:
      client:
        tls:
          enabled: ...
          secret: ...
      flight:
        tls:
          enabled: ...
          secret: ...
      web:
        tls:
          enabled: ...
          secret: ...
    note

    If Web TLS is enabled, see the section Configuring Dremio Catalog when Coordinator Web is Using TLS.

Coordinator's Distributed Storage

This is where Dremio stores metadata, reflections, and uploaded files. To configure these values, see the section Configuring the Distributed Storage.
Perform this configuration in this section of the file:

distStorage:
  type: ...

Dremio Catalog

  • Configuring storage for Dremio Catalog is mandatory since this is the location where Iceberg tables created in the Catalog will be written. For configuring the storage, see the section Configuring Storage for Dremio Catalog.
    Perform this configuration in this section of the file:
    catalog:
      storage:
        location: ...
        type: ...
  • (Optional) Use TLS for external access: clients connecting to Dremio Catalog from outside the namespace will be required to use TLS. To configure it, see the section Configuring TLS for Dremio Catalog External Access.
    Perform this configuration in this section of the file:
    catalog:
      externalAccess:
        enabled: ...
        tls:
          enabled: ...
          secret: ...
  • (Optional) If Dremio coordinator Web access is using TLS, additional configuration is necessary. To configure it, see the section Configuring Dremio Catalog When Coordinator Web Is Using TLS.
    Perform this configuration in this section of the file:
    catalog:
      externalAccess:
        enabled: ...
        authentication:
          authServerHostname: ...

Save the values-overrides.yaml file.

Once done with the configuration, deploy Dremio to Kubernetes. See how in Deploying Dremio to Kubernetes.

Configuring Your Values - Advanced

Dremio Platform Images

The Dremio platform requires 17 images when running fully featured. All images are published by Dremio to Quay.io and are listed below. If you want to use a private mirror of the repository, add the snippet below to values-overrides.yaml to point to your own registry.

Dremio Platform Images
note

If creating a private mirror, use the same repository names and tags from Dremio's Quay.io.
This is important for supportability.

dremio:
  image:
    repository: .../dremio-ee
    tag: <The image tag from Quay.io>
busyBox:
  image:
    repository: .../busybox
    tag: <The image tag from Quay.io>
k8s:
  image:
    repository: .../alpine/k8s
    tag: <The image tag from Quay.io>
engine:
  operator:
    image:
      repository: .../dremio-engine-operator
      tag: <The image tag from Quay.io>
zookeeper:
  image:
    repository: .../zookeeper
    tag: <The image tag from Quay.io>
opensearch:
  image:
    repository: .../dremio-search-opensearch
    tag: <The image tag from Quay.io> # The tag must be a valid OpenSearch version as listed at https://opensearch.org/docs/latest/version-history/
  preInstallJob:
    image:
      repository: .../dremio-search-init
      tag: <The image tag from Quay.io>
opensearchOperator:
  manager:
    image:
      repository: .../dremio-opensearch-operator
      tag: <The image tag from Quay.io>
  kubeRbacProxy:
    image:
      repository: .../kube-rbac-proxy
      tag: <The image tag from Quay.io>
mongodbOperator:
  image:
    repository: .../dremio-mongodb-operator
    tag: <The image tag from Quay.io>
mongodb:
  image:
    repository: .../percona-server-mongodb
    tag: <The image tag from Quay.io>
  metrics:
    image:
      repository: .../mongodb_exporter
      tag: <The image tag from Quay.io>
catalogservices:
  image:
    repository: .../dremio-ee-catalog-services-server
    tag: <The image tag from Quay.io>
catalog:
  image:
    repository: .../dremio-catalog-server
    tag: <The image tag from Quay.io>
  externalAccess:
    image:
      repository: .../dremio-catalog-server-external
      tag: <The image tag from Quay.io>
nats:
  container:
    image:
      repository: .../nats
      tag: <The image tag from Quay.io>
telemetry:
  image:
    repository: .../opentelemetry-collector-contrib
    tag: <The image tag from Quay.io>

Scale-out Coordinators

Dremio can scale to support high concurrency use cases through scaling coordinators. Multiple stateless coordinators rely on the primary coordinator to manage Dremio's state, enabling Dremio to support many more concurrent users. These scale-out coordinators are intended for high query throughput and are not applicable for standby or disaster recovery. While scale-out coordinators generally reduce the load on the primary coordinator, the primary coordinator's vCPU request should be increased for every two scale-outs added to avoid negatively impacting performance.

Perform this configuration in this section of the file, where count refers to the number of scale-outs. A count of 0 will provision only the primary coordinator:

coordinator:
  count: ...
note

When using scale-out coordinators, the load balancer session affinity should be enabled. See: Advanced Load Balancer Configuration.
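For illustration, a hedged sketch that adds two scale-out coordinators and raises the primary coordinator's CPU request accordingly (the figures are examples, not sizing guidance):

coordinator:
  count: 2             # two stateless scale-out coordinators in addition to the primary
  resources:
    requests:
      cpu: 18          # example only: increased because two scale-outs were added
      memory: 60Gi
    limits:
      memory: 64Gi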

Configuring Kubernetes Pod Metadata (Including Node Selector)

It's also possible to add metadata to each of the StatefulSets (coordinators, classic engines, ZooKeeper, etc.). This includes configuring a node selector, which pins pods to specific node pools (AKS node pools or EKS node groups). The following metadata can be added:

annotations: {}
podAnnotations: {}
labels: {}
podLabels: {}
nodeSelector: {}
tolerations: []

Example of a coordinator node selector, where the node pool is named coordinatorpool:

coordinator:
  nodeSelector:
    agentpool: coordinatorpool
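Similarly, a hedged sketch of pod labels and tolerations for coordinators running on a tainted node pool (the label, taint key, and values are illustrative assumptions):

coordinator:
  podLabels:
    team: analytics              # example label propagated to coordinator pods
  tolerations:
    - key: "dedicated"           # assumes nodes are tainted with dedicated=dremio:NoSchedule
      operator: "Equal"
      value: "dremio"
      effect: "NoSchedule"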

Advanced Load Balancer Configuration

Dremio will create a public load balancer by default, and the Dremio Client service will provide an external IP to connect to Dremio. For more information, see Connecting to the Dremio Console.

  • Private Cluster - For private Kubernetes clusters (no public endpoint), set internalLoadBalancer: true. Perform this configuration in this section of the file:

    service:
      type: ...
      internalLoadBalancer: ...
  • Static IP - To define a static IP for your load balancer, set loadBalancerIP: <your-static-IP>. If unset, an available IP will be assigned upon creation of the load balancer. Perform this configuration in this section of the file:

    service:
      type: ...
      loadBalancerIP: ...
    note

    This can be helpful if DNS is configured to expect Dremio to have a specific IP.

  • Session Affinity - If leveraging Scale-out Coordinators, set sessionAffinity: true. Perform this configuration in this section of the file:

    service:
      type: ...
      sessionAffinity: ...
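As an illustration, a hedged example that combines these options for a private cluster (the IP address is a placeholder):

service:
  type: LoadBalancer
  internalLoadBalancer: true   # keep the load balancer internal to the VPC/VNet
  loadBalancerIP: 10.0.0.100   # placeholder static IP that your DNS record expects
  sessionAffinity: true        # recommended when using scale-out coordinators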

Advanced TLS Configuration for OpenSearch

Dremio generates TLS certificates for OpenSearch by default, and they are rotated monthly. However, if you want to provide your own certificates, you need to create two secrets containing them. The format of these secrets is different from the other TLS secrets shown on this page, and the tls.crt, tls.key, and ca.crt files must be in PEM format. Use the example below as a reference to create your secrets:

kubectl create secret generic opensearch-tls-certs \
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt

kubectl create secret generic opensearch-tls-certs-admin \
--from-file=tls.crt --from-file=tls.key --from-file=ca.crt

Add the snippet below to the values-overrides.yaml file before deploying Dremio. Note that the second secret (opensearch-tls-certs-admin) is not referenced in the values file but is still required for OpenSearch to work.

opensearch:
  tlsCertsSecretName: opensearch-tls-certs
  disableTlsCertGeneration: true

Advanced Configuration of Engines

Dremio's default resource offset is reserve-2-8, where the first value represents 2 vCPUs and the second represents 8 GB of RAM. If you need to change this default for your created engines, add the following snippet to values-overrides.yaml and set the defaultOffset to one of the configurable offsets listed below, which are available out of the box:

  • reserve-0-0
  • reserve-2-4
  • reserve-2-8
  • reserve-2-16

The listed values are keys and must be provided in exactly this format in the snippet below.

engine:
  options:
    resourceAllocationOffsets:
      defaultOffset: <key from list above>
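For example, to default newly created engines to the largest listed offset:

engine:
  options:
    resourceAllocationOffsets:
      defaultOffset: reserve-2-16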

Configuration of Classic Engines

note
  • You should only use classic engines if the new engines (as of Dremio 26.0) are not appropriate for your use case.
  • Classic engines will not auto-start/auto-stop, which is only possible with the new engines.

The classic way of configuring engines is still supported, and you can add this snippet to values-overrides.yaml as part of the deployment. Note that this snippet is a configuration example, and you should adjust the values to your own case.

An example of a configuration of a classic engine
executor:
  resources:
    requests:
      cpu: "16"
      memory: "120Gi"
    limits:
      memory: "120Gi"
  engines: ["default"]
  count: 3
  volumeSize: 128Gi
  cloudCache:
    enabled: true
    volumes:
      - size: 128Gi

References

The table in this section contains the recommended values for resource requests and volume size to configure Dremio components. In the values-overrides.yaml file, set the following values:

resources:
  requests:
    memory: # Put here the first value in the table column.
    cpu: # Put here the second value in the table column.
volumeSize: # Put here the third value in the table column, if any.

Dremio recommends using the Basic Configuration values for evaluation or testing purposes and adjusting them toward the Production Configuration values, which are the values Dremio recommends for operating in a production environment.

Dremio Component          | Basic Configuration | Production Configuration | Pod Count
Coordinator               | 8Gi, 4, 50Gi        | 64Gi, 32, 512Gi          | 1
Catalog Server            | 8Gi, 4              | 8Gi, 4                   | 1
Catalog Server (External) | 8Gi, 4              | 8Gi, 4                   | 1
Catalog Service Server    | 8Gi, 4              | 8Gi, 4                   | 1
Engine Operator           | 1Gi, 1              | 1Gi, 1                   | 1
OpenSearch                | 8Gi, 1500m, 10Gi    | 16Gi, 2, 100Gi           | 3
MongoDB                   | 2Gi, 4, 50Gi        | 4Gi, 8, 512Gi            | 3
NATS                      | 1Gi, 700m           | 1Gi, 700m                | 3
ZooKeeper                 | 1Gi, 500m           | 1Gi, 500m                | 3
Open Telemetry            | 1Gi, 1              | 1Gi, 1                   | 1

The following YAML contains the resource configuration snippets for the Dremio platform components:

Dremio Platform Resource Configuration YAML
coordinator:
  resources:
    requests:
      cpu: "32"
      memory: "64Gi"
    limits:
      memory: "64Gi"
  volumeSize: "512Gi"
zookeeper:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      memory: "1Gi"
  volumeSize: "10Gi"
catalog:
  requests:
    cpu: "4"
    memory: "8Gi"
  limits:
    cpu: "4"
    memory: "8Gi"
catalogservices:
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
mongodb:
  resources:
    requests:
      cpu: "2"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "2Gi"
  storage:
    resources:
      requests:
        storage: "512Gi"
opensearch:
  resources:
    requests:
      memory: "16Gi"
      cpu: "2"
    limits:
      memory: "16Gi"
      cpu: "2"
nats:
  resources:
    requests:
      cpu: "500m"
      memory: "1024Mi"
    limits:
      cpu: "750m"
      memory: "1536Mi"
telemetry:
  resources:
    requests:
      cpu: "1"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "2Gi"

Creating a TLS Secret

If you have enabled TLS in your values-overrides.yaml, the corresponding secrets must be created before deploying Dremio. To create a secret, run the following command:

kubectl create secret tls <your-tls-secret-name> --key privkey.pem --cert cert.pem

For more information, see kubectl create secret tls.
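If you do not yet have a certificate, a hedged example of generating a self-signed key and certificate for testing with OpenSSL (the CN is a placeholder; use a CA-signed certificate in production):

openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout privkey.pem -out cert.pem \
  -subj "/CN=dremio.example.com"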

caution

TLS for OpenSearch requires a secret of a different makeup. See Advanced TLS Configuration for OpenSearch.

Configuring the Distributed Storage

Dremio’s distributed store uses scalable and fault-tolerant storage and it is configured as follows:

  1. In the values-overrides.yaml file, find the section with distStorage: and type:

    distStorage:
      type: ...
  2. In type:, configure your storage provider with one of the following values:

    • "gcp" - For GCP Cloud Storage.
    • "aws" - For AWS S3 or S3-compatible storage.
    • "azureStorage" - For Azure Storage.
  3. Copy the template for the storage provider you configured in step 2 (the Google Cloud Storage template is shown below), paste it below the line with type:, and configure your distributed storage values.

# Google Cloud Storage
#
# bucketName: The name of the GCS bucket for distributed storage.
# path: The path, relative to the bucket, in which to create Dremio's directories.
# authentication: Valid types are: serviceAccountKeys or auto.
#   - When using "auto" authentication, Dremio uses Google Application Default Credentials to
#     authenticate. This is platform-dependent and may not be available in all Kubernetes clusters.
#   - Note: When using a GCS bucket on GKE, we recommend enabling Workload Identity and configuring
#     a Kubernetes Service Account for Dremio with an associated workload identity that
#     has access to the GCS bucket.
# credentials: If using serviceAccountKeys authentication, uncomment the credentials section below.
gcp:
  bucketName: "GCS Bucket Name"
  path: "/"
  authentication: "auto"

  # If using serviceAccountKeys, uncomment the section below, referencing the values from
  # the service account credentials JSON file that you generated:
  #
  #credentials:
  #  projectId: GCP Project ID that the Google Cloud Storage bucket belongs to.
  #  clientId: Client ID for the service account that has access to the Google Cloud Storage bucket.
  #  clientEmail: Email for the service account that has access to the Google Cloud Storage bucket.
  #  privateKeyId: Private key ID for the service account that has access to the Google Cloud Storage bucket.
  #  privateKey: |-
  #    -----BEGIN PRIVATE KEY-----\n Replace me with full private key value. \n-----END PRIVATE KEY-----\n

  # Extra Properties
  # Use the extra properties block to provide additional parameters to configure the distributed
  # storage in the generated core-site.xml file.
  #
  #extraProperties: |
  #  <property>
  #    <name></name>
  #    <value></value>
  #  </property>

Configuring Storage for Dremio Catalog

To use Dremio Catalog, configure the storage settings based on your storage provider (such as Amazon S3 or Azure Storage). This configuration is required to enable support for vended credentials and to allow access to the table metadata necessary for Iceberg table operations.

  1. In the values-overrides.yaml file, find the section to configure your storage provider:

    catalog:
      ...
      storage:
        location: ...
        type: ...
        ...
  2. To configure it, follow the steps for your storage provider. The steps for Amazon S3 are shown below:

    To use Dremio Catalog with Amazon S3, do the following:

    1. Create an IAM user or use an existing IAM user for Dremio Catalog.

    2. Create an IAM policy that grants access to your S3 location. For example:

      Example of a policy
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "s3:PutObject",
              "s3:GetObject",
              "s3:GetObjectVersion",
              "s3:DeleteObject",
              "s3:DeleteObjectVersion"
            ],
            "Resource": "arn:aws:s3:::<my_bucket>/*"
          },
          {
            "Effect": "Allow",
            "Action": [
              "s3:ListBucket",
              "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::<my_bucket>",
            "Condition": {
              "StringLike": {
                "s3:prefix": [
                  "*"
                ]
              }
            }
          }
        ]
      }
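
      If you prefer the AWS CLI, a hedged sketch of creating this policy (the policy name and the file name catalog-policy.json are assumptions):

      aws iam create-policy \
        --policy-name dremio-catalog-s3-access \
        --policy-document file://catalog-policy.json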
    3. Create an IAM role to grant privileges to the S3 location.

      1. In your AWS console, select Create Role.
      2. Enter an externalId. For example, my_catalog_external_id.
      3. Attach the policy created in the previous step and create the role.
    4. Allow the IAM user to access the bucket via STS:

      1. Select the IAM role created in the previous step.

      2. Edit the trust policy and add the following:

        Trust policy
        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "",
              "Effect": "Allow",
              "Principal": {
                "AWS": "<dremio_catalog_user_arn>"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                "StringEquals": {
                  "sts:ExternalId": "<dremio_catalog_external_id>"
                }
              }
            }
          ]
        }

        Replace the following values with the ones obtained in the previous steps:

        • <dremio_catalog_user_arn> - The IAM user that was created in the first step.
        • <dremio_catalog_external_id> - The external ID that was entered in step 3.
        note

        The sts:AssumeRole permission is required for Dremio Catalog to function with vended credentials, as it relies on the STS temporary token to perform these validations.

    5. Configure Dremio Catalog in the values-overrides.yaml file as follows:

      catalog:
        ...
        storage:
          location: s3://<your_bucket>/<your_folder>
          type: S3
          s3:
            region: <bucket_region>
            roleArn: <dremio_catalog_iam_role>        # The role that was created in step 3
            userArn: <dremio_catalog_user_arn>        # The IAM user that was created in step 1
            externalId: <dremio_catalog_external_id>  # The external ID that was entered in step 3
            useAccessKeys: false                      # Set to true if you intend to use access keys. See the note below.
      note

      If your role requires AWS Secret Keys to access the bucket and STS, you must create a Kubernetes secret named catalog-server-s3-storage-creds to access the configured location. Below is a simple example of Amazon S3 using an access key and a secret key:

      Example for Amazon S3 using an access key and a secret key
      export AWS_ACCESS_KEY_ID=<access-key>
      export AWS_SECRET_ACCESS_KEY=<secret-key>
      kubectl create secret generic catalog-server-s3-storage-creds \
        --namespace $NAMESPACE \
        --from-literal awsAccessKeyId=$AWS_ACCESS_KEY_ID \
        --from-literal awsSecretAccessKey=$AWS_SECRET_ACCESS_KEY
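
      Optionally, you can sanity-check the role and trust policy from a machine configured with the catalog IAM user's credentials; this is a hedged example using the placeholders from the steps above:

      aws sts assume-role \
        --role-arn <dremio_catalog_iam_role> \
        --role-session-name dremio-catalog-check \
        --external-id <dremio_catalog_external_id>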

Configuring TLS for Dremio Catalog External Access

For clients connecting to Dremio Catalog from outside the namespace, TLS can be enabled for Dremio Catalog external access as follows:

  1. Enable external access with TLS and provide the TLS secret. See the section Creating a TLS Secret.
  2. In the values-overrides.yaml file, find the Dremio Catalog configuration section:
    catalog:
    ...
  3. Configure TLS for Dremio Catalog as follows:
    catalog:
      externalAccess:
        enabled: true
        tls:
          enabled: true
          secret: dremio-tls-secret-catalog

Configuring Dremio Catalog when Coordinator Web is Using TLS

When the Dremio coordinator is using TLS for Web access (i.e., when coordinator.web.tls.enabled is set to true), Dremio Catalog external access must be configured appropriately, or client authentication will fail. To do so, configure Dremio Catalog as follows:

  1. In the values-overrides.yaml file, find the Dremio Catalog configuration section:

    catalog:
      ...
  2. Configure Dremio Catalog as follows:

    catalog:
      externalAccess:
        enabled: true
        authentication:
          authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local

    The authServerHostname must match the CN (or the SAN) field of the (master) coordinator Web TLS certificate.

    In case it does not match the CN or SAN fields of the TLS certificate, as a last resort, it is possible to disable hostname verification (disableHostnameVerification: true):

    catalog:
      externalAccess:
        enabled: true
        authentication:
          authServerHostname: dremio-master-0.dremio-cluster-pod.{{ .Release.Namespace }}.svc.cluster.local
          disableHostnameVerification: true

Accessing Dremio's Helm Chart

You can perform more advanced configurations beyond those described in this topic. However, proceed with caution—making changes without a clear understanding may lead to unexpected or undesired behavior. To do an advanced configuration, you must pull Dremio’s Helm charts.

Pull the Helm charts using the following command:

helm pull oci://quay.io/dremio/dremio-helm --untar
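
If you only want to inspect the default values without untarring the chart, you can also print them (assuming Helm 3.8 or later with OCI support):

helm show values oci://quay.io/dremio/dremio-helm > default-values.yaml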

Overriding Additional Values

After completing the helm pull:

  1. Find the values.yaml file, open it, and check the configurations you want to override.
  2. Copy what you want to override from the values.yaml to values-overrides.yaml and configure the file with your values.
  3. Save the values-overrides.yaml file.

Once done with the configuration, deploy Dremio to Kubernetes via the OCI Repo. See how in Deploying Dremio to Kubernetes.

Additions and Modifications to Dremio's Configuration Files (including Hive)

important

The modifications described in this section require installing Dremio using a local version of the Helm charts. Thus, the helm install command must reference a local folder, not the OCI repo like Quay. For more information and sample commands, see Helm install.

After completing the helm pull, locate the /config directory, which contains:

File               | Description
dremio.conf        | Used to specify various options related to node roles, metadata storage, distributed cache storage, and more. If you want to customize your Dremio services, see Dremio Services Configuration.
dremio-env         | Used for setting Java options and log directories. If you want to customize your Dremio environment, see Dremio Environment Configuration.
logback-access.xml | Used to control access logging.
logback.xml        | Used to control the log levels.

Add deployment-specific files, e.g., core-site.xml (required for Hive), by copying your file(s) to this directory. Any customizations to your Dremio environment are propagated to all the pods when installing or upgrading the deployment.
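
As a hedged end-to-end sketch (the dremio-helm directory name comes from the pull command above; the release name and namespace are assumptions), adding a core-site.xml and installing from the local chart could look like this:

helm pull oci://quay.io/dremio/dremio-helm --untar
cp core-site.xml dremio-helm/config/
helm upgrade --install dremio ./dremio-helm -f values-overrides.yaml --namespace <your-namespace>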