Configuring Your Values to Deploy Dremio to Kubernetes
Helm is a standard for managing Kubernetes applications, and the Helm chart defines how applications are deployed to Kubernetes. Dremio's Helm chart contains the default deployment configurations, which are specified in the values.yaml file.
Dremio recommends configuring your deployment values in a separate .yaml file. Keeping your overrides outside the chart simplifies updates: you can copy the same configuration file across Helm chart updates.
Configuring Your Values
Skip step 1 if deploying a Free Trial. Configure your values in the values-overrides.yaml file you downloaded using the link in the email received during the Free Trial registration.
To configure your deployment values, do the following:
- Download the file values-overrides.yaml and save it locally.
The values-overrides.yaml configuration file:
# A Dremio License is required
dremio:
  license: "<your license key>"
  tag: 26.0.0
# To pull images from Dremio's Quay you must create a image pull secret. For more info see:
# https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
# All of the images are pulled using this same secret.
imagePullSecrets:
  - <your-pull-secret-name>
coordinator:
  web:
    auth:
      type: "internal"
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  client:
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  flight:
    tls:
      enabled: false
      secret: "<your-tls-secret-name>"
  volumeSize: 512Gi
  resources:
    limits:
      memory: 64Gi
    requests:
      cpu: 16
      memory: 60Gi
# Where Dremio stores metadata, Reflections, and uploaded files.
# For more information, see https://docs.dremio.com/current/what-is-dremio/architecture#distributed-storage
distStorage:
  # The supported distributed storage types are: aws, gcp, or azureStorage. For S3-compatible storage use aws.
  type: <your-distributed-storage-type> # Add here your distributed storage template from http://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-the-distributed-storage
catalog:
  externalAccess:
    enabled: true
    tls:
      enabled: false
      secret: "<your-catalog-tls-secret-name>"
  # This is where Iceberg tables created in your catalog will reside
  storage:
    # The supported catalog storage types are: S3 or azure. For S3-compatible storage use S3.
    type: <your-catalog-storage-type> # Add here your catalog storage template from https://docs.dremio.com/current/deploy-dremio/configuring-kubernetes/#configuring-storage-for-dremio-catalog
service:
  type: LoadBalancer
- Edit the values-overrides.yaml file to configure your values. See the following sections for details on each configuration option:
  - License
  - Pull Secret
  - Coordinator
  - Coordinator's Distributed Storage
  - Dremio Catalog
  - Advanced Values Configurations
Important: In all code examples, ... denotes additional values that have been omitted.
Group all values associated with a given parent key in the YAML under a single instance of that parent, for example:
Do:
dremio:
  key-one: value-one
  key-two:
    key-three: value-two
Do not:
dremio:
  key-one: value-one
dremio:
  key-two:
    key-three: value-two
Please note the parent relationships at the top of each YAML snippet and subsequent values throughout this section. The hierarchy of keys and indentations in YAML must be respected.
- Save the values-overrides.yaml file.
Once done with the configuration, deploy Dremio to Kubernetes. See how in Deploying Dremio to Kubernetes.
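The rule about grouping all values under a single instance of a parent key can be checked mechanically before deploying. The following is a minimal sketch, not part of Dremio's tooling: the function name repeated_top_level_keys is hypothetical, it uses only the Python standard library, and it assumes top-level keys start in column 0. It only detects repeated top-level parents, not duplicates deeper in the hierarchy.

```python
import re
from collections import Counter


def repeated_top_level_keys(yaml_text: str) -> list[str]:
    """Return top-level mapping keys that appear more than once."""
    counts = Counter()
    for line in yaml_text.splitlines():
        # A top-level key starts in column 0; indented lines, comments,
        # list items, and "..." placeholders do not match this pattern.
        match = re.match(r"^([A-Za-z_][\w.-]*)\s*:", line)
        if match:
            counts[match.group(1)] += 1
    return [key for key, n in counts.items() if n > 1]


GOOD = """\
dremio:
  key-one: value-one
  key-two:
    key-three: value-two
"""

BAD = """\
dremio:
  key-one: value-one
dremio:
  key-two:
    key-three: value-two
"""

print(repeated_top_level_keys(GOOD))  # []
print(repeated_top_level_keys(BAD))   # ['dremio']
```

To check a real file, pass its text to the same function, for example the result of reading values-overrides.yaml from disk.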
License
Provide your license key. To obtain a license, see Licensing.
Add this configuration under the parent, as shown in the following example:
dremio:
  license: "<license-goes-here>"
  ...
Pull Secret
Provide the secret used to pull the images from Quay.io as follows:
- Log in to Quay.io, select your account name at the top right corner, and select Account Settings in the drop-down menu.
- Click Generate Encrypted Password, type your password, and click Verify.
- On the next dialog, select Kubernetes Secret, and follow steps 1 and 2 to download the secret and run the command to submit the secret to the cluster.
- Add the configuration under the parent, as shown in the following example:
imagePullSecrets:
  - <your-quayio-secret-name>
Coordinator
Resource Configuration
Configure the volume size, resource limits, and resource requests. To configure these values, see Recommended Resources Configuration.
Add this configuration under the parents, as shown in the following example:
coordinator:
  resources:
    requests:
      cpu: 15
      memory: 30Gi
  volumeSize: 100Gi
  ...
Identity Provider
Optionally, you can configure authentication via an identity provider. Each type of identity provider requires an additional configuration file provided during Dremio's deployment.
Select the authentication type, and follow the corresponding link for instructions on how to create the associated configuration file:
- azuread - See how to configure Microsoft Entra ID with user and group lookup.
- ldap - See how to configure Dremio for LDAP.
- oauth - See how to configure Dremio for OpenID.
- oauth+ldap - See how to configure Dremio for Hybrid OpenID+LDAP.
Add this configuration under the parents, as shown in the following example:
coordinator:
  web:
    auth:
      type: <auth-type>
  ...
The identity provider configuration file can be embedded in your values-overrides.yaml file. To do this, use the ssoFile option and provide the JSON content constructed per the instructions linked above. Here is an example for Microsoft Entra ID:
coordinator:
  web:
    auth:
      enabled: true
      type: "azuread"
      ssoFile: |
        {
          "oAuthConfig": {
            "clientId": "<my-client-id>",
            "clientSecret": "<my-secret>",
            "redirectUrl": "<my-redirect-url>",
            "authorityUrl": "https://login.microsoftonline.com/<my-tenant-id>/v2.0",
            "scope": "openid profile",
            "jwtClaims": {
              "userName": "preferred_username"
            }
          }
        }
  ...
For examples of the other types, see Identity Providers.
This is not the only configuration file that can be embedded inside the values-overrides.yaml file. However, the others are generally used for advanced configurations. For more information, see Additional Configuration.
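Because the ssoFile value is JSON embedded in a YAML block scalar, a stray comma or inconsistent indentation can break either layer. One way to reduce that risk is to generate the block from a data structure. This is a rough sketch using only the Python standard library: the helper name sso_file_block is hypothetical, the placeholder values are the ones from the Microsoft Entra ID example, and the indentation of 8 spaces assumes ssoFile sits under coordinator.web.auth with 2-space YAML indentation.

```python
import json
import textwrap


def sso_file_block(config: dict, indent: int = 8) -> str:
    """Serialize config as JSON and indent it for use under 'ssoFile: |'."""
    body = json.dumps(config, indent=2)
    return textwrap.indent(body, " " * indent)


# Placeholder values copied from the example; replace them with your own.
sso_config = {
    "oAuthConfig": {
        "clientId": "<my-client-id>",
        "clientSecret": "<my-secret>",
        "redirectUrl": "<my-redirect-url>",
        "authorityUrl": "https://login.microsoftonline.com/<my-tenant-id>/v2.0",
        "scope": "openid profile",
        "jwtClaims": {"userName": "preferred_username"},
    }
}

block = sso_file_block(sso_config)
print("      ssoFile: |")
print(block)
```

The printed lines can be pasted under the auth key. Since json.dumps always emits well-formed JSON, the embedded content parses back to the same structure.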
Transport Level Security
Optionally enable the desired level of TLS by setting enabled: true for client, Arrow Flight, or web TLS. To provide the TLS secret, see Creating a TLS Secret.
Add this configuration under the parents, as shown in the following example:
coordinator:
  client:
    tls:
      enabled: false
      secret: <my-tls-secret>
  flight:
    tls:
      enabled: false
      secret: <my-tls-secret>
  web:
    tls:
      enabled: false
      secret: <my-tls-secret>
  ...
If Web TLS is enabled, see Configuring Dremio Catalog when Coordinator Web is Using TLS.
Coordinator's Distributed Storage
This is where Dremio stores metadata, Reflections, and uploaded files, and it's required for Dremio to be operational. The supported types are AWS S3 or S3-compatible storage, Azure Storage, and Google Cloud Storage (GCS). For examples of configurations, see Configuring the Distributed Storage.
Add this configuration under the parent, as shown in the following example:
distStorage:
  type: "<my-dist-store-type>"
  ...
Dremio Catalog
The configuration for Dremio Catalog has several options:
- Configuring storage for Dremio Catalog is mandatory since this is the location where Iceberg tables created in the Catalog will be written. For configuring the storage, see Configuring Storage for Dremio Catalog.
Add this configuration under the parents, as shown in the following example:
catalog:
  externalAccess:
    enabled: true
  ...
- (Optional) Use TLS for external access to require clients connecting to Dremio Catalog from outside the namespace to use TLS. To configure it, see Configuring TLS for Dremio Catalog External Access.
Add this configuration under the parents, as shown in the following example:
catalog:
  externalAccess:
    enabled: true
    tls:
      enabled: false
      secret: <my-catalog-tls-secret>
  ...
- (Optional) If Dremio coordinator web access is using TLS, additional configuration is necessary. To configure it, see Configuring Dremio Catalog When Coordinator Web Is Using TLS.
Add this configuration under the parents, as shown in the following example:
catalog:
  externalAccess:
    enabled: true
    authentication:
      authServerHostname: <my-auth-server-host>
  ...
Save the values-overrides.yaml file.
Once done with the configuration, deploy Dremio to Kubernetes. See how in the topic Deploying Dremio to Kubernetes.
Configuring Your Values - Advanced
Dremio Platform Images
The Dremio platform requires 18 images when running fully featured. All images are published by Dremio to our Quay and are listed below. If you want to use a private mirror of our repository, add the snippets below to values-overrides.yaml to repoint to your own.
Dremio Platform Images
If creating a private mirror, use the same repository names and tags from Dremio's Quay.io. This is important for supportability.
dremio:
  image:
    repository: quay.io/dremio/dremio-enterprise
    tag: <The image tag from Quay.io>
busyBox:
  image:
    repository: quay.io/dremio/busybox
    tag: <The image tag from Quay.io>
k8s:
  image:
    repository: quay.io/dremio/alpine/k8s
    tag: <The image tag from Quay.io>
engine:
  operator:
    image:
      repository: quay.io/dremio/dremio-engine-operator
      tag: <The image tag from Quay.io>
zookeeper:
  image:
    repository: quay.io/dremio/zookeeper
    tag: <The image tag from Quay.io>
opensearch:
  image:
    repository: quay.io/dremio/dremio-search-opensearch
    tag: <The image tag from Quay.io> # The tag version must be a valid opensearch version as listed here https://opensearch.org/docs/latest/version-history/
  preInstallJob:
    image:
      repository: quay.io/dremio/dremio-search-init
      tag: <The image tag from Quay.io>
opensearchOperator:
  manager:
    image:
      repository: quay.io/dremio/dremio-opensearch-operator
      tag: <The image tag from Quay.io>
  kubeRbacProxy:
    image:
      repository: quay.io/dremio/kubebuilder/kube-rbac-proxy
      tag: <The image tag from Quay.io>
mongodbOperator:
  image:
    repository: quay.io/dremio/dremio-mongodb-operator
    tag: <The image tag from Quay.io>
mongodb:
  image:
    repository: quay.io/dremio/percona/percona-server-mongodb
    tag: <The image tag from Quay.io>
catalogservices:
  image:
    repository: quay.io/dremio/dremio-catalog-services-server
    tag: <The image tag from Quay.io>
catalog:
  image:
    repository: quay.io/dremio/dremio-catalog-server
    tag: <The image tag from Quay.io>
  externalAccess:
    image:
      repository: quay.io/dremio/dremio-catalog-server-external
      tag: <The image tag from Quay.io>
nats:
  container:
    image:
      repository: quay.io/dremio/nats
      tag: <The image tag from Quay.io>
  reloader:
    image:
      repository: quay.io/dremio/natsio/nats-server-config-reloader
      tag: <The image tag from Quay.io>
  natsBox:
    container:
      image:
        repository: quay.io/dremio/natsio/nats-box
        tag: <The image tag from Quay.io>
telemetry:
  image:
    repository: quay.io/dremio/otel/opentelemetry-collector-contrib
    tag: <The image tag from Quay.io>
Scale-out Coordinators
Dremio can scale to support high concurrency use cases through scaling coordinators. Multiple stateless coordinators rely on the primary coordinator to manage Dremio's state, enabling Dremio to support many more concurrent users. These scale-out coordinators are intended for high query throughput and are not applicable for standby or disaster recovery. While scale-out coordinators generally reduce the load on the primary coordinator, the primary coordinator's vCPU request should be increased for every two scale-outs added to avoid negatively impacting performance.
Perform this configuration in this section of the file, where count refers to the number of scale-outs. A count of 0 will provision only the primary coordinator:
coordinator:
  count: 1
  ...
When using scale-out coordinators, the load balancer session affinity should be enhanced. See: Advanced Load Balancer Configuration.
Configuring Kubernetes Pod Metadata (including Node Selector)
It's possible to add metadata both globally and to each of the StatefulSets (coordinators, classic engines, ZooKeeper, etc.), including configuring a node selector for pods to use specific node pools.
Define these values with caution and foreknowledge of expected entries because any misconfiguration may result in Kubernetes being unable to schedule your pods.
Use the following options to add metadata:
- labels - Configured using key-value pairs as shown in the following examples:
Example of a global label:
labels:
  foo: bar
Example of a StatefulSet label:
catalog:
  labels:
    foo: bar
  ...
For more information on labels, see the Kubernetes documentation on Labels and Selectors.
- annotations - Configured using key-value pairs as shown in the following examples:
Example of a global annotation:
annotations:
  foo: bar
Example of a StatefulSet annotation:
mongodb:
  annotations:
    foo: bar
  ...
For more information on annotations, see the Kubernetes documentation on Annotations.
- tolerations - Configured using a specific structure as shown in the following examples:
Example of a global toleration:
tolerations:
  - key: "key1"
    operator: "Equal"
    value: "value1"
    effect: "NoSchedule"
Example of a StatefulSet toleration:
catalog:
  tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
  ...
For more information on tolerations, see the Kubernetes documentation on Taints and Tolerations.
- nodeSelector - Configured using a specific structure as shown in the following examples:
Example of a global node selector:
nodeSelector:
  nodetype: coordinator
Example of a StatefulSet node selector:
coordinator:
  nodeSelector:
    nodetype: coordinator
  ...
To understand the structure and values to use in the configurations, expand "Metadata Structure and Values" below:
Metadata Structure and Values
For global metadata:
annotations: {}
labels: {}
tolerations: []
nodeSelector: {}
For StatefulSet metadata:
coordinator:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: coordinator
executor:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: coordinator
catalog:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: catalog
catalogservices:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: catalogservices
mongodb:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: mongo
opensearch:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: operators
  oidcProxy:
    annotations: {}
    labels: {}
    tolerations: []
    nodeSelector:
      nodeType: utils
  preInstallJob:
    annotations: {}
    labels: {}
    tolerations: []
    nodeSelector:
      nodeType: jobs
nats:
  podTemplate:
    merge:
      spec:
        annotations: {}
        labels: {}
        tolerations: []
        nodeSelector:
          nodetype: nats
mongodbOperator:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: operators
opensearchOperator:
  annotations: {}
  labels: {}
  tolerations: []
  nodeSelector:
    nodetype: operators
Configuring Extra Environment Variables
Optionally, you can define extra environment variables to be passed to either Coordinators or Executors. This can be done by adding the configuration under the parents as shown in the following example:
coordinator:
  extraEnvs:
    - name: <my-variable-name>
      value: "<my-variable-value>"
  ...
executor:
  extraEnvs:
    - name: <my-variable-name>
      value: "<my-variable-value>"
  ...
Environment variables defined as shown will be applied to Executors of both Classic Engines and New Engines.
Advanced Load Balancer Configuration
Dremio will create a public load balancer by default, and the Dremio Client service will provide an external IP to connect to Dremio. For more information, see Connecting to the Dremio Console.
- Private Cluster - For private Kubernetes clusters (no public endpoint), set internalLoadBalancer: true. Add this configuration under the parent as shown in the following example:
service:
  type: LoadBalancer
  internalLoadBalancer: true
  ...
- Static IP - To define a static IP for your load balancer, set loadBalancerIP: <your-static-IP>. If unset, an available IP will be assigned upon creation of the load balancer. Add this configuration under the parent as shown in the following example:
service:
  type: LoadBalancer
  loadBalancerIP: <my-desired-ip>
  ...
Tip: This can be helpful if DNS is configured to expect Dremio to have a specific IP.
- Session Affinity - If leveraging Scale-out Coordinators, set sessionAffinity: true. Add this configuration under the parent as shown in the following example:
service:
  type: LoadBalancer
  sessionAffinity: true
  ...
Additional Load Balancer Configuration for Amazon EKS in Auto Mode
If deploying Dremio to Amazon EKS (Elastic Kubernetes Service) in Auto Mode, you need to add service annotations for the load balancer to start (for more information, see Use Service Annotations to configure Network Load Balancers). Add this configuration under the parent as shown in the following example:
service:
  type: LoadBalancer
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
  ...
Advanced TLS Configuration for OpenSearch
Dremio generates TLS certificates by default for OpenSearch and they are rotated monthly. However, if you want to have your own, you need to create two secrets containing the relevant certificates. The format of the secrets is different from the other TLS secrets shown on this page, and the tls.crt, tls.key, and ca.crt files must be in PEM format. Use the example below as reference to create your secrets:
kubectl create secret generic opensearch-tls-certs \
  --from-file=tls.crt --from-file=tls.key --from-file=ca.crt
kubectl create secret generic opensearch-tls-certs-admin \
  --from-file=tls.crt --from-file=tls.key --from-file=ca.crt
Add the snippet below to the values-overrides.yaml file before deploying Dremio. Because OpenSearch requires TLS, if certificate generation is disabled, you must provide a certificate.
opensearch:
  tlsCertsSecretName: <opensearch-tls-certs>
  disableTlsCertGeneration: true
  ...
Advanced Configuration of Engines
Dremio's default resource offset is reserve-2-8, where the first value represents 2 vCPUs and the second represents 8 GB of RAM. If you need to change this default for your created engines, add the following snippet to values-overrides.yaml and set defaultOffset to one of the configurable offsets listed below, which are available out of the box:
- reserve-0-0
- reserve-2-4
- reserve-2-8
- reserve-2-16
The listed values are keys and must therefore be provided in this exact format in the snippet below.
engine:
  options:
    resourceAllocationOffsets:
      defaultOffset: reserve-2-8
  ...
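To make the naming scheme of the offset keys concrete, here is a small sketch that validates a key against the four listed offsets and splits it into its vCPU and RAM parts. It is illustrative only: parse_offset and ALLOWED_OFFSETS are hypothetical names, not part of the Helm chart, and the interpretation of the two numbers follows the description of reserve-2-8 above.

```python
# The four offsets listed as available out of the box.
ALLOWED_OFFSETS = ("reserve-0-0", "reserve-2-4", "reserve-2-8", "reserve-2-16")


def parse_offset(key: str) -> tuple[int, int]:
    """Return (vcpus, ram_gb) for one of the listed offset keys."""
    if key not in ALLOWED_OFFSETS:
        raise ValueError(f"{key!r} is not one of {ALLOWED_OFFSETS}")
    # Keys have the form reserve-<vCPUs>-<GB of RAM>.
    _, vcpus, ram_gb = key.split("-")
    return int(vcpus), int(ram_gb)


print(parse_offset("reserve-2-8"))  # (2, 8)
```

Rejecting anything outside the list mirrors the requirement that the value be given in the exact key format.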
Configuration of Classic Engines
- You should only use classic engines if the new ones introduced in Dremio 26.0 are not appropriate for your use case. Classic and new engines are not intended to be used side by side.
- Classic engines will not auto-start/auto-stop, which is only possible with the new engines.
The classic way of configuring engines is still supported, and you can add this snippet to values-overrides.yaml as part of the deployment. Note that this snippet is a configuration example, and you should adjust the values to your own case.
executor:
  resources:
    requests:
      cpu: "16"
      memory: "120Gi"
    limits:
      memory: "120Gi"
  engines: ["default"]
  count: 3
  volumeSize: 128Gi
  cloudCache:
    enabled: true
    volumes:
      - size: 128Gi
  ...
Telemetry
Telemetry egress is enabled by default. These metrics provide visibility into various components and services, ensuring optimal performance and reliability. To disable egress, add the following to your values-overrides.yaml:
telemetry:
  enabled: false
  ...
Disabling Parts of the Deployment
You can disable some components of the Dremio platform if their functionality does not pertain to your use case. Dremio's remaining functionality will continue to work if any of the components described in this section are disabled.
Semantic Search
To disable Semantic Search, add this configuration under the parent as shown in the following example:
opensearch:
  enabled: false
  replicas: 0
Additional Configuration
Dremio has several configuration and binary files to define the behavior for enabling authentication via an identity provider, logging, connecting to Hive, etc. During the deployment, these files are combined and used to create a Kubernetes ConfigMap. This ConfigMap is, in turn, used by the Dremio deployment as the source of truth for various settings. The options described in this section let you embed these files in the values-overrides.yaml file.
To inspect Dremio's configuration files or perform a more complex operation not shown here, see Downloading Dremio's Helm Charts.
Additional Config Files
Use the configFiles option to add configuration files into your Dremio deployment. You can add multiple files, each as a key-value pair: the key is the file name and the value is the file content. These can be TXT, XML, or JSON files. For example, here is how to embed the configuration for HashiCorp Vault, followed by a separate example file:
dremio:
  configFiles:
    vault_config.json: |
      {
        "vaultUrl": "https://my-vault.com",
        "namespace": "optional/dremio/global/vault/namespace",
        "auth": {
          "kubernetes": {
            "vaultRole": "dremio-vault-role",
            "serviceAccountJwt": "file:///optional/custom/path/to/serviceAccount/jwt",
            "loginMountPath": "optional/custom/kubernetes/login/path"
          }
        }
      }
    another_config.json: |
      {
        "key in this file": "content of this key"
      }
  ...
Additional Config Variables
Use the dremioConfExtraOptions option to add new variables to your Dremio deployment. For example, here is how to enable TLS between executors and coordinators, leveraging auto-generated self-signed certificates.
dremio:
  dremioConfExtraOptions:
    "services.fabric.ssl.enabled": true
    "services.fabric.ssl.auto-certificate.enabled": true
  ...
Additional Config Binary Files
Use the configBinaries option to provide binary configuration files (encoded as base64), for example, a JKS file for a custom truststore. The key is the file name, and the value is the file content. Add this configuration under the parents as shown in the following example:
dremio:
  configBinaries:
    custom-truststore.jks: "base64EncodedBinaryContent"
  ...
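Producing the base64 value by hand is error-prone, so it can help to script it. The sketch below uses only the Python standard library to read a binary file and print a line in the key-value shape used under configBinaries. The helper name config_binaries_entry is hypothetical, and the demo writes a throwaway stand-in file rather than a real JKS truststore.

```python
import base64
import tempfile
from pathlib import Path


def config_binaries_entry(path: Path) -> str:
    """Return a 'file-name: "base64"' line for the configBinaries mapping."""
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f'{path.name}: "{encoded}"'


# Demo with a throwaway file; point the helper at your real binary instead.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "custom-truststore.jks"
    sample.write_bytes(b"\x00\x01demo")  # stand-in bytes, not a real JKS
    entry = config_binaries_entry(sample)
    print(entry)
```

The printed line is meant to be pasted, with matching indentation, under the configBinaries key.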
Additional Advanced Configs
Use the advancedConfigs option to enable advanced configurations and their details. Add this configuration under the parent as shown in the following example, which provides a password for a custom truststore that has one:
dremio:
  advancedConfigs:
    trustStore:
      enabled: true
      password: "<my-truststore-pass>"
Hive
Use the hive2ConfigFiles option to configure Hive 2. Add this configuration under the parents as shown in the following example:
dremio:
  hive2ConfigFiles:
    hive-site.xml: |
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <configuration>
        <property>
          <name>hive.metastore.uris</name>
          <value>thrift://hive-metastore:9083</value>
        </property>
      </configuration>
  ...
Use the hive3ConfigFiles option to configure Hive 3. Add this configuration under the parents as shown in the following example:
dremio:
  hive3ConfigFiles:
    hive-site.xml: |
      <?xml version="1.0" encoding="UTF-8"?>
      <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
      <configuration>
        <property>
          <name>hive.metastore.uris</name>
          <value>thrift://hive3-metastore:9083</value>
        </property>
      </configuration>
  ...
References
Recommended Resources Configuration
The tables in this section contain the recommended values for resource requests and volume size to configure Dremio components. In the values-overrides.yaml file, set the following values:
resources:
  requests:
    memory: # Put here the value from the Memory column.
    cpu: # Put here the value from the CPU column.
volumeSize: # Put here the value from the Volume Size column, if any.
Dremio recommends the following configuration values:
- Production Configuration
- Minimal Configuration
Dremio recommends the following configuration values to operate in a production environment:
Dremio Component | Memory | CPU | Volume Size | Pod Count |
---|---|---|---|---|
Coordinator | 64Gi | 32 | 512Gi | 1 |
Catalog Server | 8Gi | 4 | - | 1 |
Catalog Server (External) | 8Gi | 4 | - | 1 |
Catalog Service Server | 8Gi | 4 | - | 1 |
Engine Operator | 1Gi | 1 | - | 1 |
OpenSearch | 16Gi | 2 | 100Gi | 3 |
MongoDB | 4Gi | 8 | 512Gi [1] | 3 |
NATS | 1Gi | 700m | - | 3 |
ZooKeeper | 1Gi | 500m | - | 3 |
Open Telemetry | 1Gi | 1 | - | 1 |
M Engine | 120Gi | 16 | 521Gi | 4 |
[1] You can use a smaller volume size if you do not heavily use Iceberg.
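For capacity planning, it can be useful to total the per-pod requests in the production table across pod counts. The following sketch does that arithmetic with the Memory, CPU, and Pod Count values copied from the table (CPU expressed in millicores so that 700m and 500m stay integers). The names PRODUCTION and totals are illustrative, and the result covers requests only: it excludes volumes and any overhead from Kubernetes system components.

```python
# (component, memory in Gi, CPU in millicores, pod count) from the production table.
PRODUCTION = [
    ("Coordinator", 64, 32000, 1),
    ("Catalog Server", 8, 4000, 1),
    ("Catalog Server (External)", 8, 4000, 1),
    ("Catalog Service Server", 8, 4000, 1),
    ("Engine Operator", 1, 1000, 1),
    ("OpenSearch", 16, 2000, 3),
    ("MongoDB", 4, 8000, 3),
    ("NATS", 1, 700, 3),
    ("ZooKeeper", 1, 500, 3),
    ("Open Telemetry", 1, 1000, 1),
    ("M Engine", 120, 16000, 4),
]


def totals(rows):
    """Sum memory (Gi) and CPU (millicores), multiplying each row by its pod count."""
    memory_gi = sum(mem * pods for _, mem, _, pods in rows)
    cpu_millicores = sum(cpu * pods for _, _, cpu, pods in rows)
    return memory_gi, cpu_millicores


memory_gi, cpu_millicores = totals(PRODUCTION)
print(f"{memory_gi}Gi memory, {cpu_millicores / 1000} vCPU")
# 636Gi memory, 143.6 vCPU
```

Most of the total comes from the four M Engine pods (480Gi and 64 vCPU), so the figure shifts substantially with engine size and count.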
The following configuration will deploy a functional Dremio Platform and is appropriate for a single user to evaluate Dremio's various features. For any multi-user and performance-oriented evaluation, Dremio recommends the Production Configuration.
Dremio Component | Memory | CPU | Volume Size | Pod Count |
---|---|---|---|---|
Coordinator | 8Gi | 2 | 20Gi | 1 |
Catalog Server | 1Gi | 1 | - | 1 |
Catalog Server (External) | 1Gi | 1 | - | 1 |
Catalog Service Server | 1Gi | 1 | - | 1 |
Engine Operator | 1Gi | 1 | - | 1 |
OpenSearch | 1Gi | 1500m | 4Gi | 3 |
MongoDB | 1Gi | 1 | 4Gi | 3 |
NATS | 1Gi | 700m | - | 3 |
ZooKeeper | 1Gi | 500m | - | 1 |
Open Telemetry | 1Gi | 1 | - | 1 |
XS Engine | 8Gi | 2 | 20Gi | 1 |
The following YAML snippets show resource settings for the Dremio platform components:
Dremio Platform Resource Configuration YAML
coordinator:
  resources:
    requests:
      cpu: "32"
      memory: "64Gi"
    limits:
      memory: "64Gi"
  volumeSize: "512Gi"
catalog:
  requests:
    cpu: "4"
    memory: "8Gi"
  limits:
    cpu: "4"
    memory: "8Gi"
catalogservices:
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
opensearch:
  resources:
    requests:
      memory: "16Gi"
      cpu: "2"
    limits:
      memory: "16Gi"
      cpu: "2"
mongodb:
  resources:
    requests:
      cpu: "2"
      memory: "2Gi"
    limits:
      cpu: "4"
      memory: "2Gi"
  storage:
    resources:
      requests:
        storage: "512Gi"
nats:
  resources:
    requests:
      cpu: "500m"
      memory: "1024Mi"
    limits:
      cpu: "750m"
      memory: "1536Mi"
zookeeper:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      memory: "1Gi"
  volumeSize: "10Gi"
telemetry:
  resources:
    requests:
      cpu: "1"
      memory: "1Gi"
    limits:
      cpu: "2"
      memory: "2Gi"