Version: current [26.x]

Administer Dremio on Kubernetes

This section includes topics about administering Dremio on supported Kubernetes environments, including monitoring logs, scaling pods, changing configurations, and performing basic administrative tasks such as backing up, restoring, cleaning, and upgrading Dremio.

Monitoring Logs and Usage

Monitoring the cluster's resource usage (e.g., heap and direct memory, CPU, disk I/O, etc.) is crucial to maintaining long-term stability as the system scales. For this reason, Dremio recommends setting up a monitoring stack, such as Prometheus and Grafana. For a detailed setup tutorial and an overview of which metrics to track, see Dremio Monitoring in Kubernetes. For more information, see this PDF guide on the Dremio Enterprise Edition (Software) Shared Responsibility Model.

Managing Workloads

Most workloads can be handled with a Large (8 executors) or X-Large (12 executors) engine, each with 32 CPUs per executor. Larger engine sizes may be required for certain workloads. Over-parallelization of queries can cause performance degradation. Thus, packing workloads of all shapes or sizes onto a few very large engines is ill-advised. Workloads should be divided into high-cost and low-cost queries, and dedicated queues should be configured for tasks such as Reflections, metadata refresh, and table optimization jobs. These can then be divided between right-sized engines. For more information, see Dremio's Well-Architected Framework.

Changing Your Configuration

If you need to update your configuration, you can do so after the installation by editing the configuration files and then upgrading using an upgrade command, for example:

helm upgrade <chart-release-name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version>

The upgrade process pushes your changes to all pods in your Kubernetes cluster and restarts the pods.

For example, to permanently change the resources of your coordinator pod:

Edit the values-overrides.yaml file and change the resources specified for the coordinator. In this example, memory is 32Gi and cpu is 8.

coordinator:
    resources:
      limits:
        memory: 32Gi
      requests:
        cpu: 8
        memory: 32Gi

Run the upgrade command, replacing the template values:
```
helm upgrade <chart-release-name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version>
```
note
If the command takes longer than a few minutes to finish, check the status of the pods with the kubectl get pods command. If the pods are pending scheduling due to limited memory or CPU, adjust the values you specified for the properties in the values-overrides.yaml file or add more resources to your Kubernetes cluster.
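
The check described in this note can be sketched as two small shell helpers. This is a minimal sketch, not an official Dremio tool: the function names `pending_pods` and `describe_pending_pods` are hypothetical, and the namespace is whatever namespace you deployed Dremio into. It relies only on `kubectl get pods` with a field selector and on `kubectl describe`.

```shell
# List pods that are stuck in the Pending phase in the given namespace.
# The function names and the namespace argument are illustrative assumptions.
pending_pods() {
  kubectl get pods --namespace "$1" --field-selector=status.phase=Pending -o name
}

# Print details for every pending pod. The Events section of the output
# typically shows whether scheduling failed for lack of CPU or memory.
describe_pending_pods() {
  for pod in $(pending_pods "$1"); do
    kubectl describe --namespace "$1" "$pod"
  done
}
```

For example, `describe_pending_pods <dremio-install-namespace>` prints the scheduling events for each pod that has not yet been placed on a node.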

Using Support Keys

Use support keys only when instructed by Dremio Support. If misused, they can alter the application's behavior and lead to unexpected failures.

Using the Dremio Admin CLI on Kubernetes

The Dremio Admin CLI is the mechanism for backing up, restoring, adding internal users, and similar administrative tasks. For more information on the various commands, see the Dremio Admin CLI reference. To run the CLI commands, you need to access either the dremio-master-0 or dremio-admin pod. This requires the kubectl command line tool and access to the Kubernetes cluster and namespace where Dremio is deployed.

note

The term master is a legacy label used in this command. We now refer to this as the main coordinator pod.

Some CLI commands like Backup require Dremio to be online. This means Dremio must be deployed normally per Deploying Dremio to Kubernetes. When inspecting Dremio's pods, dremio-master-0 must be present and RUNNING to be considered online.
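
One way to check the online condition before running such a command is sketched below. The helper names are hypothetical and the namespace is an argument you supply. Note that the pod phase `Running` reported by Kubernetes only means the containers have started; it is used here as a rough stand-in for the RUNNING status shown by `kubectl get pods`.

```shell
# Print the phase of the main coordinator pod (for example, Running or Pending).
# The namespace is passed as the first argument.
main_coordinator_phase() {
  kubectl get pod dremio-master-0 --namespace "$1" -o jsonpath='{.status.phase}'
}

# Succeed (exit status 0) only when the main coordinator pod reports Running.
dremio_is_online() {
  [ "$(main_coordinator_phase "$1")" = "Running" ]
}
```

A script could then guard an online-only command with `dremio_is_online <dremio-install-namespace> || exit 1`.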

Some CLI commands like Clean require Dremio to be offline. To use them, Dremio must be deployed and running in admin mode. If not, you must redeploy Dremio in admin mode. The requirements section for each command will note whether Dremio should be online or offline. If it is not mentioned, then the command will work in either case.

To redeploy Dremio in admin mode, you must run a helm upgrade command where the DremioAdmin flag is set to true. Here is a templated example command:

helm upgrade <chart-release-name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version> --set DremioAdmin=true

This command shuts down the Coordinators and Executors and starts the dremio-admin pod in their place. Crucially, this pod mounts the dremio-master-0 volume, which allows operations on the KV store it contains.

To get command line access to the dremio-master-0, dremio-admin, or any pod for that matter, you would use the kubectl exec command. Here is an example using the -it option for interactive, and the -- bash option to enter a bash session:

kubectl exec -it <pod-name> -- bash

Once you've entered the pod, you can run typical shell commands to explore the file system and execute commands. For more information, see kubectl exec. The dremio-admin utility is within the /opt/dremio/bin directory of both the main and admin pods and can be used to execute the various Dremio Admin CLI commands.

To exit Dremio admin mode and restart the normal service, you must redeploy Dremio again using the command above and setting only DremioAdmin=false.
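
The round trip into and out of admin mode can be sketched as follows. This is an illustration under stated assumptions rather than a supported script: the function names `set_admin_mode` and `run_admin_cli` are hypothetical, and the sketch assumes the admin pod is named dremio-admin and the utility lives in /opt/dremio/bin, as described above. The Admin CLI subcommand and its options are passed through unchanged, so consult the CLI reference for what to supply.

```shell
# Redeploy Dremio with admin mode switched on or off.
# Arguments: chart release name, path to values-overrides.yaml,
# Helm chart version, and "true" or "false" for DremioAdmin.
set_admin_mode() {
  helm upgrade "$1" oci://quay.io/dremio/dremio-helm \
    -f "$2" --version "$3" --set DremioAdmin="$4"
}

# Run a Dremio Admin CLI command inside the dremio-admin pod.
# The first argument is the namespace; everything after it is handed
# to dremio-admin as-is.
run_admin_cli() {
  namespace="$1"
  shift
  kubectl exec --namespace "$namespace" dremio-admin -- /opt/dremio/bin/dremio-admin "$@"
}
```

A typical sequence would be `set_admin_mode <chart-release-name> <your-local-path>/values-overrides.yaml <helm-chart-version> true`, then `run_admin_cli <dremio-install-namespace> clean` with whatever options the command requires, and finally the same `set_admin_mode` call with `false` to restart the normal service.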

Upgrading Dremio

note

This section assumes you're running Dremio 26.0+.

To upgrade the Dremio platform, update the Helm chart version to the release that tags the version of Dremio you want to upgrade to, typically the most recent. The Dremio release notes provide the corresponding Helm chart version. Some Dremio Helm chart releases do not upgrade Dremio but update another component. However, every Dremio release has a Helm chart release in which the image tags for the various services are updated. Enterprise customers can view a list of all Helm charts and image tags on quay.io.

During the upgrade process, existing pods are terminated and new pods are created with the new images. After all the newly created pods are restarted and running, your Dremio cluster is upgraded.

tip

If you do not know your Helm chart release name, use helm list to list the Helm deployments in a selected namespace.

To upgrade Dremio:

Ensure that your Dremio is backed up. For more information, see Backup.
Ensure that no queries are running on the cluster, as any running queries will fail when services start terminating.

Construct the appropriate helm upgrade command, for example:

helm upgrade <chart-release-name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <new-desired-version>

Execute the helm upgrade command.
Pods will begin restarting with the new images and, once finished, Dremio will be accessible.
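
The lookup and wait around the upgrade steps above can be sketched with two helpers. The function names are hypothetical, the 15-minute timeout is an arbitrary example value, and the StatefulSet name `dremio-master` is an assumption inferred from the pod name dremio-master-0; confirm it with `kubectl get statefulsets` in your namespace.

```shell
# Show the Helm releases in a namespace, to find the chart release name.
list_releases() {
  helm list --namespace "$1"
}

# Block until the main coordinator StatefulSet has finished rolling out,
# or fail after the timeout. The StatefulSet name is an assumption.
wait_for_main_coordinator() {
  kubectl rollout status statefulset/dremio-master --namespace "$1" --timeout=15m
}
```

Running `wait_for_main_coordinator <dremio-install-namespace>` after `helm upgrade` returns gives a scriptable signal that the main coordinator pod has been replaced and is ready.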

note

The job results cleanup optimization uses a secondary index to optimize the results cleanup, which implies a one-time reindexing of the jobs table during the upgrade.

The reindexing duration depends on the total number of jobs stored in the KV store. In environments with a large volume of jobs, this can increase the overall upgrade time.

Upgrading to Dremio 26.0+

note

This section assumes you're running Dremio 24 or 25, and are trying to upgrade to Dremio 26.0+.

For Enterprise customers, Dremio 26 introduced the v3 Helm charts. The former v2 Helm charts, distributed via the dremio-cloud-tool GitHub repository and used for Dremio versions 24 and 25, are not compatible with version 26.

It is possible to upgrade an existing deployment. However, Enterprise customers need to migrate from the v2 Helm charts to the v3 Helm charts before any upgrade can take place. The v3 Helm charts are distributed via our image repository Quay.io.
Customers must move the relevant content from their existing values.yaml (and any other deployment-specific configurations, such as Identity Provider authentication) into the new values-overrides.yaml configuration file, as detailed in Configuring your Values.
Some configurations can be left behind. For example, the new UI experience has superseded the executor configuration in the charts. For more information, see Managing Engines in Kubernetes.

warning

Skip the next paragraph if you did not use the Executor HPA and node life cycle policy.

Before upgrading to Dremio 26.0+, if you intend to continue using Classic Engines, disable the node life cycle policy, which is no longer supported. To check for this option, look at the executor section of your old Helm chart's values.yaml and see whether node_lifecycle_service_enabled: true is set. If it is set to true, change it to false and redeploy Dremio. If it is not present, that is the same as false. If, after the upgrade, the Executors of a Classic Engine are still marked as paused on the node activity panel, you can resolve this with a call to Dremio's Blacklist API. See Allowing all Nodes.
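
A quick textual check for this setting might look like the sketch below. The function name `node_lifecycle_enabled` is hypothetical. The check is deliberately simple: it searches the whole file for the key set to true and does not verify that the key sits under the executor section, so inspect the file by hand if the result is surprising.

```shell
# Succeed (exit status 0) when the given values file sets
# node_lifecycle_service_enabled to true. A missing key or a value of
# false both result in a non-zero exit status.
node_lifecycle_enabled() {
  grep -Eq '^[[:space:]]*node_lifecycle_service_enabled:[[:space:]]*true([[:space:]]|#|$)' "$1"
}
```

For example, `node_lifecycle_enabled <your-local-path>/values.yaml && echo "disable before upgrading"` prints the reminder only when the policy is still enabled.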

Once the new values-overrides.yaml and other deployment configurations are prepared, you can proceed with the upgrade.

For help with this process, please reach out to Dremio Support and your Account Executive. More detailed guides and help from Dremio's professional services team can be provided.

To upgrade Dremio:

Ensure you have created a new values-overrides.yaml configuration file with the relevant values from your existing deployment ported over, as described in Configuring your Values.
Ensure that your Dremio is backed up. For more information, see Backup.
Ensure that no queries are running on the cluster, as any running queries will fail when services start terminating.
Uninstall your existing Dremio deployment:
```
helm uninstall <chart-release-name>
```
This will delete existing pods and remove other elements of the existing Dremio deployment. Crucially, it will not delete the dremio-master-0 volume, which contains the KV store and Dremio's state.

note
The term master is a legacy label. We now refer to this as the main coordinator pod.
Confirm that the dremio-master-0 volume still exists in the namespace where you want to reinstall Dremio. You can confirm this with:
```
kubectl get pvc --namespace <dremio-install-namespace>
```
Each of your executors should have left behind two volumes, but the main coordinator should have left only one.
We're now ready to install Dremio 26.0+. Follow the instructions in Deploying Dremio to Kubernetes to complete the installation.
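
Before reinstalling, the volume check from the steps above can be scripted so that an automated migration stops early if the main coordinator's volume is missing. This is a sketch: the function name is hypothetical, and it assumes the PersistentVolumeClaim name contains the string `dremio-master-0`, which you should confirm against the actual `kubectl get pvc` output for your deployment.

```shell
# Succeed (exit status 0) when a PersistentVolumeClaim whose name contains
# "dremio-master-0" exists in the given namespace.
main_coordinator_volume_exists() {
  kubectl get pvc --namespace "$1" -o name | grep -q 'dremio-master-0'
}
```

A migration script could call `main_coordinator_volume_exists <dremio-install-namespace> || exit 1` between the uninstall and the new installation.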

warning

Dremio must be deployed to the same location as the previous version to mount the dremio-master-0 volume. It's the content of this volume that is being upgraded.

note

The job results cleanup optimization uses a secondary index to optimize the results cleanup, which implies a one-time reindexing of the jobs table during the upgrade.

The reindexing duration depends on the total number of jobs stored in the KV store. In environments with a large volume of jobs, this can increase the overall upgrade time.
