Version: current [25.x]

Administering Dremio on AKS

This topic discusses administration activities such as monitoring logs; scaling pods; changing configurations; performing basic administrative tasks such as backing up, restoring, and cleaning; and upgrading Dremio.

The example commands listed below assume that the current command line location is within the latest Dremio Helm chart, dremio-cloud-tools/charts/dremio_v2, on the client machine that interacts with Kubernetes.

note

You must maintain any changes you make to the Helm values or configuration files in the dremio-cloud-tools/charts/dremio_v2 directory in your local copy of dremio-cloud-tools.

Monitoring Logs and Usage

Monitoring the cluster's resource usage (heap and direct memory, CPU, disk I/O, and so on) is crucial to maintaining long-term stability as the system scales. For this reason, setting up a monitoring stack such as Prometheus and Grafana is highly recommended. For a detailed setup tutorial and an overview of which metrics to track, see Dremio Monitoring in Kubernetes. For more information, see this PDF guide on the Dremio Shared Responsibility Model.

For information about monitoring logs, see Logs. You can retrieve logs from the Dremio console or directly from Kubernetes. You can also write logs to a file on disk in addition to stdout. See Writing Logs to a File for details.
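
As a minimal sketch of retrieving logs directly from Kubernetes, the helper below wraps the standard kubectl logs command. The pod name dremio-master-0 is the master-coordinator pod name used later in this topic; the function name dremio_logs and the default of 100 lines are illustrative assumptions, not part of the Dremio Helm chart.

```shell
# Print the most recent log lines that a Dremio pod wrote to stdout.
# dremio_logs is a hypothetical helper name; kubectl logs and --tail are standard kubectl.
dremio_logs() {
  pod="${1:-dremio-master-0}"   # default: the master-coordinator pod
  lines="${2:-100}"             # assumed default; adjust as needed
  kubectl logs "$pod" --tail="$lines"
}
```

For example, dremio_logs dremio-master-0 500 prints the last 500 lines. Adding -f to the kubectl logs call streams new lines as they are written.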

Managing Workloads

Limit engine sizes to a maximum of 10 executor pods (with 32 CPUs and 128 GB of memory) to prevent over-parallelization of queries. Workloads should be split into high-cost and low-cost queries, and dedicated queues should be configured for reflections, metadata refresh, and table optimization jobs. For more information, see Dremio's Well-Architected Framework.

Scaling Up or Down

Scaling up or down refers to increasing or decreasing the number of Dremio pods (executors or scale-out coordinators). All scaling values remain in effect until you run another helm upgrade command.

warning

Scaling up and down the master-coordinator is not supported. Do not update the master-coordinator pod count from the default value, 1, as there must be exactly one master-coordinator in the Dremio cluster to maintain stability and ensure connectivity. Scaling to 0 effectively terminates the Dremio cluster.

note

Scaling down the number of executor pods, whether temporarily or permanently, may cancel queries if you are not using Dremio Enterprise Edition 24.3 or above with autoscaling.

  1. Run the helm list command to retrieve the chart release name. In the example below, the chart release name is plundering-alpaca.

    Get chart release name
    helm list
    NAME               REVISION  UPDATED                   STATUS    CHART         NAMESPACE
    plundering-alpaca  1         Wed Jul 18 09:36:14 2018  DEPLOYED  dremio-0.0.5  default
  2. Run the helm upgrade --wait <chart_release_name> . --set <dremio pod=value> command. Replace <chart_release_name> with your chart release name. For example, Dremio executor pods could be scaled up or down with the following command, which changes the pod count to 5:

    Helm upgrade command example
    helm upgrade --wait <chart_release_name> . --set executor.count=5

Resetting to Defaults

Scaling values that you pass with --set remain in effect only until the next helm upgrade command. When you run helm upgrade again (whether for scaling, changing your configuration, or upgrading), any value that you do not set in that command resets to the default specified in the values.yaml file.

For example, if you scale up the secondary coordinators to 3 and then scale up the executors to 5, the secondary-coordinator count is reset to 0 (the default) when the executors are scaled up to 5.

note

Scaling all of the Dremio pods down to 0 effectively shuts down the Dremio cluster.

To permanently change your default values, update the values.yaml file. See Changing Your Configuration for more information.
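
One way to keep several temporary overrides at once is to pass them all in the same helm upgrade command, so that none of them falls back to its default. The sketch below assumes that coordinator.count is the values.yaml key for scale-out coordinators; confirm the key names in your copy of values.yaml before using it. The function name scale_dremio is hypothetical.

```shell
# Scale executors and scale-out coordinators in a single helm upgrade.
# Assumption: coordinator.count is the chart key for scale-out coordinators.
scale_dremio() {
  release="$1"        # chart release name, as shown by helm list
  executors="$2"      # desired number of executor pods
  coordinators="$3"   # desired number of scale-out coordinator pods
  helm upgrade --wait "$release" . \
    --set executor.count="$executors" \
    --set coordinator.count="$coordinators"
}
```

For example, scale_dremio plundering-alpaca 5 3 requests 5 executors and 3 scale-out coordinators together. Running helm get values <chart_release_name> afterward lists the overrides currently applied to the release.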

Changing Your Configuration

If you need to update your configuration after installation, edit the configuration files and then run the helm upgrade <chart_release_name> . command. The upgrade process pushes your changes to all of the pods in your Kubernetes cluster and restarts the pods.

For example, to permanently change the number of Dremio executor pods:

  1. Edit the values.yaml file and change the number of executor pods specified for the executor.count property. In this example, executor.count is 5. The other executor defaults remain unchanged.

    Example executor property values
    executor:
      memory: 16384
      cpu: 4
      count: 5
      volumeSize: 20Gi
  2. Run the upgrade command. Replace <chart_release_name> with your chart release name:

    Helm upgrade command example
    helm upgrade --wait <chart_release_name> .
    note

    If the command takes longer than a few minutes to finish, check the status of the pods with the kubectl get pods command. If the pods are pending scheduling due to limited memory or CPU, adjust the values you specified for the properties in the values.yaml file or add more resources to your Kubernetes cluster.
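
The pod check described in the note above can be sketched as two small helpers. Both use standard kubectl commands; the function names pending_pods and why_pending are hypothetical.

```shell
# List only the pods that are still in the Pending phase.
pending_pods() {
  kubectl get pods --field-selector=status.phase=Pending
}

# Describe one pod; the Events section at the end of the output
# typically reports why scheduling failed (for example, insufficient CPU or memory).
why_pending() {
  kubectl describe pod "$1"
}
```

Run pending_pods first, then pass one of the listed pod names to why_pending.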

Using Support Keys

Support keys should only be used when instructed by Dremio Support, as they can alter the application's behavior and lead to unexpected failures if misused.

Backing Up the KV Store

Dremio stores important metadata in a metastore, referred to as the KV store, which is local to the master-coordinator node. Regular backups of the KV store are highly recommended. In Dremio 25.1.0 and later, these backups can be automated or scheduled as a cron job. To test the backup and restore process, perform a full cluster restore every 6 to 12 months.

Dremio Admin Commands

You can run the Dremio administration commands listed in the table below on the Dremio Kubernetes cluster. Dremio must be shut down and offline to run all commands except the Dremio backup command.

| Command | Offline/Online | Notes |
| --- | --- | --- |
| backup | online | /opt/dremio/bin/dremio-admin backup. See Backup Dremio for more information. |
| clean | offline | /opt/dremio/bin/dremio-admin clean. See Metadata Cleanup for more information. |
| restore | offline | /opt/dremio/bin/dremio-admin restore. See Restore Dremio for more information. |
| set-password | offline | /opt/dremio/bin/dremio-admin set-password. See Reset Password for more information. |

Backup

Run the backup command on the master-coordinator pod from a bash shell. Dremio must be online to run the backup command.

To run the backup command:

  1. Connect to the master-coordinator pod using the exec command.

    Connect to master coordinator pod
    kubectl exec -it dremio-master-0 -- bash
  2. Run the command from the bash shell. See Backup Dremio for more information.

    Run bash shell command
    /opt/dremio/bin/dremio-admin backup \
      -u <DREMIO_ADMIN_USER> \
      -p <DREMIO_ADMIN_PASS> \
      -d <BACKUP_PATH>
  3. Store the backup files in a persistent volume or copy the files from the local pod.
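
The steps above can also be run without an interactive shell, which may be convenient for scheduled backups. The following is a sketch under stated assumptions: backup_and_fetch is a hypothetical helper, all arguments are placeholders, and kubectl cp only works if the tar binary is available in the container image. Passing the password as an argument can expose it in process listings, so treat this as an illustration rather than a hardened script.

```shell
# Run an online backup on the master-coordinator pod, then copy the
# backup directory from the pod to the local machine.
backup_and_fetch() {
  user="$1"; pass="$2"       # Dremio admin credentials (placeholders)
  backup_path="$3"           # backup directory inside the pod
  local_dir="$4"             # destination directory on the client machine
  kubectl exec dremio-master-0 -- /opt/dremio/bin/dremio-admin backup \
    -u "$user" -p "$pass" -d "$backup_path" &&
  kubectl cp "dremio-master-0:$backup_path" "$local_dir"
}
```

The copy runs only if the backup command succeeds.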

Clean, Restore, and Set-Password

To run the clean, restore, and set-password commands, Dremio must be offline.

note

To temporarily shut down Dremio, delete the Dremio helm release or enable the DremioAdmin pod.

To run these offline commands, create a Dremio Admin pod with the Dremio image and mount the master-coordinator pod's persistent volume:

  1. Run the following command to create a Dremio Admin pod to run the dremio-admin commands. Replace <chart_release_name> with your chart release name:

    Create Dremio Admin pod
    helm upgrade --wait <chart_release_name> . --set DremioAdmin=true
  2. Run the dremio-admin commands from the bash shell on the Dremio Admin pod. See Advanced Administration for more information about each command. The following commands connect you to the pod and allow you to perform the offline command:

    Connect to pod and run offline command
    kubectl exec -it dremio-admin -- bash
    bin/dremio-admin <offline command>
  3. Run helm upgrade to disable the DremioAdmin pod. Replace <chart_release_name> with your chart release name.

    Delete pod
    helm upgrade --wait <chart_release_name> . --set DremioAdmin=false
  4. Restart your Dremio cluster.
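
Steps 1 through 3 can be sketched as a single helper that enables the Dremio Admin pod, runs one offline command, and disables the pod again. The function name run_offline_admin is hypothetical, and the sketch runs the command non-interactively, so a command that prompts for input still needs kubectl exec -it as shown in step 2.

```shell
# Enable the Dremio Admin pod, run one offline dremio-admin command,
# then disable the pod again.
run_offline_admin() {
  release="$1"; shift   # remaining arguments are passed to dremio-admin
  helm upgrade --wait "$release" . --set DremioAdmin=true || return 1
  status=0
  kubectl exec dremio-admin -- /opt/dremio/bin/dremio-admin "$@" || status=$?
  # Disable the admin pod even if the command failed.
  helm upgrade --wait "$release" . --set DremioAdmin=false
  return "$status"
}
```

For example, run_offline_admin plundering-alpaca clean runs the clean command. The helper returns the exit status of the dremio-admin command; restarting the Dremio cluster (step 4) remains a separate step.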

Upgrading Dremio

To upgrade Dremio, update the image value in the values.yaml file to the new Dremio version and run the helm upgrade command.

During the upgrade process, existing pods are terminated and new pods are created with the new image. After all of the newly created pods are restarted and running, your Dremio cluster is upgraded.

To upgrade Dremio:

  1. Ensure that your Dremio+Kubernetes cluster is backed up. See Backup for more information.

  2. Ensure that no queries are running on the cluster.

  3. Update the Dremio image tag in your values.yaml file. For example, to change the Dremio CE image:

    Change Dremio CE image
    image: dremio/dremio-oss
    imageTag: 11.0.0
    ...
    note

    If you are changing the Dremio Enterprise Edition image, you do not need to change the imagePullSecrets property.

  4. Run the helm list command to retrieve the chart release name. In the example below, the chart release name is plundering-alpaca.

    Get chart release name
    helm list
    NAME               REVISION  UPDATED                   STATUS    CHART         NAMESPACE
    plundering-alpaca  1         Wed Jul 18 09:36:14 2018  DEPLOYED  dremio-0.0.5  default
  5. Run helm upgrade --wait <chart_release_name> . to upgrade the deployment. Replace <chart_release_name> with your chart release name.

    note

    The pods are restarted automatically after upgrading. If it takes longer than a couple of minutes to restart, check the status of the pods with the kubectl get pods command. If the pods are pending scheduling due to limited memory or CPU, adjust the values you specified for the properties in the values.yaml file (see Changing your Configuration) or add more resources to your Kubernetes cluster.
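
After the upgrade finishes, one way to confirm that the pods are running the new image is to print each pod's image. This sketch uses the standard kubectl JSONPath output format; the function name pod_images is hypothetical, and it reports only the first container of each pod.

```shell
# Print one line per pod: the pod name, a tab, and the image of its first container.
pod_images() {
  kubectl get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'
}
```

Each Dremio pod should show the image and tag that you set in the values.yaml file.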