Administer Dremio on Kubernetes
This section includes topics about administering Dremio on supported Kubernetes environments, including information about monitoring logs, scaling pods, changing configurations, performing basic administrative tasks such as backing up, restoring, cleaning, and upgrading Dremio.
Monitoring Logs and Usage
Monitoring the cluster's resource usage (e.g., heap and direct memory, CPU, disk I/O, etc.) is crucial to maintaining long-term stability as the system scales. For this reason, Dremio recommends setting up a monitoring stack, such as Prometheus and Grafana. For a detailed setup tutorial and an overview of which metrics to track, see Dremio Monitoring in Kubernetes. For more information, see this PDF guide on the Dremio Enterprise Edition (Software) Shared Responsibility Model.
Managing Workloads
Most workloads can be handled with a Large (8 executors) or X-Large (12 executors) engine, each with 32 CPUs per executor. Larger engine sizes may be required for certain workloads. Over-parallelization of queries can cause performance degradation. Thus, packing workloads of all shapes or sizes onto a few very large engines is ill-advised. Workloads should be divided into high-cost and low-cost queries, and dedicated queues should be configured for tasks such as Reflections, metadata refresh, and table optimization jobs. These can then be divided between right-sized engines. For more information, see Dremio's Well-Architected Framework.
Changing Your Configuration
If you need to update your configuration, you can do so after the installation by editing the configuration files and then upgrading using an upgrade command, for example:
helm upgrade <chart release name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version>
The upgrade process pushes your changes to all pods in your Kubernetes cluster and restarts the pods.
For example, to permanently change the resources of your coordinator pod:
-
Edit the
values-overrides.yaml
file and change the resources specified for the coordinator. In this example,memory
is32Gi
andcpu
is8
.coordinator:
resources:
limits:
memory: 32Gi
requests:
cpu: 8
memory: 32Gi -
Run the upgrade command. Replacing the template values:
helm upgrade <chart release name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version>
noteIf the command takes longer than a few minutes to finish, check the status of the pods with the
kubectl get pods
command. If the pods are pending scheduling due to limited memory or CPU, adjust the values you specified for the properties in thevalues-overrides.yaml
file or add more resources to your Kubernetes cluster.
Using Support Keys
Use support keys only when instructed by Dremio Support. If misused, they can alter the application's behavior and lead to unexpected failures.
Using the Dremio Admin CLI on Kubernetes
The Dremio Admin CLI is the mechanism to back up, restore, add internal users, etc. For more information on the various commands the see CLI reference previously linked. In order to run the CLI commands you need to access either the dremio-master-0
or dremio-admin
pod. This requires the use of the kubectl
command line tool and access to the Kubernetes cluster and namespace where Dremio is deployed.
The term master
is a legacy label used in this command. We now refer to this as the main coordinator pod.
Some CLI commands like Backup require Dremio to be online. This means Dremio must be deployed normally per Deploying Dremio to Kubernetes. When inspecting Dremio's pods, dremio-master-0
must be present and RUNNING
to be considered online.
Some CLI commands like Clean require Dremio to be offline. To use them, Dremio must be deployed and running in admin mode. If not, you must redeploy Dremio in admin mode. The requirements section for each command will note whether Dremio should be online or offline. If it is not mentioned, then the command will work in either case.
To redeploy Dremio in admin mode, you must run a helm upgrade
command where the DremioAdmin
flag is set to true
. Here is a templated example command:
helm upgrade <chart-release-name> oci://quay.io/dremio/dremio-helm -f <your-local-path>/values-overrides.yaml --version <helm-chart-version> --set DremioAdmin=true
This command will cause the shutdown of the Coordinators and Executors. In their place will start the dremio-admin
pod. Crucially, this pod will mount the dremio-master-0
volume allowing for operations on the constituent KV store.
To get command line access to the dremio-master-0
, dremio-admin
, or any pod for that matter, you would use the kubectl exec
command. Here is an example using the -it
option for interactive, and the -- bash
option to enter a bash session:
kubectl exec -it <pod-name> -- bash
Once you've entered the pod, you can run typical shell commands to explore the file system and execute commands. For more information, see kubectl exec.
The dremio-admin
utility is within the /opt/dremio/bin
directory of both the main and admin pods and can be used to execute the various Dremio Admin CLI commands.
To exit Dremio admin mode and restart the normal service, you must redeploy Dremio again using the command above and setting only DremioAdmin=false
.