Administering Dremio on AKS

This topic discusses administration activities such as log monitoring, pod scaling, configuration changes, basic administrative tasks (backup, restore, clean, and so on), and upgrading Dremio.

Monitoring Logs

Dremio logs can be viewed by using one of the following methods:

  • Container console (stdout)
  • Azure AKS container

Using the Container Console

Logs are written to the container's console (stdout) and can be monitored with kubectl. All the logs (server.log, server.out, server.gc, and access.log) are written to the console simultaneously. Use the kubectl logs command to view the logs, or kubectl logs -f to follow them as they are written.

kubectl logs <pod-name>
kubectl logs -f <pod-name>
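
As a rough sketch, the commands above can be wrapped in small helpers that add two commonly used kubectl logs options. The function names dremio_logs and dremio_previous_logs are hypothetical. The --tail and --previous flags are standard kubectl logs options, and dremio-master-0 is the master-coordinator pod name used later in this topic.

```shell
# Hypothetical helper: show the most recent log lines for one pod.
# Usage: dremio_logs <pod-name> [line-count]
dremio_logs() {
  pod="$1"
  lines="${2:-100}"
  # --tail limits the output to the most recent lines.
  kubectl logs --tail="$lines" "$pod"
}

# Hypothetical helper: show the logs of the previous container instance,
# which is useful after a container has crashed and restarted.
dremio_previous_logs() {
  kubectl logs --previous "$1"
}

# Example (not run here):
#   dremio_logs dremio-master-0 200
#   dremio_previous_logs dremio-master-0
```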

Using the Azure AKS Container

Azure integrates AKS clusters with Azure Log Analytics to monitor container logs. Aggregating logs from containers into a central log store for analysis is a standard practice.

Azure AKS log monitoring is useful for the following reasons:

  • Monitoring logs across many pods can be overwhelming.
  • When a pod (for example, a Dremio executor) crashes and restarts, only the logs from the last pod are available.
  • If a pod is crashing regularly, the logs are lost, which makes it difficult to analyze the reasons for the crash.

To enable log monitoring

You can enable log monitoring either when creating an AKS cluster or after the cluster has been created. Once logging is enabled, all your container stdout and stderr logs are collected by the infrastructure for you to analyze.

While creating an AKS cluster, enable container monitoring. You can use an existing Log Analytics workspace or create a new one.

Figure: Enabling Container Monitoring during Cluster Creation

In an existing AKS cluster where monitoring was not enabled during creation, go to Logs on the AKS cluster and enable it.

Figure: Enabling Monitoring to Existing Cluster
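
The portal steps for an existing cluster can also be approximated from the command line. The following is a minimal sketch, assuming the Azure CLI (az) is installed and logged in. The function name enable_aks_monitoring and its arguments are hypothetical, while az aks enable-addons with --addons monitoring is the Azure CLI command for this add-on.

```shell
# Hypothetical helper: enable container monitoring on an existing AKS cluster.
# Usage: enable_aks_monitoring <resource-group> <cluster-name> [workspace-resource-id]
enable_aks_monitoring() {
  rg="$1"
  cluster="$2"
  workspace_id="${3:-}"
  if [ -n "$workspace_id" ]; then
    # Send logs to an existing Log Analytics workspace.
    az aks enable-addons --addons monitoring \
      --resource-group "$rg" --name "$cluster" \
      --workspace-resource-id "$workspace_id"
  else
    # No workspace given: the Azure CLI falls back to a default workspace.
    az aks enable-addons --addons monitoring \
      --resource-group "$rg" --name "$cluster"
  fi
}
```

When creating a new cluster from the command line, az aks create accepts --enable-addons monitoring for the same purpose.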

To view container logs

To view all the container logs:

  1. Go to Monitoring > Logs.
  2. Use the filter option to see the logs from the containers that you are interested in.

Figure: Viewing Container Logs

Scaling Up/Down

Scaling up or down changes the number of Dremio pods of a given type (master-coordinator, executors, or slave-coordinators). All scaling values remain in effect until another helm upgrade command is run.

  1. Obtain the name of the helm chart release with the helm list command. For example:
     helm list
     NAME                 REVISION    UPDATED                     STATUS      CHART           NAMESPACE
     plundering-alpaca    1           Wed Jul 18 09:36:14 2018    DEPLOYED    dremio-0.0.5    default
    
  2. Run the helm upgrade --wait <chart release name> . --set <dremio pod=value> command.

For example, Dremio executor pods can be scaled up or down with the following command, where plundering-alpaca is the chart release name and the pod count is 5:

helm upgrade --wait plundering-alpaca . --set executor.count=5
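
The two steps above can be sketched as shell helpers. This is an illustration only: the function names are hypothetical, and the parsing assumes the helm list table layout shown above, with the release name in the first column.

```shell
# Hypothetical helper: print release names from the helm list table,
# skipping the header row and any blank lines.
dremio_release_names() {
  helm list | awk 'NR > 1 && NF > 0 { print $1 }'
}

# Hypothetical helper: scale the executors of a release.
# Run it from the chart directory, because the chart path is ".".
# Usage: scale_executors <release> <count>
scale_executors() {
  helm upgrade --wait "$1" . --set "executor.count=$2"
}

# Example (not run here):
#   scale_executors plundering-alpaca 5
```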

Resetting to Defaults

Once you scale a Dremio pod up or down, running helm upgrade again (whether for scaling, changing your configuration, or upgrading) resets the configuration to the defaults specified in the values.yaml file.

For example, if you scale up the slave-coordinators to 3 and then scale the executors to 5, the slave-coordinator count is reset to 0 (the default) by the second helm upgrade command.
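
One way to avoid losing an earlier override is to pass every override in the same helm upgrade command. The sketch below assumes this approach; the function name scale_dremio is hypothetical, and coordinator.count is an assumed key name for the slave-coordinators that this topic does not confirm, so check your values.yaml file for the actual key.

```shell
# Hypothetical helper: apply several overrides in a single helm upgrade,
# so that one override does not reset another to its values.yaml default.
# Run it from the chart directory, because the chart path is ".".
# Usage: scale_dremio <release> <key=value>...
scale_dremio() {
  release="$1"
  shift
  set_args=""
  for pair in "$@"; do
    set_args="$set_args --set $pair"
  done
  # $set_args is intentionally unquoted so that each flag is passed as a
  # separate argument; key=value pairs must therefore not contain spaces.
  helm upgrade --wait "$release" . $set_args
}

# Example (not run here); coordinator.count is an assumed key name:
#   scale_dremio plundering-alpaca executor.count=5 coordinator.count=3
```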

[warning] Scale to Zero
If you scale all of the Dremio pods down to zero (0), you are effectively shutting down the Dremio cluster.

To permanently change your default values, update the values.yaml file. See Changing your Configuration for more information.

Performance

If you scale down the number of pods (either temporarily or permanently), reflections that were already created may need to be re-created during the next refresh. This could delay the availability of up-to-date data reflections.

Changing your Configuration

To update your configuration after installation, edit the configuration files and then upgrade using the helm upgrade <chart release name> . command. The upgrade process pushes out your changes to all of the pods in your Kubernetes cluster and restarts the pods.

For example, if you want to permanently increase the number of Dremio executor pods to five (5):

  1. Edit the values.yaml file and change the number of executor pods via executor.count. In this example, executor.count is 5 and the other executor defaults remain unchanged.
     executor:
       memory: 16384
       cpu: 4
       count: 5
       volumeSize: 20Gi
    
  2. Run the helm upgrade --wait <chart release name> . command. In this example, plundering-alpaca is the chart release name:
     helm upgrade --wait plundering-alpaca .
    

[info] Tip

If it takes longer than a couple of minutes to complete, check the status of the pods via kubectl get pods.

If the pods are pending scheduling due to limited memory or CPU, either adjust the values in the values.yaml file or add more resources to your Kubernetes cluster.
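
The check described in this tip can be sketched as follows. The function names are hypothetical. The parsing assumes the default kubectl get pods column order (NAME, READY, STATUS, RESTARTS, AGE), with headers suppressed by the --no-headers flag.

```shell
# Hypothetical helper: print the names of pods whose status is Pending.
pending_pod_names() {
  kubectl get pods --no-headers | awk '$3 == "Pending" { print $1 }'
}

# Hypothetical helper: show the details and events for one pod. The events
# usually state why the pod could not be scheduled, for example
# insufficient memory or CPU.
pod_events() {
  kubectl describe pod "$1"
}

# Example (not run here):
#   for pod in $(pending_pod_names); do pod_events "$pod"; done
```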

Dremio Admin Commands

The following Dremio administration commands can be run on the Dremio+Kubernetes cluster. All commands except for the Dremio backup command require that Dremio be shut down (offline).

Command         Offline/Online    Notes
backup          online            /opt/dremio/bin/dremio-admin backup
                                  See Backup Dremio for more information.
clean           offline           /opt/dremio/bin/dremio-admin clean
                                  See Metadata Cleanup for more information.
restore         offline           /opt/dremio/bin/dremio-admin restore
                                  See Restore Dremio for more information.
set-password    offline           /opt/dremio/bin/dremio-admin set-password
                                  See Reset Password for more information.

Backup

The backup command is run when Dremio is online. It is run on the master-coordinator pod from a bash shell.

To run the backup command:

  1. Connect to the master-coordinator pod using the exec command.
    kubectl exec -it dremio-master-0 -- bash
  2. Run the command from the bash shell. See Backup Dremio for more information.
      /opt/dremio/bin/dremio-admin backup \
       -u <DREMIO_ADMIN_USER> \
       -p <DREMIO_ADMIN_PASS> \
       -d <BACKUP_PATH>
    
  3. Store the backup files on a persistent volume or copy the files out of the local pod.
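
For the last step, copying the files out of the pod could look like the sketch below. The function name copy_backup is hypothetical, and the paths are placeholders that you supply. kubectl cp is a standard kubectl command, and it requires the tar binary to be present in the container.

```shell
# Hypothetical helper: copy a backup directory out of the master-coordinator pod.
# Usage: copy_backup <backup-path-in-pod> <local-directory>
copy_backup() {
  # kubectl cp relies on tar being available inside the container.
  kubectl cp "dremio-master-0:$1" "$2"
}

# Example (not run here):
#   copy_backup <BACKUP_PATH> ./dremio-backup
```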

Clean, Restore, and Set-Password

The following Dremio commands are offline commands, that is, Dremio must not be running.

  • clean
  • restore
  • set-password

[info] To temporarily shut down Dremio, scale down the master-coordinator to zero (0).

These offline commands are run by creating a Dremio Admin pod with the Dremio image and mounting the master-coordinator pod's persistent volume.

To run offline dremio-admin commands:

  1. Create a Dremio Admin pod to run the dremio-admin commands. Note that if you have updated the value of the image, the image value must also be updated in the dremio-admin-pod.yaml file.
    kubectl apply -f dremio-admin-pod.yaml --wait
  2. Run the dremio-admin commands from the bash shell on the Dremio Admin pod. See Advanced Administration for more information on each command.
    kubectl exec -it dremio-admin -- bash
    bin/dremio-admin <offline command>
  3. Delete the pod.
    kubectl delete pod dremio-admin
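
The three steps above can be combined into one helper that always deletes the admin pod afterwards. This is a sketch under stated assumptions, not a definitive procedure: the function name run_offline_admin is hypothetical, the 120-second timeout is an arbitrary choice, and the dremio-admin path is taken from the command table earlier in this topic. kubectl wait is a standard kubectl command.

```shell
# Hypothetical helper: run one offline dremio-admin command in a temporary
# Dremio Admin pod, then delete the pod even if the command fails.
# Usage: run_offline_admin <offline command> [args...]
run_offline_admin() {
  kubectl apply -f dremio-admin-pod.yaml || return 1
  status=0
  # Wait until the admin pod is ready before running a command in it.
  if kubectl wait --for=condition=Ready pod/dremio-admin --timeout=120s; then
    kubectl exec -it dremio-admin -- /opt/dremio/bin/dremio-admin "$@" || status=$?
  else
    status=1
  fi
  # Clean up the pod whether or not the command succeeded.
  kubectl delete pod dremio-admin
  return "$status"
}

# Example (not run here):
#   run_offline_admin clean
```

Returning the exit status of the dremio-admin command lets a calling script detect a failed clean, restore, or set-password run.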

Upgrading Dremio

You upgrade Dremio by updating the image value in the values.yaml file to the new Dremio version and running the helm upgrade command.

During the upgrade process, existing pods are terminated and new pods are created with the new image. Once all the newly created pods are restarted and running, your Dremio cluster is upgraded.

To upgrade Dremio:

  1. Ensure that your Dremio+Kubernetes cluster is backed up. See Backup for more information.
  2. Ensure that no queries are running on the cluster.
  3. Update the Dremio image tag in your values.yaml file.
    For example, to change the Dremio CE image:
     image: dremio/dremio-oss:3.0.0
     ...
    
    Note: If you are changing the Dremio Enterprise Edition image, you do not need to change the imagePullSecrets property.
  4. Get the chart release name with the helm list command. In the example below, the chart release name is plundering-alpaca.

     helm list
     NAME                 REVISION    UPDATED                     STATUS      CHART           NAMESPACE
     plundering-alpaca    1           Wed Jul 18 09:36:14 2018    DEPLOYED    dremio-0.0.5    default
    
  5. Run the helm upgrade --wait <chart release name> . command to upgrade the deployment.
    In this example, plundering-alpaca is the chart release name:
     helm upgrade --wait plundering-alpaca .
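
After the upgrade completes, one way to confirm that the pods picked up the new image is to list each pod with its container image. The sketch below is illustrative: the function names are hypothetical, and the default dremio- pod name prefix is an assumption based on the dremio-master-0 pod name used in this topic. The -o jsonpath output option is standard kubectl.

```shell
# Hypothetical helper: print each pod name and the image of its first
# container, separated by a tab.
pod_images() {
  kubectl get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'
}

# Hypothetical helper: print the pods with the given name prefix
# (assumed default: dremio-) that are NOT running the expected image.
# Usage: pods_not_on_image <expected-image> [name-prefix]
pods_not_on_image() {
  pod_images | awk -F '\t' -v want="$1" -v prefix="${2:-dremio-}" \
    'index($1, prefix) == 1 && $2 != want { print $1 }'
}

# Example (not run here); no output means every matching pod is upgraded:
#   pods_not_on_image dremio/dremio-oss:3.0.0
```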

[info] Tip

The pods are restarted automatically after upgrading. If it takes longer than a couple of minutes to complete, check the status of the pods via kubectl get pods.

If the pods are pending scheduling due to limited memory or CPU, either adjust the values in the values.yaml file (see Changing your Configuration) or add more resources to your Kubernetes cluster.

