Kubernetes Troubleshooting

This topic discusses Kubernetes troubleshooting scenarios as they pertain to Azure AKS and Amazon EKS environments.

See Azure AKS and Amazon EKS for information on deploying, customizing, and administering Dremio in these environments.

Why are my edits to files in the config directory not being applied?

Problem
I'm making changes to the configuration files in the config directory, but the changes are not showing up on the pods.

Explanation
The config directory must not contain any binary files.

During deployment, the contents of the config directory are copied to every pod at the /opt/dremio/conf location, and a ConfigMap is then created and made available on the pod.

If the config directory contains binary files, creation of the ConfigMap fails, so your edits never reach the pods.

Solution
Remove any binary files from the config directory, then redeploy.
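
One way to spot offending files before redeploying is to scan the directory with grep. The sketch below is not part of the Dremio tooling: find_binary_files is a hypothetical helper name, the default directory name config is taken from the text above, and the approach assumes a grep that supports the -I option (GNU and BSD grep do). It relies on grep's own heuristic for what counts as binary, which may differ from the check that makes ConfigMap creation fail.

```shell
# Hypothetical helper: print every non-empty file under a directory that
# grep classifies as binary (for example, a file containing a NUL byte).
# Assumes a grep that supports -I (GNU and BSD grep).
find_binary_files() {
  dir="${1:-config}"
  # Empty files are skipped because they contain nothing to classify.
  find "$dir" -type f ! -size 0 -exec sh -c '
    for f in "$@"; do
      # With -I, grep treats a binary file as having no matching lines,
      # so a failed empty-pattern match marks the file as binary.
      grep -Iq "" "$f" || printf "%s\n" "$f"
    done
  ' sh {} +
}
```

For example, running find_binary_files config from the directory that contains config prints one path per line, and prints nothing when no binary files are found.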

Why are my pods still not being provisioned for lack of CPU/memory?

Problem
I asked for 5 executors and have 5 nodes in my Kubernetes cluster, which should be enough to satisfy the CPU/memory requirements, but I'm still running into insufficient CPU/memory issues.

Explanation
Along with the executor pods, the deployment also creates the following pods, which must be accounted for when calculating CPU/memory requirements for the Dremio cluster:

  • Dremio master-coordinator pod (requires an allocated node)
  • ZooKeeper pods (require a small amount of resources)

The number of allocated nodes in the cluster must equal the number of Dremio executors plus one (1) for the Dremio master-coordinator. For example, 5 executors require 6 nodes.

Solution
Allocate one additional node in the cluster for the Dremio master-coordinator pod.
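
The node arithmetic above can be sketched as a quick pre-deployment check. The function names required_nodes and has_enough_nodes are hypothetical, and treating the rule as a minimum (rather than an exact match) is an assumption made for this sketch. It counts nodes only, so it does not verify that each node is large enough for the pod scheduled on it.

```shell
# Minimum node count per the rule above: one node per executor, plus one
# for the master-coordinator. (Hypothetical helper names.)
required_nodes() {
  executors="$1"
  echo $(( executors + 1 ))
}

# Succeeds only if the available node count meets that minimum.
has_enough_nodes() {
  executors="$1"
  available="$2"
  [ "$available" -ge "$(required_nodes "$executors")" ]
}
```

To compare against a live cluster, the second argument could come from kubectl, for example: has_enough_nodes 5 "$(kubectl get nodes -o name | wc -l | tr -d ' ')". That count includes every node kubectl reports, whether or not it is ready to accept pods.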

Why is data from an old deployment still around?

Problem
I deleted my Dremio deployment (helm delete <helm-release>), but when I install a new release, data from the old deployment is still around.

Explanation
The Helm chart uses StatefulSets for Dremio pods. In Kubernetes, a persistent volume associated with a pod in a StatefulSet is not deleted when you delete the StatefulSet.

Solution
To completely delete the data, you need to delete the persistent volume claims (PVCs) left behind by the old deployment. For example:

kubectl get pvc
kubectl delete pvc dremio-master-volume-dremio-master-0
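
When several claims are left behind, deleting them one by one can be tedious. The following is a minimal sketch of filtering claim names by prefix before deleting them. The helper name select_pvcs is hypothetical, and the dremio- prefix is an assumption based on the claim name shown above; check the output of kubectl get pvc for the actual names in your cluster.

```shell
# Hypothetical helper: read claim names in the resource/name form printed
# by `kubectl get pvc -o name` and keep those whose name starts with the
# given literal prefix.
select_pvcs() {
  awk -v p="persistentvolumeclaim/$1" 'index($0, p) == 1'
}
```

A cautious way to use it is to preview the deletion first with kubectl get pvc -o name | select_pvcs dremio- | xargs kubectl delete --dry-run=client, and then rerun the same pipeline without --dry-run=client once the list looks right.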
