
Monitoring

As an administrator, you can monitor logs, system telemetry, cluster usage, jobs, and Dremio nodes.

Logs

Learn about the different log types and their file locations, and consider exporting your logs to a central location for analysis.

Default Log File Locations

By default, Dremio uses the following locations to write logs:

  • Tarball - <DREMIO_HOME>/log
  • RPM - /var/log/dremio
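
As a quick check, you can list the log directory on a Dremio node to confirm which of the files described below are present. The example assumes an RPM-based install with the default path:

Example command for listing the default log directory on an RPM install
ls /var/log/dremio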

Log Types

Audit Logs

All user activities performed within Dremio are tracked in the audit.json file. For details, see Audit Logging.

System Logs

The following logs are enabled by default:

  • access.log - HTTP access log for the Dremio web server. This log is generated by coordinator nodes only.
  • server.gc - Garbage collection log.
  • server.log and json/server.json - Server logs are generated in a text format (server.log) and a JSON format (json/server.json). An administrator can disable either format.
  • server.out - Log for Dremio daemon standard out.
  • metadata_refresh.log - Log for metadata refresh activity.
  • tracker.json - Tracker log.
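
To follow one of these logs in real time, you can tail it on the relevant node. The example below assumes a tarball install with the default log location:

Example command for following the server log
tail -f <DREMIO_HOME>/log/server.log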

Query Logs

Query logs are located in the queries.json file. This file contains a log of completed queries; it does not include queries currently in planning or execution.

Query logging is enabled by default.

Note: Query logs can be queried by Dremio itself or by another tool for monitoring and analytics.

Format

Query logs include the following information:

  • queryId - Unique ID of the executed query.
  • queryText - SQL query text.
  • start - Start time of the query.
  • finish - End time of the query.
  • outcome - Whether the query was completed or failed.
  • username - User that executed the query.
  • commandDescription - Type of the command. This may be a regular SQL query execution job or another SQL command.

Additional fields may be included, depending on your Dremio configuration.
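
To spot-check the most recent entry and the fields listed above, you can inspect the file directly. This is a sketch that assumes a tarball install with the default log location and that the jq utility is installed; actual entries may contain additional fields:

Example command for inspecting the most recent entry in queries.json
tail -n 1 <DREMIO_HOME>/log/queries.json | jq '{queryId, queryText, start, finish, outcome, username}'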

Warning Logs

Warnings for deprecated Hive functions are written to the hive.deprecated.function.warning.log file. If you see a warning in this log, locate the deprecated function and replace it with a supported function. For example, replace NVL with COALESCE.
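
To check whether any deprecation warnings have been recorded, you can inspect the log directly. The example assumes a tarball install with the default log location:

Example command for checking the deprecated function warning log
tail -n 20 <DREMIO_HOME>/log/hive.deprecated.function.warning.log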

Retrieving Logs from Kubernetes

To retrieve logs from Kubernetes, use the container console for Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), or Google Kubernetes Engine (GKE). For AKS, you can also use Azure Monitor and Log Analytics to retrieve logs, as described below.

Using the Container Console

All logs are also written to the container's console (stdout). You can monitor these logs with either of the following kubectl commands:

Command for viewing the logs of a pod
kubectl logs <pod-name>
Command for following (streaming) the logs of a pod
kubectl logs -f <pod-name>
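
If a pod has crashed and restarted, kubectl can also show the logs from the previous instance, which is helpful before central log aggregation is in place:

Command for viewing logs from a pod's previous instance
kubectl logs --previous <pod-name>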

Using Azure Monitor with AKS

Azure provides integration between AKS clusters and Azure Log Analytics to monitor container logs. This is a standard practice that aggregates logs from containers into a central log store for analysis.

AKS log monitoring is useful for the following reasons:

  • Monitoring logs across many pods can be overwhelming.
  • When a pod (for example, a Dremio executor) crashes and restarts, only the logs from the most recent instance are available.
  • If a pod crashes repeatedly, its earlier logs are lost, which makes it difficult to analyze the cause of the crashes.

For more information regarding AKS, see Azure Monitor features for Kubernetes monitoring.

Enabling Log Monitoring

You can enable log monitoring when creating an AKS cluster or after the cluster has been created.

Once logging is enabled, all your container stdout and stderr logs are collected by the infrastructure for you to analyze.

  1. While creating an AKS cluster, enable container monitoring. You can use an existing Log Analytics workspace or create a new one.
  2. In an existing AKS cluster where monitoring was not enabled during creation, go to Logs on the AKS cluster and enable it. A command-line alternative is sketched after this list.
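
As a command-line alternative to the portal steps above, the monitoring add-on can be enabled with the Azure CLI. This is a sketch that assumes the Azure CLI is installed; the cluster and resource group names are placeholders:

Example Azure CLI command for enabling the monitoring add-on on an existing cluster
az aks enable-addons --addons monitoring --name <cluster-name> --resource-group <resource-group>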

If you want to persist logs in the PVC, follow the instructions here.

Viewing Container Logs

To view all the container logs:

  1. In the Azure portal, go to Monitoring > Logs for the AKS cluster.
  2. Use the filter options to view the logs from the containers that you are interested in.

Cluster Usage

Dremio displays the number of unique users who ran jobs each day and the number of jobs executed. To view this information:

  1. Hover over the help icon in the left navigation bar.

  2. Click on About Dremio in the menu.

  3. Click on the Cluster Usage Data tab.

System Telemetry

Dremio exposes system telemetry metrics in Prometheus format by default. It is not necessary to configure an exporter to collect the metrics. Instead, you can specify the host and port number where metrics are exposed in the dremio.conf file and scrape the metrics with any Prometheus-compliant tool.

To specify the host and port number where metrics are exposed, add these two properties to the dremio.conf file:

  • services.web-admin.host: set to the desired host address (typically 0.0.0.0 or the IP address of the host where Dremio is running).
  • services.web-admin.port: set to any desired value that is greater than 1024.

For example:

Example host and port settings in dremio.conf
services.web-admin.host: "127.0.0.1"
services.web-admin.port: 9090

Restart Dremio after you update the dremio.conf file to make sure your changes take effect.
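
How you restart Dremio depends on how it was installed. The commands below are illustrative sketches that assume default install locations and service names for tarball and RPM installs, respectively:

Example commands for restarting Dremio
<DREMIO_HOME>/bin/dremio restart
sudo service dremio restart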

Access the exported Dremio system telemetry metrics at http://<yourHost>:<yourPort>/metrics.
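
As a quick check that metrics are being exposed, you can fetch the endpoint directly with curl, using the host and port configured above:

Example command for verifying that metrics are exposed
curl http://<yourHost>:<yourPort>/metrics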

For more information about Prometheus metrics, read Types of Metrics in the Prometheus documentation.