Monitoring
As an administrator, you can monitor logs, system telemetry, cluster usage, jobs, and Dremio nodes.
Logs
Learn more about the different types and file locations for logs, and consider exporting your logs to a central location for analysis.
Default Log File Locations
By default, Dremio uses the following locations to write logs:
- Tarball: <DREMIO_HOME>/log
- RPM: /var/log/dremio
Log Types
Audit Logs
All user activities performed within Dremio are tracked in the audit.json file. For details, see Audit Logging.
System Logs
The following logs are enabled by default:
- access.log: HTTP access log for the Dremio web server. This log is generated by coordinator nodes only.
- server.gc: Garbage collection log.
- server.log and json/server.json: Server logs are generated in a text format (server.log) and a JSON format (json/server.json). An admin can disable one of these formats.
- server.out: Log for Dremio daemon standard out.
- metadata_refresh.log: Log for refreshing metadata.
- tracker.json: Tracker log.
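As a rough illustration, the following Python sketch checks which of the default system logs listed above exist in a given log directory. The function name `find_system_logs` is hypothetical, and the directory you pass in should be whichever default location applies to your installation.

```python
from pathlib import Path

# File names taken from the list of default system logs above.
DEFAULT_SYSTEM_LOGS = [
    "access.log",
    "server.gc",
    "server.log",
    "json/server.json",
    "server.out",
    "metadata_refresh.log",
    "tracker.json",
]


def find_system_logs(log_dir):
    """Map each default system log name to True if it exists under log_dir."""
    base = Path(log_dir)
    return {name: (base / name).is_file() for name in DEFAULT_SYSTEM_LOGS}


if __name__ == "__main__":
    # /var/log/dremio is the default location for RPM installations.
    for name, present in find_system_logs("/var/log/dremio").items():
        print(f"{name}: {'found' if present else 'missing'}")
```

A missing file is not necessarily a problem: access.log is generated by coordinator nodes only, and an admin can disable one of the two server log formats.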
Query Logs
Query logs are located in the queries.json file.
This file contains a log of completed queries; it does not include queries currently in planning or execution.
Query logging is enabled by default.
Query logs can be queried by Dremio itself or another tool for monitoring and analytics.
Format
Query logs include the following information:
- queryId: Unique ID of the executed query.
- queryText: SQL query text.
- start: Start time of the query.
- finish: End time of the query.
- outcome: Whether the query was completed or failed.
- username: User that executed the query.
- commandDescription: Type of the command. This may be a regular SQL query execution job or another SQL command.
Additional information may be found depending on your Dremio configuration.
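Because query logs can be consumed by other tools, a short Python sketch may help show how the fields above could be used for basic analytics. It rests on several assumptions that are not stated in this section: that queries.json holds one JSON object per line, that `start` and `finish` are numeric timestamps in the same unit, and that `outcome` is a short string. The function names and the sample records are hypothetical.

```python
import json
from collections import Counter


def read_query_log(lines):
    """Yield one dict per non-empty line, assuming one JSON object per line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(records):
    """Count queries per outcome and sum query durations per user.

    Assumes `start` and `finish` are numeric timestamps in the same unit;
    verify this against your own log before relying on the durations.
    """
    outcomes = Counter()
    duration_by_user = {}
    for rec in records:
        outcomes[rec.get("outcome", "UNKNOWN")] += 1
        start, finish = rec.get("start"), rec.get("finish")
        if isinstance(start, (int, float)) and isinstance(finish, (int, float)):
            user = rec.get("username", "UNKNOWN")
            duration_by_user[user] = duration_by_user.get(user, 0) + (finish - start)
    return outcomes, duration_by_user


if __name__ == "__main__":
    # Illustrative records only; the values are made up for this sketch.
    sample = [
        '{"queryId": "q1", "queryText": "SELECT 1", "start": 1000, '
        '"finish": 1500, "outcome": "COMPLETED", "username": "alice"}',
        '{"queryId": "q2", "queryText": "SELECT x", "start": 2000, '
        '"finish": 2100, "outcome": "FAILED", "username": "bob"}',
    ]
    outcomes, durations = summarize(read_query_log(sample))
    print(dict(outcomes))
    print(durations)
```

To run the same summary against a real log, open the file and pass the file object to `read_query_log`, since iterating over a file yields its lines.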
Warning Logs
Warnings can be generated in the hive.deprecated.function.warning.log file for Hive functions that have been deprecated. If you see a warning generated in this log, locate the deprecated function and replace it with a supported function. For example, you would replace NVL with COALESCE.
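The NVL to COALESCE replacement can be sketched in Python with a regular expression. This is a minimal text substitution, not a SQL parser: it would also rewrite `NVL(` inside string literals or comments, so treat its output as a candidate for manual review. The function name `replace_nvl` is hypothetical.

```python
import re

# Matches NVL as a whole word followed by an opening parenthesis, ignoring case.
NVL_CALL = re.compile(r"\bNVL\s*\(", re.IGNORECASE)


def replace_nvl(sql):
    """Rewrite NVL( calls to COALESCE( using a plain regex substitution."""
    return NVL_CALL.sub("COALESCE(", sql)


if __name__ == "__main__":
    # prints: SELECT COALESCE(price, 0) FROM orders
    print(replace_nvl("SELECT NVL(price, 0) FROM orders"))
```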
Retrieving Logs from Kubernetes
To retrieve logs from Kubernetes, use the container console for Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), or Google Kubernetes Engine (GKE). You can also use the AKS container to retrieve logs for AKS.
Using the Container Console
All logs are written to the container's console (stdout) simultaneously. These logs can be monitored using either of the following kubectl commands:
kubectl logs <container-name>
kubectl logs -f <container-name>
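If you want to capture this output from a script, for example to keep a copy before a pod restarts, the two commands above could be wrapped as in the following Python sketch. The helper names are hypothetical, and `save_logs` assumes that kubectl is on the PATH and already configured for your cluster.

```python
import subprocess


def kubectl_logs_command(name, follow=False):
    """Build the argument list for the kubectl logs commands shown above."""
    cmd = ["kubectl", "logs"]
    if follow:
        cmd.append("-f")  # stream new log lines as they are written
    cmd.append(name)
    return cmd


def save_logs(name, out_path):
    """Run kubectl logs once and write its output to out_path.

    Requires kubectl on the PATH and access to the cluster.
    """
    with open(out_path, "wb") as out:
        subprocess.run(kubectl_logs_command(name), stdout=out, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting issues with unusual names.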
Using the AKS Container
Azure provides integration with AKS clusters and Azure Log Analytics to monitor container logs. This is a standard practice that puts infrastructure in place to aggregate logs from containers into a central log store to analyze them.
AKS log monitoring is useful for the following reasons:
- Monitoring logs across many pods can be overwhelming.
- When a pod (for example, a Dremio executor) crashes and restarts, only the logs from the last pod are available.
- If a pod is crashing regularly, the logs are lost, which makes it difficult to analyze the reasons for the crash.
For more information regarding AKS, see Azure Monitor features for Kubernetes monitoring.
Enabling Log Monitoring
You can enable log monitoring when creating an AKS cluster or after the cluster has been created.
Once logging is enabled, all your container stdout and stderr logs are collected by the infrastructure for you to analyze.
- While creating an AKS cluster, enable container monitoring. You can use an existing Log Analytics workspace or create a new one.
- In an existing AKS cluster where monitoring was not enabled during creation, go to Logs on the AKS cluster and enable it.
If you want to persist logs in the PVC, follow the instructions here.
Viewing Container Logs
To view all the container logs:
- Go to Monitoring > Logs.
- Use the filter option to see the logs from the containers that you are interested in.
Cluster Usage
For each day, Dremio displays the number of unique users who executed jobs and the number of jobs executed.
To view cluster usage data:
- Hover over the help icon in the left navigation bar.
- Click About Dremio in the menu.
- Click the Cluster Usage Data tab.
System Telemetry
Dremio exposes system telemetry metrics in Prometheus format by default. It is not necessary to configure an exporter to collect the metrics. Instead, you can specify the host and port number where metrics are exposed in the dremio.conf file and scrape the metrics with any Prometheus-compliant tool.
To specify the host and port number where metrics are exposed, add these two properties to the dremio.conf file:
- services.web-admin.host: set to the desired host address (typically 0.0.0.0 or the IP address of the host where Dremio is running).
- services.web-admin.port: set to any desired value that is greater than 1024.
For example:
Example host and port settings in dremio.conf
services.web-admin.host: "127.0.0.1"
services.web-admin.port: 9090
Restart Dremio after you update the dremio.conf file to make sure your changes take effect.
Access the exported Dremio system telemetry metrics at http://<yourHost>:<yourPort>/metrics.
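Any Prometheus-compliant tool can scrape this endpoint. For a quick manual check, the following Python sketch fetches the endpoint and parses the Prometheus text format into a dictionary. It is a simplified parser written for illustration: it ignores HELP and TYPE comments and does not handle optional trailing timestamps. The function names are hypothetical.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}.

    Comment lines (# HELP, # TYPE) and blank lines are skipped. The series key
    keeps any label set verbatim, for example 'name{label="x"}'. Optional
    trailing timestamps are not handled in this sketch.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        if not series:
            continue  # no separator between series name and value
        try:
            metrics[series] = float(value)
        except ValueError:
            continue  # skip lines this simple parser does not understand
    return metrics


def scrape(host, port, timeout=5):
    """Fetch http://<host>:<port>/metrics and return the parsed metrics."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With the example settings above, `scrape("127.0.0.1", 9090)` would return the current metric values, assuming Dremio has been restarted and the endpoint is reachable from where the script runs.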
For more information about Prometheus metrics, read Types of Metrics in the Prometheus documentation.