This topic describes job profiles.
A profile is a summary of metrics collected for each executed query. Query profiles provide information that you can use to monitor and analyze query performance. These profile metrics are available in the Dremio UI under Jobs > Profile.
Within the Dremio UI Jobs Profile tab, you can display information based on the following views:
- Query
- Visualized Plan
- Planning
- Acceleration
- Error (if applicable)
The Query, Visualized Plan, Planning, and Acceleration views each display the following metrics information:
- Resource Allocation
The Query view shows the selected query statement, a job summary, and job metrics. The job summary information includes:
- Query Text
- Job State
- Command Pool Wait Time
- Planning Time
- Resource Scheduling Time
- Name of Coordinator
The following table describes each job state:
|Wait to be scheduled by the command pool, the pool of threads available to the coordinator.|
|Parse the SQL command, check permissions, and retrieve schema information from the KVStore.|
|Perform physical and logical planning, match data reflections and substitutions, prune partitions, and map the query to a queue using WLM rules.|
|Wait for engine start. Currently applies only to AWS Edition deployments.|
|Each queue has a limited number of concurrent jobs. Queries wait for an in-progress job to complete when the number of in-progress jobs is greater than the limit.|
|Select executor nodes to run the query, retrieve split metadata from the KVStore for the pruned partitions, parallelize the fragments, assign fragments among the executor nodes, and assign splits to fragments taking into account data locality.|
|Send RPCs to each executor that contains information about the fragments assigned to it.|
|Wait for executor nodes to execute and complete the fragments assigned to them. Typically, queries spend most of their time in this job state.|
|Query successfully completed.|
|Query was canceled, either by a user or because of an internal issue such as insufficient memory or heap.|
|Query failed due to an error.|
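To see where a job spent its time, the durations shown in the profile can be ranked by their share of the total. The following is a minimal sketch, not a Dremio API: the phase labels and millisecond values are hypothetical and would be copied by hand from the job summary.

```python
def rank_phases(durations_ms):
    """Return (phase, duration_ms, share_of_total) tuples, longest first."""
    total = sum(durations_ms.values())
    ranked = sorted(durations_ms.items(), key=lambda item: item[1], reverse=True)
    # Guard against an empty or all-zero profile to avoid dividing by zero.
    return [(phase, ms, (ms / total if total else 0.0)) for phase, ms in ranked]


# Hypothetical durations, in milliseconds, for illustration only.
sample = {
    "command_pool_wait": 5,
    "planning": 120,
    "queue_wait": 300,
    "execution": 1575,
}

for phase, ms, share in rank_phases(sample):
    print(f"{phase:<18} {ms:>6} ms  {share:6.1%}")
```

A job whose largest share is not execution (for example, one dominated by queue wait) points to scheduling or planning rather than to the query operators themselves.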
The Visualized Plan view shows a visualized diagram and a job summary along with job metrics information. The visualized plan is useful for understanding the flow of the query.
The detailed visualized plan diagram is always read from the bottom up.
The Planning view shows planning metrics, the query output schema, non-default options, and a job summary along with job metrics information.
The Planning view provides statistics about the actual cost of the query operations in terms of memory, I/O, and CPU processing. You can use this profile to identify which operations consumed the majority of the resources during a query, and then modify the physical plan to address the cost-intensive operations. In particular, the following information is useful:
- Non Default Options
- Metadata Cache Hits and Misses with times
- Final Physical Transformation - For example, look for pushdown queries for RDBMS, MongoDB, or Elasticsearch sources, filter pushdowns or partition pruning for Parquet, the usage of stripes for ORC, and so on.
- Estimated versus actual row counts - Compare the estimated row count with the actual scan, join, or aggregate result.
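The comparison of estimated and actual row counts can be sketched as a small check. This is an illustrative helper, not part of Dremio: the operator names, the row counts, and the 10x threshold are all assumptions, and the values would be read from the profile manually.

```python
def flag_misestimates(operators, threshold=10.0):
    """Return operators whose estimated and actual row counts differ
    by at least `threshold` times, in either direction.

    `operators` is a list of (name, estimated_rows, actual_rows) tuples.
    """
    flagged = []
    for name, estimated, actual in operators:
        # Clamp to 1 so that zero-row operators do not divide by zero.
        larger = max(estimated, actual, 1)
        smaller = max(min(estimated, actual), 1)
        factor = larger / smaller
        if factor >= threshold:
            flagged.append((name, estimated, actual, factor))
    return flagged


# Hypothetical operators and row counts, for illustration only.
sample_ops = [
    ("scan", 1000, 1200),
    ("join", 500, 250000),
    ("aggregate", 100, 90),
]

for name, estimated, actual, factor in flag_misestimates(sample_ops):
    print(f"{name}: estimated {estimated}, actual {actual} ({factor:.0f}x off)")
```

Operators flagged this way are reasonable starting points when looking for cost-intensive steps, since a large gap suggests the planner worked from inaccurate statistics.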
The Acceleration view shows reflection outcome, canonicalized user query alternatives, reflection details, and a job summary along with job metrics information.
The following considerations determine the acceleration process:
- Considered, Matched, Chosen – The query is accelerated.
- Considered, Matched, Not Chosen – The query is not accelerated because of either a costing issue or an exception during substitution.
- Considered, Not Matched, Not Chosen – The query is not accelerated because the reflection does not have the data to accelerate.
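The three outcomes above can be read as a lookup on the considered, matched, and chosen flags. The sketch below only restates that list in code; the function name and return strings are hypothetical and do not correspond to any Dremio API.

```python
# Maps (considered, matched, chosen) to the outcome described in the list above.
OUTCOMES = {
    (True, True, True): "accelerated",
    (True, True, False): "not accelerated: costing issue or exception during substitution",
    (True, False, False): "not accelerated: reflection does not have the data to accelerate",
}


def acceleration_outcome(considered, matched, chosen):
    """Return the documented outcome, or a fallback for unlisted combinations."""
    return OUTCOMES.get((considered, matched, chosen), "combination not described")


print(acceleration_outcome(True, True, False))
```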
The Error view shows information about an error.
- Failure Node – This node is always the coordinator node.
- Server Name – The server name inside the error message identifies the node that was actually affected.