This topic describes how to analyze Job Profiles to determine resource issues and tune performance.
Job profile data is available via the Dremio UI Jobs tab. Select an individual job in the left pane and then select the Profile tab in the right pane. The associated profile pop-up shows a variety of information for review. The UI-generated profile data is a subset of the data that you can download.
To use SQL to analyze profile data:
- Download the profile and add it to Dremio as a dataset. See Downloading Profiles and Uploading Profile Data.
- Use SQL to slice and dice the profile data. See Querying Profile Datasets.
Reviewing Query Data
The Job profile query section is useful for identifying initial problem areas.
When reviewing the job profile query information, check the following:
- SQL query -- See if your SQL query is what you were expecting.
- Query source -- See if the query is run against the source data.
- Planning time -- See if the time is excessive.
- Non-default parameters -- See if non-default parameters are being used.
Reviewing Visualized Plan Data
The Job profile visualized plan section is useful for understanding the flow of the query and for analyzing out-of-memory issues and incorrect results.
[info] The query flow diagram is always read from the bottom up.
Reviewing Planning Data
The Job profile planning section is useful for determining how query planning is executed.
If there is a planning issue, enable planner.verbose_profile to obtain more data.
When reviewing the job profile planning information, check the following:
- Row count -- See if row count (versus rows) is used. Row count can cause an expensive broadcast.
- Build -- See if build (versus probe) is used. Build loads data into memory.
Reviewing Acceleration Data
The Job profile acceleration section is useful for determining whether exceptions or matches are occurring.
If there is a substitution issue, enable planner.verbose_profile to obtain more data.
When reviewing the job profile acceleration information, check the following:
- Reflections -- See if the following occur:
  - Reflection criteria: considered, not matched, not chosen (a reflection is usually not matched because of an exception or a genuine mismatch)
  - Reflection criteria: considered, matched, chosen
  - Reflection criteria: considered, matched, not chosen (a reflection is usually not chosen because of an exception or its cost)
- Multiple substitutions -- See if the substitutions are excessive. The verbose profile provides more information.
- System activity -- See if
- Comparisons -- Compare cumulative cost against logical planning:
  - Cumulative cost is found in Acceleration > Best cost Replacement Plan.
  - Logical planning is found in Planning.
A major consideration is whether a thread is running, blocked (waiting for data or waiting to send data), or sleeping. A thread is usually sleeping if it is ready to run but another thread is currently using the CPU.
A thread is usually in a blocked state for one of the following reasons:
- It's waiting on some data from another thread (a child phase in the tree).
- It's trying to send data to another thread (a parent phase) but the receiving thread isn't responding. The receiving thread may be too slow or overwhelmed with work, so the sending thread is forced to block until the receiver is able to accept the data.
Network activities are tracked as part of the blocked metric.
- If a thread is blocked, little can be done.
- If a thread is sleeping, ensure that task.on_idle_load_shed is set to true.
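The triage above can be sketched in a few lines of Python. This is only an illustration: the ThreadTimes structure, its field names, and the millisecond unit are assumptions made for the example, not names taken from the Dremio profile format. Copy the run, blocked, and sleep durations for a thread from the profile by hand.

```python
from dataclasses import dataclass


@dataclass
class ThreadTimes:
    """Durations for one thread. Field names and units are hypothetical."""
    run_ms: float
    blocked_ms: float
    sleep_ms: float


def dominant_state(t: ThreadTimes) -> str:
    """Return the state in which the thread spent the most time."""
    states = {
        "running": t.run_ms,
        "blocked": t.blocked_ms,
        "sleeping": t.sleep_ms,
    }
    # On a tie, max() keeps the first state in the order listed above.
    return max(states, key=states.get)


def advice(t: ThreadTimes) -> str:
    """Map the dominant state to the guidance given in this topic."""
    state = dominant_state(t)
    if state == "sleeping":
        return "sleeping: check that task.on_idle_load_shed is set to true"
    if state == "blocked":
        return "blocked: waiting on a child phase, a parent phase, or the network"
    return "running: the thread is mostly doing work"


if __name__ == "__main__":
    print(advice(ThreadTimes(run_ms=120, blocked_ms=900, sleep_ms=40)))
```

Looking only at the dominant state is a deliberate simplification. A thread that splits its time evenly between blocked and sleeping may deserve a look at both.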
The following general areas should be reviewed:
To troubleshoot errors, consider the following:
- Dremio nodes -- Determine whether the coordinator or executor nodes are impacted.
- Verbose error
- Out of memory
- Incorrect results
- No results
To troubleshoot performance, consider the following:
- Planning time versus execution time
- Number of threads, which impacts process time
- Row count versus rows
- Blocked versus sleep
- Setup time versus wait time
- Operator metrics (contact firstname.lastname@example.org)
Downloaded Profile Data
After downloading your job's profile data, the following files provide valuable information.
- header.json -- This file provides the full list of Dremio coordinators and executors, data sets, and sources. This information is useful when you are using REST calls.
- profile_attempt_0.json -- This file helps with troubleshooting out-of-memory and wrong-results issues. Note that the start and end times of the query are provided in epoch format. See the Epoch Converter utility for converting query time.
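As an alternative to an online converter, the epoch values can be converted locally. The sketch below assumes the start and end values are epoch milliseconds; if your profile stores seconds, drop the division by 1000. The function names are illustrative only.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(epoch_ms: int) -> datetime:
    """Convert an epoch timestamp in milliseconds (assumed unit) to UTC."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def query_duration_seconds(start_ms: int, end_ms: int) -> float:
    """Elapsed query time in seconds, given start and end in milliseconds."""
    return (end_ms - start_ms) / 1000


if __name__ == "__main__":
    # Sample values for illustration, not taken from a real profile.
    start_ms, end_ms = 1_600_000_000_000, 1_600_000_004_500
    print(epoch_ms_to_utc(start_ms).isoformat())
    print(query_duration_seconds(start_ms, end_ms))
```

Converting to UTC explicitly avoids surprises when the machine used for analysis is in a different time zone than the Dremio nodes.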
For More Information
- Example: Number of Rows
- Example: Amount of Consumed Memory
- Example: Operator Type Mapping
- Example: profile_attempt_0