    Job Profile

    This topic describes job profiles.

    A job profile is a summary of the metrics collected for each executed query. Profiles provide information that you can use to monitor and analyze query performance. The profile metrics are available in the Dremio UI under Jobs > Profile.

    Within the Dremio UI Jobs Profile tab, you can display information based on the following views:

    • Query
    • Visualized Plan
    • Planning
    • Acceleration
    • Error (if applicable)

    Job Metrics

    Each of the Query, Visualized Plan, Planning, and Acceleration views displays the following job metrics:

    • Threads
    • Resource Allocation
    • Nodes
    • Operators

    Query

    The Query view shows the selected query statement, a job summary, and job metrics. The job summary information includes:

    • Query Text
    • Job State
    • Command Pool Wait Time
    • Planning Time
    • Resource Scheduling Time
    • Name of Coordinator

    The following table describes each job state:

    Job State | Description
    PENDING | Wait to be scheduled by the command pool, the pool of threads available to the coordinator.
    METADATA_RETRIEVAL | Parse the SQL command, check permissions, and retrieve schema information from the KVStore.
    PLANNING | Perform logical and physical planning, match data reflections and substitutions, prune partitions, and map the query to a queue using WLM rules.
    ENGINE_START | Wait for the engine to start. Currently applies only to AWS Edition deployments.
    QUEUED | Wait for a slot in the queue. Each queue allows a limited number of concurrent jobs, so a query waits for an in-progress job to complete when the queue has reached that limit.
    EXECUTION PLANNING | Select executor nodes to run the query, retrieve split metadata from the KVStore for the pruned partitions, parallelize the fragments, assign fragments among the executor nodes, and assign splits to fragments, taking data locality into account.
    STARTING | Send an RPC to each executor with information about the fragments assigned to it.
    RUNNING | Wait for executor nodes to execute and complete the fragments assigned to them. Typically, queries spend most of their time in this job state.
    COMPLETED | Query completed successfully.
    CANCELLED | Query was cancelled, either by a user or because of an internal issue such as insufficient memory or heap.
    FAILED | Query failed due to an error.
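
    To make the progression through these states concrete, the following minimal sketch computes how long a job spent in each state from a list of state transitions. The JobState enum mirrors the table above. The time_per_state and dominant_state helpers, the input format (state, start time in milliseconds), and the example timeline are hypothetical illustrations and are not part of any Dremio API.

```python
from enum import Enum


class JobState(Enum):
    """Job states as listed in the table above."""
    PENDING = "PENDING"
    METADATA_RETRIEVAL = "METADATA_RETRIEVAL"
    PLANNING = "PLANNING"
    ENGINE_START = "ENGINE_START"
    QUEUED = "QUEUED"
    EXECUTION_PLANNING = "EXECUTION PLANNING"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"
    FAILED = "FAILED"


def time_per_state(transitions):
    """Return {state: milliseconds spent} from (state, start_ms) pairs.

    The pairs must be in time order. The last entry (normally a terminal
    state) has no successor, so no duration is recorded for it.
    """
    durations = {}
    for (state, start), (_, next_start) in zip(transitions, transitions[1:]):
        durations[state] = durations.get(state, 0) + (next_start - start)
    return durations


def dominant_state(transitions):
    """Return the state in which the job spent the most time."""
    durations = time_per_state(transitions)
    return max(durations, key=durations.get)


# Made-up timeline, for illustration only.
example = [
    (JobState.PENDING, 0),
    (JobState.METADATA_RETRIEVAL, 5),
    (JobState.PLANNING, 25),
    (JobState.QUEUED, 125),
    (JobState.EXECUTION_PLANNING, 130),
    (JobState.STARTING, 150),
    (JobState.RUNNING, 160),
    (JobState.COMPLETED, 2160),
]

print(dominant_state(example).value)
```

    In this made-up timeline the job spends most of its time in RUNNING, which matches the typical case described in the table. A job that instead spends most of its time in QUEUED or PLANNING would point to queue limits or planning cost rather than execution.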

    Visualized Plan

    The Visualized Plan view shows a diagram of the query plan and a job summary, along with job metrics. The visualized plan is useful for understanding the flow of the query.

    Note:
    The detailed visualized plan diagram is always read from the bottom up.

    Planning

    The Planning view shows planning metrics, the query output schema, non-default options, and a job summary, along with job metrics.

    The Planning view provides statistics about the actual cost of the query operations in terms of memory, I/O, and CPU processing. You can use this profile to identify which operations consumed the majority of the resources during a query, and then modify the physical plan to address the cost-intensive operations. In particular, the following information is useful:

    • Non-Default Options
    • Metadata Cache Hits and Misses, with times
    • Final Physical Transformation – For example, look for pushdown queries for RDBMS, MongoDB, or Elasticsearch sources, filter pushdowns or partition pruning for Parquet, the usage of stripes for ORC, and so on.
    • Estimated versus actual row counts – Compare the estimated row count with the actual scan, join, or aggregate result.
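
    The comparison of estimated and actual row counts can be sketched as a small check. The following example is a hypothetical illustration: the function names, the (name, estimated, actual) input format, the sample numbers, and the 10x threshold are all assumptions, and the values would have to be read manually from the profile.

```python
def estimate_error_ratio(estimated_rows, actual_rows):
    """Return how many times the larger count exceeds the smaller (>= 1.0)."""
    # Clamp both counts to 1 so that empty results do not divide by zero.
    est = max(estimated_rows, 1)
    act = max(actual_rows, 1)
    return max(est, act) / min(est, act)


def flag_misestimates(operators, threshold=10.0):
    """Return the names of operators whose estimate is off by >= threshold."""
    return [
        name
        for name, est, act in operators
        if estimate_error_ratio(est, act) >= threshold
    ]


# Made-up (operator, estimated rows, actual rows) values, for illustration only.
operators = [
    ("scan", 1_000_000, 950_000),
    ("join", 50_000, 4_000_000),
    ("aggregate", 100, 0),
]

print(flag_misestimates(operators))
```

    A symmetric ratio is used here so that overestimates and underestimates are treated the same way. The threshold is arbitrary and would need tuning for a real workload.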

    Acceleration

    The Acceleration view shows the reflection outcome, canonicalized user query alternatives, reflection details, and a job summary, along with job metrics.

    The following considerations determine the acceleration process:

    • Considered, Matched, Chosen – The query is accelerated.
    • Considered, Matched, Not Chosen – The query is not accelerated because of either a costing issue or an exception that occurred during substitution.
    • Considered, Not Matched, Not Chosen – The query is not accelerated because the reflection does not have the data to accelerate.
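
    The three outcomes above can be read as a lookup on three flags. The following sketch is a hypothetical helper for interpreting what the Acceleration view reports. The names OUTCOMES and describe_outcome are assumptions and are not part of Dremio, and any combination that is not listed above is reported as undescribed rather than guessed.

```python
# Maps (considered, matched, chosen) to the explanation given in the list above.
OUTCOMES = {
    (True, True, True): "The query is accelerated.",
    (True, True, False): (
        "The query is not accelerated because of a costing issue "
        "or an exception during substitution."
    ),
    (True, False, False): (
        "The query is not accelerated because the reflection does not "
        "have the data to accelerate."
    ),
}


def describe_outcome(considered, matched, chosen):
    """Return the explanation for a reflection outcome, if one is listed."""
    return OUTCOMES.get(
        (considered, matched, chosen),
        "This combination is not described in the list above.",
    )


print(describe_outcome(True, True, False))
```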

    Error

    The Error view shows information about an error.

    • Failure Node – This node is always the coordinator node.
    • Server Name – The server name inside the error message identifies the node that was actually affected.

    For More Information