SYS.PROJECT.HISTORY.JOBS
The sys.project.history.jobs table contains the historical metadata for the jobs in a project.
```sql
SELECT *
FROM sys.project.history.jobs
```
Note: Dremio refreshes the data in sys.project.history.jobs once per hour, so it usually takes about 1 hour for query information to appear. If a query is run soon after Dremio's hourly refresh and takes a long time to finish, it could take more than 1 hour for the query's data to appear in the sys.project.history.jobs table.
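For example, a query like the following can be used to find recently failed jobs. This is a sketch using columns documented below; adjust the time window to suit your needs:

```sql
-- Failed jobs submitted in the last 24 hours, most recent first.
SELECT job_id, user_name, query_type, submitted_ts, error_msg
FROM sys.project.history.jobs
WHERE status = 'FAILED'
  AND submitted_ts > TIMESTAMPADD(HOUR, -24, CURRENT_TIMESTAMP)
ORDER BY submitted_ts DESC
```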
Example Output
job_id | status | query_type | user_name | queried_datasets | scanned_datasets | execution_cpu_time_ns | attempt_count | submitted_ts | attempt_started_ts | metadata_retrieval_ts | planning_start_ts | query_enqueued_ts | engine_start_ts | execution_planning_start_ts | execution_start_ts | final_state_ts | submitted_epoch | attempt_started_epoch | metadata_retrieval_epoch | planning_start_epoch | query_enqueued_epoch | engine_start_epoch | execution_planning_start_epoch | execution_start_epoch | final_state_epoch | planner_estimated_cost | rows_scanned | bytes_scanned | rows_returned | bytes_returned | accelerated | queue_name | engine | execution_nodes | memory_available | error_msg | query | query_chunks | reflection_matches | starting_ts | starting_ts_epoch |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1e064d24-8d14-4b07-b55f-244a3dcc0700 | COMPLETED | METADATA_REFRESH | $dremio$ | null | null | 50016617529 | 1 | 2022-02-01 22:23:22.977 | 2022-02-01 22:23:22.977 | null | null | 2022-02-01 22:25:36.504 | null | 2022-02-01 22:25:36.545 | 2022-02-01 22:25:38.431 | 2022-02-01 22:25:52.257 | 1643754202977 | 1643754202977 | 0 | 0 | 1643754336504 | 0 | 1643754336545 | 1643754338431 | 1643754352257 | 2.41905854E9 | 0 | 0 | 0 | 0 | false | LARGE | First Engine | [{"node_id":"172.31.118.190","hostname":"172.31.118.190","max_mem_used":957160000},{"node_id":"172.31.127.16","hostname":"172.31.127.16","max_mem_used":963517000},{"node_id":"172.31.116.185","hostname":"172.31.116.185","max_mem_used":1208439000}] | 2526309640 | null | REFRESH DATASET "Samples"."samples.dremio.com"."NYC-taxi-trips" | null | null | 2022-10-31 17:09:05.413 | 1667236145413000000 |
1e042853-bf92-9c56-c9ce-c09193c2c200 | COMPLETED | UI_RUN | test.user@dremio.com | [{"name":"sys.project.history.jobs","type":"INVALID_DATASET_TYPE"}] | [{"name":"\"__history\".sys.project.history.jobs","type":"PHYSICAL_DATASET"}] | 1147159345 | 1 | 2022-02-03 13:24:59.474 | 2022-02-03 13:24:59.474 | null | 2022-02-03 13:25:02.600 | 2022-02-03 13:27:17.973 | null | 2022-02-03 13:27:17.989 | 2022-02-03 13:27:19.466 | 2022-02-03 13:27:31.955 | 1643894699474 | 1643894699474 | 0 | 1643894702600 | 1643894837973 | 0 | 1643894837989 | 1643894839466 | 1643894851955 | 685.0 | 12 | 14428 | 12 | 14428 | false | SMALL | First Engine | [{"node_id":"172.31.114.190","hostname":"172.31.114.190","max_mem_used":11957000}] | 4032436 | null | select * from sys.project.history.jobs | null | null | 2022-10-31 17:10:54.007 | 1667236254007000000 |
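A common use of this table is to summarize workload by submission mechanism and outcome. A sketch, grouping on the query_type and status columns shown in the example output:

```sql
-- Count jobs per query type and final status.
SELECT query_type, status, COUNT(*) AS job_count
FROM sys.project.history.jobs
GROUP BY query_type, status
ORDER BY job_count DESC
```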
Fields
Field | Data Type | Description |
---|---|---|
job_id | varchar | The UUID to identify the job. |
status | varchar | The state of the job. Enum: CANCELED, COMPLETED, FAILED |
query_type | varchar | The mechanism used to submit the job. Enum: ACCELERATOR_CREATE, ACCELERATOR_DROP, ACCELERATOR_EXPLAIN, FLIGHT, INTERNAL_ICEBERG_METADATA_DROP, JDBC, UI_EXPORT, UI_INTERNAL_PREVIEW, UI_INTERNAL_RUN, UI_PREVIEW, UI_RUN, METADATA_REFRESH, ODBC, PREPARE_INTERNAL, REST, UNKNOWN |
user_name | varchar | The username of the user who submitted the job. For jobs that are triggered by Dremio, this value is $dremio$. |
queried_datasets | [varchar] | An array representation of the fully-qualified dataset (table and view) names referenced by the job. |
scanned_datasets | [varchar] | An array representation of the fully-qualified table names or reflection IDs scanned during the process. Note: The scanned datasets are often different from the queried_datasets. |
execution_cpu_time_ns | bigint | The CPU time (in nanoseconds) spent executing the job. |
attempt_count | integer | The number of times the job was attempted. |
submitted_ts | timestamp | The date and time when the job was submitted to the system. |
attempt_started_ts | timestamp | The date and time when the latest attempt of the job was started. In most cases, this will be the same value as submitted_ts. |
metadata_retrieval_ts | timestamp | The date and time when the metadata retrieval phase of the job started. |
planning_start_ts | timestamp | The date and time when the planning phase of the job started. |
query_enqueued_ts | timestamp | The date and time when the job was first submitted to the engine. |
engine_start_ts | timestamp | The date and time when the engine replica start up was triggered. |
execution_planning_start_ts | timestamp | The date and time when the execution planning phase of the job started. |
execution_start_ts | timestamp | The date and time when the job started processing on the engine replicas. |
final_state_ts | timestamp | The date and time when the job execution reached a final state (complete, failed, or canceled). |
submitted_epoch | bigint | The date and time (in epoch milliseconds) when the job was submitted to the system. |
attempt_started_epoch | bigint | The date and time (in epoch milliseconds) when this attempt started. In most cases, this will be the same value as submitted_epoch. |
metadata_retrieval_epoch | bigint | The date and time (in epoch milliseconds) when the metadata retrieval phase of the job started. |
planning_start_epoch | bigint | The date and time (in epoch milliseconds) when the planning phase of the job started. |
query_enqueued_epoch | bigint | The date and time (in epoch milliseconds) when the job was first submitted to the engine. |
engine_start_epoch | bigint | The date and time (in epoch milliseconds) when the engine replica start up was triggered. |
execution_planning_start_epoch | bigint | The date and time (in epoch milliseconds) when the execution planning phase of the job started. |
execution_start_epoch | bigint | The date and time (in epoch milliseconds) when the job started processing on the engine replicas. |
final_state_epoch | bigint | The date and time (in epoch milliseconds) when the job execution reached a final state (complete, failed, or canceled). |
planner_estimated_cost | double | The estimated cost value provided by the planner. |
rows_scanned | integer | The total number of rows scanned by all scans in the job. |
bytes_scanned | integer | The total number of bytes scanned by all scans in the job. |
rows_returned | integer | The number of rows returned by the job. |
bytes_returned | integer | The number of bytes returned by the job results. |
accelerated | boolean | This value will be true if the job was accelerated by a reflection; false otherwise. |
queue_name | varchar | This column does not apply to Dremio Cloud. |
engine | varchar | The name of the engine that the job is executing on. |
execution_nodes | [struct] | An array representation of the nodes in each replica that is up. |
memory_available | integer | The memory allocated to the job. |
error_msg | varchar | This value will contain the error message that resulted from the job if the job failed; "" otherwise. |
query | varchar | The SQL query that was submitted, limited to 32KB. If the query exceeds 32KB, the rest of the query is stored in the query_chunks column. |
query_chunks | [varchar] | An array representation of the query broken up into 32KB chunks. The number of elements in this array is limited to 128. If the query text is larger than 4MB (128 chunks of 32KB), the text beyond the first 128 chunks is ignored. |
reflection_matches | [struct] | An array of the reflections that were evaluated for this job. Each array element contains the following fields: { "reflection_id": the UUID of the reflection, "reflection_type": RAW, AGGREGATE, or EXTERNAL, "dataset_id": the UUID of the dataset that the reflection is on, "dataset_name": the name of the dataset that the reflection is on, "materialization_id": the UUID that identifies the materialization, "chosen": true if this materialization was chosen for this job; false otherwise } |
starting_ts | timestamp | The date and time when the job entered the STARTING state. |
starting_ts_epoch | bigint | The date and time (in epoch nanoseconds) when the job entered the STARTING state. |
is_profile_incomplete | boolean | If the job has incomplete telemetry data, true . Otherwise, false . |
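Because the timestamp columns mark successive phases of a job's lifecycle, per-phase durations can be derived by subtracting adjacent timestamps. A sketch using the columns above (note that several of these columns can be null for some job types, in which case the corresponding difference is also null):

```sql
-- Approximate per-phase durations (in seconds) for completed jobs.
SELECT
  job_id,
  TIMESTAMPDIFF(SECOND, submitted_ts, query_enqueued_ts)       AS pre_queue_seconds,
  TIMESTAMPDIFF(SECOND, query_enqueued_ts, execution_start_ts) AS queued_seconds,
  TIMESTAMPDIFF(SECOND, execution_start_ts, final_state_ts)    AS execution_seconds,
  TIMESTAMPDIFF(SECOND, submitted_ts, final_state_ts)          AS total_seconds
FROM sys.project.history.jobs
WHERE status = 'COMPLETED'
ORDER BY total_seconds DESC
```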