Skip to main content

SYS.PROJECT.HISTORY.JOBS

The sys.project.history.jobs table contains the historical metadata for the jobs in a project.

Syntax
SELECT *
FROM sys.project.history.jobs
note

Dremio refreshes the data in sys.project.history.jobs once per hour, so it usually takes about 1 hour for query information to appear. If a query is run soon after Dremio's hourly refresh and takes a long time to finish, it could take more than 1 hour for the query's data to appear in the sys.project.history.jobs table.

Example Output

job_idstatusquery_typeuser_namequeried_datasetsscanned_datasetsexecution_cpu_time_nsattempt_countsubmitted_tsattempt_started_tsmetadata_retrieval_tsplanning_start_tsquery_enqueued_tsengine_start_tsexecution_planning_start_tsexecution_start_tsfinal_state_tssubmitted_epochattempt_started_epochmetadata_retrieval_epochplanning_start_epochquery_enqueued_epochengine_start_epochexecution_planning_start_epochexecution_start_epochfinal_state_epochplanner_estimated_costrows_scannedbytes_scannedrows_returnedbytes_returnedacceleratedqueue_nameengineexecution_nodesmemory_availableerror_msgqueryquery_chunksreflection_matchesstarting_tsstarting_ts_epoch
1e064d24-8d14-4b07-b55f-244a3dcc0700COMPLETEDMETADATA_REFRESH$dremio$nullnull5001661752912022-02-01 22:23:22.9772022-02-01 22:23:22.977nullnull2022-02-01 22:25:36.504null2022-02-01 22:25:36.5452022-02-01 22:25:38.4312022-02-01 22:25:52.2571643754202977164375420297700164375433650401643754336545164375433843116437543522572.41905854E90000falseLARGEFirst Engine[{"node_id":"172.31.118.190","hostname":"172.31.118.190","max_mem_used":957160000},{"node_id":"172.31.127.16","hostname":"172.31.127.16","max_mem_used":963517000},{"node_id":"172.31.116.185","hostname":"172.31.116.185","max_mem_used":1208439000}]2526309640nullREFRESH DATASET "Samples"."samples.dremio.com"."NYC-taxi-trips"nullnull2022-10-31 17:09:05.4131667236145413000000
1e042853-bf92-9c56-c9ce-c09193c2c200COMPLETEDUI_RUNtest.user@dremio.com[{"name":"sys.project.history.jobs","type":"INVALID_DATASET_TYPE"}][{"name":"\"__history\".sys.project.history.jobs","type":"PHYSICAL_DATASET"}]114715934512022-02-03 13:24:59.4742022-02-03 13:24:59.474null2022-02-03 13:25:02.6002022-02-03 13:27:17.973null2022-02-03 13:27:17.9892022-02-03 13:27:19.4662022-02-03 13:27:31.955164389469947416438946994740164389470260016438948379730164389483798916438948394661643894851955685.012144281214428falseSMALLFirst Engine[{"node_id":"172.31.114.190","hostname":"172.31.114.190","max_mem_used":11957000}]4032436nullselect * from sys.project.history.jobsnullnull2022-10-31 17:10:54.0071667236254007000000

Fields

FieldData TypeDescription
job_idvarcharThe UUID to identify the job.
statusvarcharThe state of the job.
Enum: CANCELED, COMPLETED, FAILED
query_typevarcharThe mechanism used to submit the job.
Enum: ACCELERATOR_CREATE, ACCELERATOR_DROP, ACCELERATOR_EXPLAIN, FLIGHT,INTERNAL_ICEBERG_METADATA_DROP, JDBC, UI_EXPORT, UI_INTERNAL_PREVIEW, UI_INTERNAL_RUN, UI_PREVIEW, UI_RUN, METADATA_REFRESH, ODBC, PREPARE_INTERNAL, REST, UNKNOWN
user_namevarcharThe username of the user who submitted the job. For jobs that are triggered by Dremio, this value is $dremio$.
queried_datasets[varchar]An array representation of the fully-qualified dataset (table and view) names referenced by the job.
scanned_datasets[varchar]An array representation of the fully-qualified table names or reflection IDs scanned during the process.
Note: The scanned datasets are often different from the queried_datasets.
execution_cpu_time_nsintegerThe time (in microseconds) spent executing the job.
attempt_countintegerThe number of attempts that the job was attempted.
submitted_tstimestampThe date and time when the job was submitted to the system.
attempt_started_tstimestampThe date and time when latest attempt of the job was started. In most cases, this will be the same value as submitted_ts.
metadata_retrieval_tstimestampThe date and time when the metadata retrieval phase of the job started.
planning_start_tstimestampThe date and time when the planning phase of the job started.
query_enqueued_tstimestampThe date and time when the job was first submitted to the engine.
engine_start_tstimestampThe date and time when the engine replica start up was triggered.
execution_planning_start_tstimestampThe date and time when the execution planning phase of the job started.
execution_start_tstimestampThe date and time when the job started processing on the engine replicas.
final_state_tstimestampThe date and time when the job execution reached a final state (complete, failed, or canceled).
submitted_epochintegerThe date and time (in epoch nanoseconds) when the job was submitted to the system.
attempt_started_epochintegerThe date and time (in epoch nanoseconds) when this attempt started. In most cases, this will be the same value as submitted_epoch.
metadata_retrieval_epochintegerThe date and time (in epoch nanoseconds) when the metadata retrieval phase of the job started.
planning_start_epochintegerThe date and time (in epoch nanoseconds) when the planning phase of the job started.
query_enqueued_epochintegerThe date and time (in epoch nanoseconds) when the job was first submitted to the engine.
engine_start_epochintegerThe date and time (in epoch nanoseconds) when the engine replica start up was triggered.
execution_planning_start_epochintegerThe date and time (in epoch nanoseconds) when the execution planning phase of the job started.
execution_start_epochintegerThe date and time (in epoch nanoseconds) when the job started processing on the engine replicas.
final_state_epochintegerThe date and time (in epoch nanoseconds) when the job execution reached a final state (complete, failed, or canceled).
planner_estimated_costdoubleThe estimated cost value provided by the planner.
rows_scannedintegerThe sum of rows that were scanned by all the scans in the job.
bytes_scannedintegerThe sum of the bytes that were scanned by all the scans in the job.
rows_returnedintegerThe number of rows returned by the job.
bytes_returnedintegerThe number of bytes returned by the job results.
acceleratedbooleanThis value will be true if the job was accelerated by a reflection; false otherwise.
queue_namevarcharThis column does not apply to Dremio Cloud.
enginevarcharThe name of the engine that the job is executing on.
execution_nodes[ struct ]An array representation of the nodes in each replica that is up.
memory_availableintegerThe memory allocated to the job.
error_msgvarcharThis value will contain the error message that resulted from the job if the job failed; "" otherwise.
queryvarcharThe SQL query that was submitted, limited to 32KB. If the query exceeds 32KB, the rest of the query is stored in the query_chunks column.
query_chunks[ varchar ]An array representation of the query broken up into 32KB chunks. The number of elements in this array is limited to 128. If the query text larger than 4MB, the query text beyond 128 is ignored.
reflection_matches[ struct ]An array of reflections that were evaluated for this job. The following fields are part of each array element:
{ "reflection_id": [The UUID of the reflection], "reflection_type": [RAW, AGGREGATE, or EXTERNAL], "dataset_id": [The UUID of the dataset that the reflection is on.], "dataset_name": [The name of the dataset that the reflection is on.], "materialization_id": [The UUID to identify the materialization], "chosen" - [true if this materialization was chosen for this job; false otherwise.]
starting_tstimestampThe date and time when the job entered the STARTING state.
starting_ts_epochbigintThe date and time (in epoch nanoseconds) when the job entered the STARTING state.