SYS.PROJECT.HISTORY.JOBS

The sys.project.history.jobs table contains the historical metadata for the jobs in a project.

Syntax

SELECT *
FROM sys.project.history.jobs

note

Dremio refreshes the data in sys.project.history.jobs once per hour, so it usually takes about 1 hour for query information to appear. If a query is run soon after Dremio's hourly refresh and takes a long time to finish, it could take more than 1 hour for the query's data to appear in the sys.project.history.jobs table.

Example Output

job_id	status	query_type	user_name	queried_datasets	scanned_datasets	execution_cpu_time_ns	attempt_count	submitted_ts	attempt_started_ts	metadata_retrieval_ts	planning_start_ts	query_enqueued_ts	engine_start_ts	execution_planning_start_ts	execution_start_ts	final_state_ts	submitted_epoch	attempt_started_epoch	metadata_retrieval_epoch	planning_start_epoch	query_enqueued_epoch	engine_start_epoch	execution_planning_start_epoch	execution_start_epoch	final_state_epoch	planner_estimated_cost	rows_scanned	bytes_scanned	rows_returned	bytes_returned	accelerated	queue_name	engine	execution_nodes	memory_available	error_msg	query	query_chunks	reflection_matches	starting_ts	starting_ts_epoch
1e064d24-8d14-4b07-b55f-244a3dcc0700	COMPLETED	METADATA_REFRESH	$dremio$	null	null	50016617529	1	2022-02-01 22:23:22.977	2022-02-01 22:23:22.977	null	null	2022-02-01 22:25:36.504	null	2022-02-01 22:25:36.545	2022-02-01 22:25:38.431	2022-02-01 22:25:52.257	1643754202977	1643754202977	0	0	1643754336504	0	1643754336545	1643754338431	1643754352257	2.41905854E9	0	0	0	0	false	LARGE	First Engine	[{"node_id":"172.31.118.190","hostname":"172.31.118.190","max_mem_used":957160000},{"node_id":"172.31.127.16","hostname":"172.31.127.16","max_mem_used":963517000},{"node_id":"172.31.116.185","hostname":"172.31.116.185","max_mem_used":1208439000}]	2526309640	null	REFRESH DATASET "Samples"."samples.dremio.com"."NYC-taxi-trips"	null	null	2022-10-31 17:09:05.413	1667236145413000000
1e042853-bf92-9c56-c9ce-c09193c2c200	COMPLETED	UI_RUN	test.user@dremio.com	[{"name":"sys.project.history.jobs","type":"INVALID_DATASET_TYPE"}]	[{"name":""__history".sys.project.history.jobs","type":"PHYSICAL_DATASET"}]	1147159345	1	2022-02-03 13:24:59.474	2022-02-03 13:24:59.474	null	2022-02-03 13:25:02.600	2022-02-03 13:27:17.973	null	2022-02-03 13:27:17.989	2022-02-03 13:27:19.466	2022-02-03 13:27:31.955	1643894699474	1643894699474	0	1643894702600	1643894837973	0	1643894837989	1643894839466	1643894851955	685.0	12	14428	12	14428	false	SMALL	First Engine	[{"node_id":"172.31.114.190","hostname":"172.31.114.190","max_mem_used":11957000}]	4032436	null	select * from sys.project.history.jobs	null	null	2022-10-31 17:10:54.007	1667236254007000000

Fields

Field	Data Type	Description
job_id	varchar	The UUID to identify the job.
status	varchar	The state of the job. Enum: `CANCELED`, `COMPLETED`, `FAILED`
query_type	varchar	The mechanism used to submit the job. Enum: `ACCELERATOR_CREATE`, `ACCELERATOR_DROP`, `ACCELERATOR_EXPLAIN`, `FLIGHT`,`INTERNAL_ICEBERG_METADATA_DROP`, `JDBC`, `UI_EXPORT`, `UI_INTERNAL_PREVIEW`, `UI_INTERNAL_RUN`, `UI_PREVIEW`, `UI_RUN`, `METADATA_REFRESH`, `ODBC`, `PREPARE_INTERNAL`, `REST`, `UNKNOWN`
user_name	varchar	The username of the user who submitted the job. For jobs that are triggered by Dremio, this value is $dremio$ .
queried_datasets	[varchar]	An array representation of the fully-qualified dataset (table and view) names referenced by the job.
scanned_datasets	[varchar]	An array representation of the fully-qualified table names or Reflection IDs scanned during the process. Note: The scanned datasets are often different from the `queried_datasets`.
execution_cpu_time_ns	integer	The time (in microseconds) spent executing the job.
attempt_count	integer	The number of attempts that the job was attempted.
submitted_ts	timestamp	The date and time when the job was submitted to the system.
attempt_started_ts	timestamp	The date and time when latest attempt of the job was started. In most cases, this will be the same value as `submitted_ts`.
metadata_retrieval_ts	timestamp	The date and time when the metadata retrieval phase of the job started.
planning_start_ts	timestamp	The date and time when the planning phase of the job started.
query_enqueued_ts	timestamp	The date and time when the job was first submitted to the engine.
engine_start_ts	timestamp	The date and time when the engine replica start up was triggered.
execution_planning_start_ts	timestamp	The date and time when the execution planning phase of the job started.
execution_start_ts	timestamp	The date and time when the job started processing on the engine replicas.
final_state_ts	timestamp	The date and time when the job execution reached a final state (complete, failed, or canceled).
submitted_epoch	integer	The date and time (in epoch nanoseconds) when the job was submitted to the system.
attempt_started_epoch	integer	The date and time (in epoch nanoseconds) when this attempt started. In most cases, this will be the same value as `submitted_epoch`.
metadata_retrieval_epoch	integer	The date and time (in epoch nanoseconds) when the metadata retrieval phase of the job started.
planning_start_epoch	integer	The date and time (in epoch nanoseconds) when the planning phase of the job started.
query_enqueued_epoch	integer	The date and time (in epoch nanoseconds) when the job was first submitted to the engine.
engine_start_epoch	integer	The date and time (in epoch nanoseconds) when the engine replica start up was triggered.
execution_planning_start_epoch	integer	The date and time (in epoch nanoseconds) when the execution planning phase of the job started.
execution_start_epoch	integer	The date and time (in epoch nanoseconds) when the job started processing on the engine replicas.
final_state_epoch	integer	The date and time (in epoch nanoseconds) when the job execution reached a final state (complete, failed, or canceled).
planner_estimated_cost	double	The estimated cost value provided by the planner.
rows_scanned	integer	The sum of rows that were scanned by all the scans in the job.
bytes_scanned	integer	The sum of the bytes that were scanned by all the scans in the job.
rows_returned	integer	The number of rows returned by the job.
bytes_returned	integer	The number of bytes returned by the job results.
accelerated	boolean	This value will be `true` if the job was accelerated by a Reflection; `false` otherwise.
queue_name	varchar	This column does not apply to Dremio Cloud.
engine	varchar	The name of the engine that the job is executing on.
execution_nodes	[ struct ]	An array representation of the nodes in each replica that is up.
memory_available	integer	The memory allocated to the job.
error_msg	varchar	This value will contain the error message that resulted from the job if the job failed; "" otherwise.
query	varchar	The SQL query that was submitted, limited to 32KB. If the query exceeds 32KB, the rest of the query is stored in the `query_chunks` column.
query_chunks	[ varchar ]	An array representation of the query broken up into 32KB chunks. The number of elements in this array is limited to 128. If the query text larger than 4MB, the query text beyond 128 is ignored.
reflection_matches	[ struct ]	An array of Reflections that were evaluated for this job. The following fields are part of each array element: { "reflection_id": [The UUID of the reflection], "reflection_type": [RAW, AGGREGATE, or EXTERNAL], "dataset_id": [The UUID of the dataset that the Reflection is on.], "dataset_name": [The name of the dataset that the Reflection is on.], "materialization_id": [The UUID to identify the materialization], "chosen" - [`true` if this materialization was chosen for this job; `false` otherwise.]
starting_ts	timestamp	The date and time when the job entered the STARTING state.
starting_epoch	bigint	The date and time (in epoch nanoseconds) when the job entered the STARTING state.
is_profile_incomplete	boolean	If the job has incomplete telemetry data, `true`. Otherwise, `false`.

Example Output​

Fields​

Example Output

Fields