    Managing Elastic Engines

    Multiple AWS Engines

    The AWS Edition of Dremio can provision multiple separate execution engines from a single Dremio coordinator node, schedule those engines to run independently at different times, and start and stop them automatically based on workload requirements at runtime. This provides several benefits, including:

    • Workloads are isolated within their own set of CPU, memory, and C3 resources and are not impacted by other workloads
    • Time-sensitive yet resource-intensive workloads (nightly jobs, reflection refreshes, etc.) can be provisioned with the appropriate amount of resources to complete on time, but remain cost effective by only running when required
    • Track cost by team by running workloads on their own resources
    • Right-size execution resources for each distinct workload, instead of implementing a one-size-fits-all model
    • Easily experiment with different execution resource sizes, at any scale
    Navigate to the Engines screen to view all current engines.

    Creating AWS Engines

    Create and configure new execution engines for AWS Edition by navigating to the Elastic Engines page and clicking New Engine.

    • Engine Name - The name for the engine
    • Engine Size (Nodes) - The number of nodes in the Dremio cluster. Choose Custom to override the default instance type for each selection.
    • Number of Nodes - The number of nodes in the engine. It only appears when Custom is selected for Engine Size (Nodes).
    • Engine Node Type - The EC2 instance type of the nodes in the engine. It only appears when Custom is selected for Engine Size (Nodes). The instance types are:
      • m5d.2xlarge (8c/32gb) - Evaluation
      • m5d.8xlarge (32c/128gb) - Standard (default)
      • r5d.4xlarge (16c/128gb) - High memory per unit of compute
      • c5d.18xlarge (72c/144gb) - High CPU
      • i3.4xlarge (16c/122gb) - Higher local storage for caching
    • Advanced Options - See “Advanced Configuration” below

    After specifying the Engine Name, Instance Type and Node Count, create and start the execution engine by selecting Save & Launch, which:

    1. Saves the execution engine’s configuration
    2. Starts the execution engine and provisions EC2 nodes

    Execution engines typically complete AWS EC2 provisioning and startup in less than one minute; however, timeframes can vary based on instance availability in EC2.

    After launching an execution engine, new EC2 instances are provisioned as executor nodes. The state of each node can be monitored on the Provisioning page. Nodes start in the Pending state while waiting for AWS EC2 to identify resources, transition to the Provisioning state during startup, and finally transition to Active once the EC2 node and the Dremio software are fully available on the node.

    The execution engine is available for queries once at least 70% of the nodes are available. If that percentage is not available after 5 minutes, Dremio cancels the operation and terminates the engine, including any launched nodes.
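    The availability rule above can be sketched as a small state check. This is an illustration only, not Dremio's implementation: the function name, the state labels, and the order in which the threshold and the timeout are tested are assumptions. Only the 70% and 5-minute figures come from the text.

```python
READY_PERCENT = 70          # from the text: at least 70% of nodes
LAUNCH_TIMEOUT_S = 5 * 60   # from the text: 5 minutes


def engine_launch_state(active_nodes: int, total_nodes: int, elapsed_s: float) -> str:
    """Classify a launching engine as 'available', 'waiting', or 'terminated'.

    Hypothetical helper; integer arithmetic avoids float rounding at the threshold.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    # Engine accepts queries once at least 70% of its nodes are active.
    if active_nodes * 100 >= READY_PERCENT * total_nodes:
        return "available"
    # Below the threshold after 5 minutes: the launch is cancelled.
    if elapsed_s >= LAUNCH_TIMEOUT_S:
        return "terminated"
    return "waiting"
```

    For a 10-node engine, 7 active nodes is enough to accept queries, while 6 active nodes keeps the engine waiting until the 5-minute limit is reached.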

    Monitoring Engines

    Click an engine on the Engines page to display a detailed status of the engine or to stop the engine. Click Edit Settings to reconfigure the engine.

    The Node Activity page shows each coordinator and executor node, the engine associated with the node, and the Stop Project button.

    Stopping Engines

    Running engines are stopped by selecting Stop on the Elastic Engines page. Stopping an engine immediately cancels all active queries on the engine and deletes all resources provisioned for the engine, including EC2 instances, EBS volumes, etc.

    Starting Engines

    Start a stopped engine by clicking Start on the Elastic Engines page. Starting an engine initiates the same launch process as when the engine was first created and started with Save & Launch, described above.

    Routing Queries

    There are two methods available to control which engine queries run on.

    1. Workload Management based Routing
    2. Direct Routing

    Workload Management based Routing

    Workload Management can control the engine each query runs on by defining Rules processed during runtime that determine the Queue for each individual query. In deployments with only one engine, all Queues share the same execution resources and route to the same single engine. However, when provisioning multiple engines, each Queue can be scheduled on a different engine.

    To route queries to a specific engine:

    1. Configure WLM Rules to route a set of queries to a specific Queue
    2. Go to Edit Queue and set the Queue’s Engine Name property to the desired engine

    The following example demonstrates routing high cost reflection rebuild queries to a specific engine just for large reflection rebuilds.

    The default Engine Name for a queue is Any, which allows queries in that WLM Queue to run on any engine available at the time of execution. Once a specific Engine Name is selected, queries in that Queue will only run on the selected engine. If the engine is not available, the query will fail with the error message Error: No executors are available, even if other engines are currently active.

    Note: Configurations that either have more than one engine or use the Auto Start feature should specify an Engine Name for all WLM Queues and make sure that no WLM Queues route to ‘Any’. ‘Any’ is intended only for configurations that do not make use of multiple engines and for backward compatibility. Queues that route to ‘Any’ will not automatically start an engine and can also route queries to engines that are stopped.
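    The recommendation in the note can be checked mechanically. The sketch below assumes the queue-to-engine mapping has already been collected into a plain dictionary by some means; the function name, the dictionary shape, and the queue and engine names are hypothetical and are not part of any Dremio API.

```python
from typing import Dict, List

ANY = "Any"  # default Engine Name for a WLM Queue, per the text


def queues_routed_to_any(
    queue_to_engine: Dict[str, str],
    engine_count: int,
    auto_start_used: bool,
) -> List[str]:
    """Return the queues that should be given a specific Engine Name.

    'Any' is acceptable only when there is a single engine and Auto Start
    is not in use; otherwise every queue left on 'Any' is reported.
    """
    if engine_count <= 1 and not auto_start_used:
        return []
    return sorted(q for q, engine in queue_to_engine.items() if engine == ANY)


# Hypothetical configuration with two engines.
queues = {
    "High Cost Reflections": "reflections-engine",
    "Low Cost User Queries": "Any",
}
flagged = queues_routed_to_any(queues, engine_count=2, auto_start_used=False)
```

    With two engines configured, the queue still routed to Any is flagged. With a single engine and no Auto Start, nothing is flagged.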

    Direct Routing

    Direct Routing is used to specify the exact Queue and engine to run queries on for a given ODBC or JDBC session. With Direct Routing, WLM Rules are not considered; instead, queries are routed directly to the specified Queue. Clients can be configured so that all queries run on a specific engine or so that queries run on different engines on a per-session basis.

    To use Direct Routing, add the connection property ROUTING_QUEUE = <WLM Queue Name> to the ODBC or JDBC session parameters when connecting to Dremio. When set, all queries for the session are automatically routed to the specified WLM Queue and the engine selected for that Queue.

    To disable Direct Routing, set the Dremio support key dremio.wlm.direct_routing to false. By default, Direct Routing is enabled.

    The Workload Management documentation provides details on how to configure ODBC and JDBC connections for Direct Routing.
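    As a rough illustration of the ROUTING_QUEUE property, the helper below appends it to a JDBC-style connection string. It assumes that connection properties are written as semicolon-separated key=value pairs after the base URL; the helper name, host name, port, and queue name are hypothetical, so verify the exact syntax against your driver's documentation.

```python
def with_routing_queue(base_url: str, queue_name: str) -> str:
    """Append ROUTING_QUEUE=<queue_name> to a semicolon-delimited connection string.

    Hypothetical helper: assumes key=value properties separated by ';'.
    """
    if not queue_name or ";" in queue_name or "=" in queue_name:
        raise ValueError("queue name must be non-empty and must not contain ';' or '='")
    return f"{base_url.rstrip(';')};ROUTING_QUEUE={queue_name}"


# Hypothetical host and queue name.
url = with_routing_queue(
    "jdbc:dremio:direct=dremio.example.com:31010",
    "High Cost Reflections",
)
```

    Every query issued on a session opened with this string would be sent to the named WLM Queue and therefore to the engine selected for that Queue.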

    Automatic Engine Management

    Dremio supports automatic management of engines and can automatically start and stop engines based on workload requirements. Automatic Engine Management eliminates the need to manually provision and terminate engines and helps users optimize the cost of Dremio by only running resources when workloads require them.

    Auto Start

    Auto Start instructs Dremio to automatically provision and start an engine when new queries are issued to that engine. When Auto Start is enabled in the engine’s Properties page, Dremio automatically starts the engine under the following conditions:

    1. The engine is selected as the queue’s Engine Name
    2. The execution engine is currently stopped

    Engines are only automatically started if they are selected as the Engine Name for a queue. If the Engine Name for a queue is Any and no engines are running, the query will fail with Error: No executors are available instead of selecting a random engine to start.
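    The Auto Start conditions can be summarized as a small decision function. This is a simplified model of the behavior described above, not Dremio's scheduler: the Engine class, the dispatch function, and the return labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

ANY = "Any"
NO_EXECUTORS = "Error: No executors are available"


@dataclass
class Engine:
    running: bool
    auto_start: bool = False


def dispatch(queue_engine: str, engines: Dict[str, Engine]) -> str:
    """Return 'run', 'auto-start', or the error text for a query on a queue."""
    if queue_engine == ANY:
        # 'Any' never triggers Auto Start; it needs an engine that is already running.
        if any(e.running for e in engines.values()):
            return "run"
        return NO_EXECUTORS
    engine = engines.get(queue_engine)
    if engine is None:
        return NO_EXECUTORS
    if engine.running:
        return "run"
    # Stopped engine selected as the queue's Engine Name: start it only if enabled.
    return "auto-start" if engine.auto_start else NO_EXECUTORS


# Hypothetical engines: both stopped, only "etl" has Auto Start enabled.
engines = {"etl": Engine(running=False, auto_start=True), "bi": Engine(running=False)}
```

    Here a query on a queue bound to "etl" triggers a start, while a query on a queue bound to "bi" or to Any fails because nothing is running.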

    Auto Stop

    Auto Stop instructs Dremio to automatically stop an engine after a period of inactivity. When you select a time period for Auto Stop in the engine’s Properties page, all executor nodes associated with the engine will stop automatically if no queries are issued to the engine for the selected time period. The default timeout for Auto Stop is two hours. To disable Auto Stop, select the Disable Auto Stop option.
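    The idle-timeout rule can be sketched as follows. This is an assumed model rather than Dremio's code: the function and parameter names are hypothetical, a timeout of None stands in for the Disable Auto Stop option, and the has_wlm_queue flag reflects the behavior, noted later on this page, that only engines referenced by a WLM Queue are stopped automatically.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_AUTO_STOP = timedelta(hours=2)  # default timeout, per the text


def should_auto_stop(
    last_query_at: datetime,
    now: datetime,
    timeout: Optional[timedelta] = DEFAULT_AUTO_STOP,
    has_wlm_queue: bool = True,
) -> bool:
    """Return True when an idle engine would be stopped automatically."""
    if timeout is None:
        return False  # Auto Stop disabled
    if not has_wlm_queue:
        return False  # engines not configured within WLM do not auto stop
    return now - last_query_at >= timeout


t0 = datetime(2024, 1, 1, 0, 0)  # hypothetical time of the last query
```

    With the default timeout, an engine whose last query arrived at t0 becomes eligible to stop two hours later, and not before.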

    Advanced Configuration

    By default, the Set Up Elastic Engine dialog automatically selects the security group ID and EC2 key pair used to launch the existing engine.

    Engine Properties

    Additional engine properties to configure are:

    • Name - Name of the engine
    • Instance Type - AWS instance type used for engine nodes. The supported types are:
      • m5d.2xlarge (8c/32gb) - General Purpose
      • m5d.8xlarge (32c/128gb) - General Purpose (default)
      • m5ad.8xlarge (32c/128gb) - General Purpose
      • r5d.4xlarge (16c/128gb) - Higher memory per unit of compute
      • c5d.18xlarge (72c/144gb) - Higher compute
      • i3.4xlarge (16c/122gb) - Higher local storage for caching
    • Instance Count - Number of execution nodes
    • EC2 Key Pair Name - AWS Key Pair used to log on to server instances
    • Security Group ID - The Group ID of the Security Group to use
    • IAM Role for S3 Access - IAM Role used to access S3 buckets for Distributed Storage (optional)
    • Use Clustered Placement - Whether to use placement groups, which locate nodes closer together. Enabling this option is recommended, but AWS can take longer to identify resources for larger engines
    • Enable Auto Start - Automatically start the engine when new queries are submitted to the engine
    • Enable Auto Stop - Automatically stop the engine after the selected period of inactivity (see Auto Stop above)
    • AWS API Authentication Mode - The authentication mode used to provision and stop EC2 nodes. Auto uses an IAM Role to manage execution nodes. Alternatively, an AWS Access Key and Secret Key can be used
    • Access Key - The AWS Access Key if using Key/Secret authentication
    • Secret - The AWS Secret Key if using Key/Secret authentication
    • IAM Role for API Operations - The IAM Role to assume to provision and manage the engine nodes. By default, the IAM Role of the coordinator node is used
    • Extra Dremio Configuration Properties - Additional configuration options

    Additional

    Additional behaviors to be aware of:

    • When an engine stops, the EC2 nodes are terminated and deleted. As a result, no additional expenses are incurred while an engine is stopped; however, logs on the executor nodes are not retained after the engine is stopped.
    • Auto Stop only stops an engine if there is at least one WLM Queue that connects to the engine and routes queries to the engine. Engines not configured within WLM will not automatically stop.
    • Refreshing the Node Activity page could cause an engine to start.
    • If an engine is in the process of stopping, a new query will not cause it to auto start; instead, the query will be canceled with an error.
    • Stopping the coordinator node does not stop execution engines. Execution engines should be stopped prior to shutting down the coordinator node.