Version: 24.3.x

Getting Started with YARN

This quickstart describes how to install Dremio in a YARN deployment using an RPM or Tarball.

tip

Review the System Requirements before you get started.

Install the Coordinator Node

You can install the coordinator using an RPM or Tarball.

Installing via RPM

  1. Download the RPM from Dremio Downloads.

  2. Open a terminal and go to the directory where you saved the RPM.

  3. Install using yum.

sudo yum install dremio-community-LATEST.noarch.rpm

  4. Start Dremio.

sudo service dremio start

note

If you run Dremio on a single machine, such as a laptop, whose network or IP address may change, you can optionally add the following line to dremio.conf under /etc/dremio/ so that Dremio keeps working: registration.publish-host: "localhost"
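
The edit described in this note can be scripted. The following is a minimal sketch, assuming the setting can simply be appended as its own line at the end of dremio.conf. The function name set_publish_host is hypothetical and not part of Dremio. The function does nothing if a registration.publish-host line is already present, so it is safe to run more than once.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Dremio): append the publish-host
# setting to a dremio.conf file unless the key is already present.
set_publish_host() {
  local conf_file="$1"
  local setting='registration.publish-host: "localhost"'

  if [ ! -f "$conf_file" ]; then
    echo "config file not found: $conf_file" >&2
    return 1
  fi

  # -q: quiet, -F: match a fixed string. Skip the append if the key exists.
  if grep -qF 'registration.publish-host' "$conf_file"; then
    return 0
  fi

  # If the file does not end with a newline, add one first so the new
  # setting starts on its own line.
  if [ -n "$(tail -c 1 "$conf_file")" ]; then
    printf '\n' >> "$conf_file"
  fi

  printf '%s\n' "$setting" >> "$conf_file"
}

# Example, assuming the RPM layout from the note above and a shell with
# write access to the file:
#   set_publish_host /etc/dremio/dremio.conf
```

Checking for the key rather than the full line means an existing value other than "localhost" is left untouched, which seems the safer default for a sketch like this.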

Installing via Tarball

  1. Download the Tarball from Dremio Downloads.

  2. Open a terminal and go to the directory where you saved the Tarball.

  3. Extract the Tarball.

tar -xf dremio-community-LATEST.tar.gz

  4. Navigate to the bin folder.

cd dremio-community-<version>/bin

  5. Start Dremio.

./dremio start
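
With either install method, the web console may take a little while to respond after Dremio starts. The following is a minimal sketch that polls the console address used in the next section (http://localhost:9047) until it answers. It assumes curl is installed and that the console returns a success status for a plain HTTP GET. The function name wait_for_console and its default values are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, maximum attempts, seconds to sleep between attempts.
wait_for_console() {
  local url="${1:-http://localhost:9047}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1

  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP error statuses as failure,
    # -o /dev/null: discard the body, --max-time: cap each request.
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "console reachable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done

  echo "console not reachable at $url" >&2
  return 1
}

# Example: wait up to about a minute for the local console.
#   wait_for_console http://localhost:9047 30 2
```

The function returns a non-zero status on timeout, so it can gate later steps in a setup script.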

Set up YARN

  1. Open the Dremio console at http://localhost:9047.

  2. Click Settings in the left sidebar, click Engines, and then click Add Engine.

  3. Configure the available options as follows:

    • Hadoop Engine: Select your preferred engine.

    • This is a secure engine: This option can be left as is.

    • Engine Name: Enter a name for the engine.

    • Resource Manager: Provide the Resource Manager address of the Hadoop environment.

    • NameNode: Provide the NameNode address of the Hadoop environment.

    • Spill Directories: Provide a location where Dremio can spill data from operators such as aggregations and sorts.

    • Queue: Specify the YARN queue where Dremio containers will be created.

    • Workers: The number of Dremio executors (containers) to be created on the Hadoop data nodes.

    • Cores per Worker: Number of CPUs per Dremio executor (container).

    • Memory per Worker: Amount of memory per Dremio executor (container).

    • Additional Properties: Specify any additional custom JVM arguments to be passed.

    • Cloud Cache: Location to write the Dremio Cloud Cache file.

  4. Click Add.

  5. Add sources to your deployment. You are then ready to run queries.
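
The worker count, cores per worker, and memory per worker together determine how much the engine requests from the chosen YARN queue. As a rough sketch, the totals are the per-worker values multiplied by the number of workers. This ignores any per-container overhead that YARN or Dremio may add. The function name engine_totals, the sample numbers, and the use of GB as the memory unit are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: total cores and memory requested across all executors.
# Arguments: workers, cores per worker, memory per worker (assumed to be GB).
engine_totals() {
  local workers="$1"
  local cores="$2"
  local mem_gb="$3"

  echo "total_cores=$((workers * cores)) total_memory_gb=$((workers * mem_gb))"
}

# Example: 4 workers with 8 cores and 64 GB each.
#   engine_totals 4 8 64
#   prints: total_cores=32 total_memory_gb=256
```

Comparing these totals with the capacity of the target queue before clicking Add can help avoid an engine whose containers never get scheduled.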