    Getting Started with Dremio Arctic preview

    Welcome to the Getting Started guide for Dremio Arctic! To use Arctic, you need a Dremio Cloud account, also known as an organization. If you don’t already have a Dremio Cloud organization, see Signing up for Dremio Cloud.

    Dremio Cloud sets up a default project, which enables you to use Dremio Sonar to query data. To use Arctic during Preview, you need to create a new project with Arctic enabled. The new project will include a built-in Nessie repository that you can access with Dremio Sonar, Apache Spark, Apache Flink, Apache Hive, and other engines.

    What This Guide Covers

    To start using Arctic, you can use the default Dremio Sonar project that was created when you signed up for Dremio Cloud, use another Sonar project that you may have, or connect from another supported engine, such as Apache Spark, Apache Flink, or Apache Hive.

    Your Dremio Cloud account also provides a sample data lake with a number of sample datasets that you can practice with. To learn more about this sample data lake and how to connect to it, see Add Dremio’s Samples Data Lake to Your Project.

    After completing the setup steps, you are ready to use Arctic. Review the available SQL Commands for Nessie that you can use to create and manage your branches and commits in Arctic.

    Limitations of Dremio Arctic Preview

    • A table containing parallel changes on both source and destination branches cannot be merged. The workaround is to re-branch from the destination branch’s head reference point, reapply your changes to that table, and retry the merge. For more information about working with branches in Nessie, see Branches on the Project Nessie website.
    • Arctic views limitations:
      • Reflections are not supported.
      • The graph dataset component is not supported.
      • Wikis and Tags are not supported.
      • Views are not yet interoperable with other engines.
      • When a view is created, the original version context is not persisted in the view definition (that is, the view definition does not contain a field for the version information). As a result, a view may return different results when it is executed in a context that differs from the one in which it was created.
      • Folders that are created either directly or implicitly as part of a table path cannot be deleted.
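
    The merge limitation above can be pictured with a small Python model. This is only an illustrative sketch, not how Nessie or Arctic implements merging: the function names and the table names are hypothetical, and each branch is reduced to the list of tables changed since the two branches diverged.

```python
def conflicting_tables(source_changes, destination_changes):
    """Return the tables changed on both branches since they diverged."""
    return sorted(set(source_changes) & set(destination_changes))


def can_merge(source_changes, destination_changes):
    """In this model, a merge is allowed only if no table changed on both sides."""
    return not conflicting_tables(source_changes, destination_changes)


# Hypothetical example: both branches changed "sales.orders".
source = ["sales.orders", "sales.customers"]
destination = ["sales.orders"]
print(conflicting_tables(source, destination))
print(can_merge(source, destination))

# Workaround from the text: re-branch from the destination head and reapply
# the changes. The destination then has no changes since the new branch point.
print(can_merge(source, []))
```

    In this model, the workaround succeeds because the new branch starts from the destination branch’s current head, so no table has been changed on both sides.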

    Step 1. Adding an Arctic-enabled Project

    To create a new Arctic-enabled project:

    1. On the side navigation bar, click the Settings (gear) icon > Organization Settings.

    2. From the Organization Settings menu, select Projects.

    3. In the upper-right corner of the Projects page, click Add Project.

    4. In the Add Project dialog box:

      1. Enter a Project Name.

      2. Under Project Services, click Arctic (Preview).

      3. (Optional) Under Cloud, if there are additional Cloud options, select the Cloud.

        For information about Clouds, see Managing Clouds.

      4. Under Storage Settings:

        1. Under Project Store, provide the URL to the Amazon S3 bucket where you want to store the metadata. For example, bucket-name/optional/folder/path. For information about identifying the URL for your S3 bucket, see Methods for accessing a bucket.

        2. Under Project Data Credentials, select either an Access Key or IAM Role.

          • If using an Access Key, enter your AWS Access Key ID and AWS Secret Access Key. For information about these AWS keys, see Managing access keys for IAM users.

          • If using an IAM Role, enter your AWS Cross-Account Role ARN and AWS Instance Profile ARN. For information about the AWS Instance Profile ARN, see Managing Projects.

    5. (Optional) Click Test to verify that the information you entered is valid.

    6. Click Save.
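
    To illustrate the Project Store format used in the Storage Settings step (bucket-name/optional/folder/path), the following Python sketch splits such a value into a bucket name and an optional folder path. The helper name is hypothetical, and accepting a leading s3:// scheme is an assumption made for convenience; the dialog box itself may accept a different form.

```python
def split_project_store(path):
    """Split 'bucket-name/optional/folder/path' into (bucket, folder_path)."""
    cleaned = path.strip()
    # Assumption: tolerate an optional "s3://" scheme in front of the bucket name.
    if cleaned.startswith("s3://"):
        cleaned = cleaned[len("s3://"):]
    cleaned = cleaned.strip("/")
    if not cleaned:
        raise ValueError("Project Store path is empty")
    # Everything before the first "/" is the bucket; the rest is the folder path.
    bucket, _, folder_path = cleaned.partition("/")
    return bucket, folder_path


print(split_project_store("bucket-name/optional/folder/path"))
print(split_project_store("bucket-name"))
```

    The folder path is empty when the value names only a bucket.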


    Step 2. Connecting an Engine to Dremio Arctic

    Currently, Arctic is in Preview. During this time, you can use Arctic with Dremio Sonar, Apache Spark, Apache Flink, and Apache Hive.

    Compatibility

    • Arctic uses Nessie 0.21.2. See Nessie’s Compatibility table for the versions of Spark, Flink, and Hive that are supported.
    • Arctic supports Iceberg v0.13.1 and later.
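
    As a quick way to check an engine’s Iceberg version against the minimum listed above, here is a minimal Python sketch. The function names are hypothetical, and the parser assumes plain dotted numeric versions such as 0.13.1; a suffixed version such as a snapshot build would need extra handling.

```python
def parse_version(text):
    """Turn '0.13.1' or 'v0.13.1' into a tuple of integers such as (0, 13, 1)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


# Minimum Iceberg version stated in the Compatibility list.
MIN_ICEBERG = parse_version("0.13.1")


def iceberg_supported(version):
    """Return True if the version is at or above the stated minimum."""
    return parse_version(version) >= MIN_ICEBERG


print(iceberg_supported("0.13.1"))
print(iceberg_supported("0.12.1"))
```

    Comparing tuples of integers avoids the pitfall of comparing version strings, where "0.9.0" would sort after "0.13.1".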

    Prerequisites

    To connect an engine to Arctic, you need a Dremio Cloud organization. If you don’t already have Dremio Cloud, see Signing up for Dremio Cloud.

    You also need an Arctic-enabled project. If you have not created one yet, see Adding an Arctic Project.

    You also need to generate a personal access token, which provides the authentication needed to connect an engine to Arctic. If you have not created a personal access token before, see Personal Access Tokens for information about how Dremio Cloud uses these tokens and how to generate one.

    Finally, you need the Nessie Endpoint URL of your project, which is required to connect an engine to Arctic. To retrieve the Nessie Endpoint URL:

    1. On the home screen, in the side navigation bar, click the Project Selector icon and select the name of the Arctic project that you want to work with.
    2. In the side navigation bar, click the Settings (gear) icon > Project Settings.
    3. On the General Information tab of the Project Settings page, locate the Nessie Endpoint URL and use the copy button to copy it.
    4. Save the copied URL for later use. You need to provide this URL when you connect an engine to Arctic.
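
    To show where the personal access token and the Nessie Endpoint URL end up, the following Python sketch assembles Spark catalog properties for a Nessie-backed Iceberg catalog. It is a sketch under assumptions, not Dremio’s official procedure: the property names follow the Project Nessie and Apache Iceberg documentation for Spark, while the function name, the catalog name arctic, the environment variable names, and the placeholder values are hypothetical. Follow the engine-specific instructions for the exact settings.

```python
import os


def arctic_spark_conf(endpoint_url, token, warehouse, catalog="arctic", ref="main"):
    """Build Spark properties for a Nessie-backed Iceberg catalog."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # SQL extensions for Iceberg and for Nessie branch and tag commands.
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,"
            "org.projectnessie.spark.extensions.NessieSparkSessionExtensions"
        ),
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        # The Nessie Endpoint URL copied from the Project Settings page.
        f"{prefix}.uri": endpoint_url,
        # Branch to work on (assumed default: main).
        f"{prefix}.ref": ref,
        # The personal access token is sent as a bearer token.
        f"{prefix}.authentication.type": "BEARER",
        f"{prefix}.authentication.token": token,
        f"{prefix}.warehouse": warehouse,
    }


# Hypothetical environment variable names; the fallbacks are placeholders only.
conf = arctic_spark_conf(
    endpoint_url=os.environ.get("ARCTIC_ENDPOINT_URL", "<nessie-endpoint-url>"),
    token=os.environ.get("ARCTIC_TOKEN", "<personal-access-token>"),
    warehouse=os.environ.get("ARCTIC_WAREHOUSE", "<warehouse-location>"),
)
for key, value in conf.items():
    # Avoid printing the token itself.
    shown = "***" if key.endswith(".token") else value
    print(f"{key}={shown}")
```

    Reading the token from the environment rather than hard-coding it keeps the credential out of scripts and version control.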

    Connect an Engine

    Select the engine that you want to connect to Arctic: