Amazon Elasticsearch Service

[info] Note: If you are using standalone Elasticsearch (not on AWS), see Elasticsearch.

Compatibility

Dremio supports the following versions:

  • Amazon Elasticsearch Service (versions 5.x, 6.0, 6.2, and 6.3).

Dremio Configuration

General

Connection

Name -- Description
Host -- AWS Elasticsearch host name.
Port -- Port on which the AWS Elasticsearch service is running (usually 443).

Authentication

  • AWS Access Key method -- Used for key-based authentication.
    • AWS Access Key -- AWS access key
    • AWS Access Secret -- AWS access secret.
  • EC2 Metadata method -- Dremio uses the IAM policy from the EC2 instance.
  • No Authentication -- No credentials required.

Advanced Options

Elasticsearch options

  • Show hidden indices that start with a dot (.).
  • Use Painless scripting with Elasticsearch 5.0+ (checked by default).
  • Show _id columns.
  • Use index/doc fields when pushing down aggregates and filters on analyzed and normalized fields (may produce unexpected results).
  • Use scripts for query pushdown (checked by default).
  • If the number of records returned from Elasticsearch is less than the expected number, warn instead of failing the query.
  • Read timeout (seconds) (default: 60)
  • Scroll timeout (seconds) (default: 300)
  • Scroll size -- This setting must be less than or equal to the Elasticsearch index.max_result_window setting for the index. (default: 4000)
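
The scroll size constraint above can be checked before saving the source. The following is a minimal sketch, not part of Dremio: the function name check_scroll_size and the flat settings dictionary are hypothetical, and it assumes the index uses the Elasticsearch default of 10000 for index.max_result_window when no explicit value is supplied.

```python
# Elasticsearch's default value for index.max_result_window.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def check_scroll_size(scroll_size, index_settings=None):
    """Return True if scroll_size fits within index.max_result_window.

    index_settings is a hypothetical flat dict of index settings,
    for example {"index.max_result_window": "2000"}.
    """
    settings = index_settings or {}
    limit = int(settings.get("index.max_result_window", DEFAULT_MAX_RESULT_WINDOW))
    return 0 < scroll_size <= limit


# Dremio's default scroll size against the Elasticsearch default limit.
print(check_scroll_size(4000))
# The same scroll size against an index whose limit was lowered.
print(check_scroll_size(4000, {"index.max_result_window": "2000"}))
```

If the check fails, either lower the scroll size in Dremio or raise index.max_result_window on the index.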

Encryption

Validation modes include:

  • Validate certificate and hostname (default)
  • Validate certificate only
  • Do not validate certificate or hostname
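
To illustrate what the three validation modes mean at the TLS level, here is a rough sketch using Python's standard ssl module. It is not how Dremio implements the option; the mode strings and the function build_ssl_context are assumptions made for this example only.

```python
import ssl


def build_ssl_context(mode):
    """Build an SSLContext that mirrors one of the three validation modes.

    The mode names are hypothetical labels for this sketch:
    "certificate_and_hostname", "certificate_only", "none".
    """
    context = ssl.create_default_context()
    if mode == "certificate_and_hostname":
        # Defaults already verify the certificate chain and the hostname.
        return context
    if mode == "certificate_only":
        # Keep chain verification, skip the hostname match.
        context.check_hostname = False
        return context
    if mode == "none":
        # Hostname checking must be disabled before relaxing verify_mode.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    raise ValueError(f"unknown validation mode: {mode}")


strict = build_ssl_context("certificate_and_hostname")
print(strict.check_hostname, strict.verify_mode == ssl.CERT_REQUIRED)
```

Disabling validation removes protection against impersonated endpoints, so the last mode is generally suited only to testing.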

AWS

  • Overwrite region -- If the box is checked, provide the region.
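
Overriding the region is mainly relevant when the region cannot be read from the host name. As a sketch of that idea, the helper below extracts the region from a host that follows the common AWS endpoint pattern (for example search-mydomain-abc123.us-east-1.es.amazonaws.com). The function name and the example hosts are hypothetical, and whether Dremio derives the region this way is an assumption, not something this page states.

```python
import re

# Matches ".<region>.es.amazonaws.com" at the end of a host name,
# where <region> looks like "us-east-1" or "us-gov-west-1".
_REGION_PATTERN = re.compile(r"\.([a-z]{2}(?:-[a-z]+)+-\d+)\.es\.amazonaws\.com$")


def region_from_host(host):
    """Return the AWS region embedded in the host name, or None."""
    match = _REGION_PATTERN.search(host.lower())
    return match.group(1) if match else None


print(region_from_host("search-mydomain-abc123.us-east-1.es.amazonaws.com"))
# A custom or proxied endpoint carries no region, so it must be provided.
print(region_from_host("es.example.internal"))
```

When the helper returns None, as for a custom or proxied endpoint, the region would have to be supplied explicitly.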

Reflection Refresh

  • Never refresh -- Select this option to never refresh; otherwise, specify how often to refresh in hours, days, or weeks.
  • Never expire -- Select this option to never expire; otherwise, specify how often to expire in hours, days, or weeks.

Metadata

Dataset Handling

  • Remove dataset definitions if underlying data is unavailable (Default).
    If this box is not checked and the underlying files under a folder are removed or the folder/source is not accessible, Dremio does not remove the dataset definitions. This option is useful in cases when files are temporarily deleted and put back in place with new sets of files.

Metadata Refresh

  • Dataset Discovery -- Refresh interval for top-level source object names such as names of DBs and tables.
    • Fetch every -- Specify fetch time based on minutes, hours, days, or weeks. Default: 1 hour
  • Dataset Details -- The metadata that Dremio needs for query planning such as information needed for fields, types, shards, statistics, and locality.
    • Fetch mode -- Specify either Only Queried Datasets, All Datasets, or As Needed. Default: Only Queried Datasets
      • Only Queried Datasets -- Dremio updates details for previously queried objects in a source.
        This mode increases query performance because less work is needed at query time for these datasets.
      • All Datasets -- Dremio updates details for all datasets in a source. This mode increases query performance because less work is needed at query time.
      • As Needed -- Dremio updates details for a dataset at query time. This mode minimizes metadata queries on a source when it is not in use, but might lead to longer planning times.
    • Fetch every -- Specify fetch time based on minutes, hours, days, or weeks. Default: 1 hour
    • Expire after -- Specify expiration time based on minutes, hours, days, or weeks. Default: 3 hours
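
The interplay between "Fetch every" and "Expire after" can be pictured with a small sketch. It only models the two intervals using the defaults listed above (1 hour and 3 hours); the function metadata_state and its return labels are hypothetical and do not reflect Dremio's internal logic.

```python
from datetime import datetime, timedelta

FETCH_EVERY = timedelta(hours=1)   # default "Fetch every" for dataset details
EXPIRE_AFTER = timedelta(hours=3)  # default "Expire after"


def metadata_state(last_fetched, now, fetch_every=FETCH_EVERY, expire_after=EXPIRE_AFTER):
    """Classify cached metadata as "fresh", "refresh_due", or "expired"."""
    age = now - last_fetched
    if age >= expire_after:
        return "expired"      # too old to use for planning
    if age >= fetch_every:
        return "refresh_due"  # still usable, but a refresh should run
    return "fresh"


fetched = datetime(2024, 1, 1, 12, 0)
print(metadata_state(fetched, datetime(2024, 1, 1, 12, 30)))
print(metadata_state(fetched, datetime(2024, 1, 1, 14, 0)))
print(metadata_state(fetched, datetime(2024, 1, 1, 15, 0)))
```

Keeping the expiration interval longer than the fetch interval leaves a window in which cached metadata stays usable while a refresh is pending.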

Sharing

You can specify which users can edit. Options include:

  • All users can edit.
  • Specific users can edit.
