
Configuring Metadata Storage

This topic describes how to configure a custom Dremio metadata storage location.

Metadata storage is configured with the paths.local property in the dremio.conf file. This property specifies the directory where Dremio holds metadata about users, spaces and datasets.

Default Location

The default location is the ${DREMIO_HOME}/data directory.

  • For an RPM installation, the default metadata storage location is created for you at /var/lib/dremio. However, you can change this by configuring a custom location.
  • For a Tarball installation, the default is the /data sub-directory under the directory where you extracted Dremio.
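
If you are not sure which default applies to your deployment, you can inspect the directories directly. This is only a quick check; the paths below are the defaults described above, and DREMIO_HOME is assumed to point at your Tarball install location:

    # RPM install: metadata is kept under /var/lib/dremio by default
    ls -ld /var/lib/dremio

    # Tarball install: metadata is kept in the data sub-directory of the install
    ls -ld ${DREMIO_HOME}/data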

Prerequisites

If you set up a shared network drive:

  • Provide a network drive (NFS) with locking support.
  • Ensure that the store is high-speed and low-latency (for spilling operations).
  • Ensure that all Dremio coordinator nodes have read/write access to the shared network drive.
  • Ensure that you follow the shared network drive's guidelines for consistent synchronous writes.
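
As a quick check of the read/write prerequisite above, you can confirm that the dremio user on each coordinator node can write to the shared mount. The mount point /mnt/dremio-metadata below is only an example; substitute your own path:

    # Run on every coordinator node; /mnt/dremio-metadata is an example mount point
    sudo -u dremio touch /mnt/dremio-metadata/.rw_test && \
      sudo -u dremio rm /mnt/dremio-metadata/.rw_test && \
      echo "read/write access OK"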
note

High Availability: For HA, Dremio's metadata storage must be an external store. See Distributed File System (NAS) Requirements and Recommendations for information on configuring HA for Dremio metadata storage.

Disk Space Recommendations

Dremio requires a minimum volume size of 512 GB for the KV store. The administrator should monitor the volume for available space and usage. The KV store is cleaned with the dremio-admin clean command.
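
For example, a simple way to monitor available space is to check the volume that holds the metadata directory. The path below assumes the RPM default location, and dremio-admin clean is generally run from DREMIO_HOME while the coordinator is stopped; check the dremio-admin documentation for your version:

    # Check free space on the volume that holds the KV store (RPM default path shown)
    df -h /var/lib/dremio

    # Reclaim space in the KV store; run with the coordinator stopped
    ./bin/dremio-admin clean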

Setting Up Metadata Storage

To set up a custom metadata storage location:

  1. Create your custom directory if it doesn't exist, for example: /data/customDremio

    sudo mkdir /data/customDremio && sudo chown dremio:dremio /data/customDremio
  2. Add the new location to the local field under paths in the dremio.conf file. Do this on every Dremio coordinator node.

    paths: {
      local: "/data/customDremio"
    }
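
After updating dremio.conf, restart the coordinator(s) so the new location takes effect, and confirm that Dremio is writing metadata there. The systemctl service name below applies to RPM installs, and the db sub-directory is where Dremio typically keeps the KV store; adapt both to your environment:

    # RPM install: restart the Dremio service so the new paths.local is picked up
    sudo systemctl restart dremio

    # The KV store should now appear under the custom location (typically in a db sub-directory)
    ls /data/customDremio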

Troubleshooting

  • If HA fails when the network is brought down on the running master coordinator node, there may be an issue with the mount.
    For data consistency, your NFS should be mounted as a hard mount. For example:
    mount -t nfs -o rw,hard,sync,nointr,vers=4,proto=tcp <server>:<share> <mount path>
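
To confirm that an existing NFS mount actually uses the hard option, inspect the active mount options. The mount path below is an example; substitute your own:

    # Show the active mount options and confirm that "hard" is present
    findmnt -o TARGET,FSTYPE,OPTIONS /mnt/dremio-metadata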