    System Requirements

    This topic covers system requirements for standalone clusters, including general requirements for Hadoop on YARN and MapR on YARN deployments.

    Hadoop Distributions

    Dremio supports the following Hadoop distributions:

    • Apache Hadoop 2.7.2+ and 3+
    • Hortonworks HDP 2.6.5 to HDP 3.0.x
    • Cloudera CDH 5 and 6
    • Cloudera Data Platform 7.1+
    • MapR 5.2.x and 6.1.x

    Operating Systems

    Dremio supports the following distributions and versions of Linux:

    • RHEL and CentOS 6.7+, 7.3+, and 8.3 (RPM and tarball)
    • SLES 12 SP2+ (tarball)
    • Ubuntu 14.04+ (tarball)
    • Debian 7+ (tarball)

    Java Development Kit

    Dremio requires Java SE 8 (JDK 1.8) or Java SE 11 (JDK 11). Supported distributions are OpenJDK and Oracle JDK. Other versions are not currently supported.

    note:

    Java SE 11 is supported only with Dremio 20.0 and higher and only with certain deployment models. See the Dremio 20.0 Release Notes for more information.

    • OpenJDK downloads are available from the OpenJDK website.

    • Oracle JDK downloads are available from Oracle for JDK 8 and JDK 11.

    note:

    OpenJDK requires the glibc (GNU C Library) implementation of libc. Alpine Linux may use musl instead, in which case glibc must be installed separately.

    Dremio uses the Java compiler (javac) for runtime code generation. To check whether Java is installed on your operating system, and which version, run this command:

    $ java -version
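
Because runtime code generation depends on javac, it can help to confirm that both java and javac are on the PATH and that the major version is one this page lists as supported. The following is a minimal sketch, not a Dremio tool: the function names java_major_version and is_supported_java are illustrative, and the parsing assumes the usual quoted version string on the first line of the `java -version` output.

```shell
#!/usr/bin/env bash
# Extract the major Java version from a version string such as
# "1.8.0_292" (Java 8) or "11.0.11" (Java 11).
java_major_version() {
  local version="$1"
  local major="${version%%.*}"
  if [ "$major" = "1" ]; then
    # Legacy numbering scheme: 1.8.0_292 means Java 8.
    local rest="${version#1.}"
    major="${rest%%.*}"
  fi
  printf '%s\n' "$major"
}

# Succeed only for the major versions this page lists as supported (8 and 11).
is_supported_java() {
  case "$(java_major_version "$1")" in
    8|11) return 0 ;;
    *)    return 1 ;;
  esac
}

# Check the local installation, if there is one.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the quoted token on line 1.
  installed="$(java -version 2>&1 | awk -F'"' 'NR==1 {print $2}')"
  if is_supported_java "$installed"; then
    echo "Java $installed: supported major version"
  else
    echo "Java $installed: not a supported major version"
  fi
  command -v javac >/dev/null 2>&1 || echo "javac not found on PATH"
else
  echo "java not found on PATH"
fi
```

The check only looks at the major version. The Java SE 11 restrictions described in the note above (Dremio 20.0 and higher, certain deployment models) still apply.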
    

    Browsers

    The Dremio UI works best with the following browsers:

    • Google Chrome 54+
    • Apple Safari 11+
    • Mozilla Firefox 50+
    • Microsoft Edge 14+

    Server or Instance Hardware

    The following hardware specifications are minimum recommendations, based on the Dremio service enabled on the node.

    Dremio Node Role   | Hardware Required
    -------------------|----------------------------------------------
    Master-Coordinator | 8 CPU cores recommended
                       | 16GB RAM recommended
    Executor           | 4 CPU cores minimum (16 cores recommended)
                       | 16GB RAM minimum (128GB recommended)

    Note: Even if you have a machine with 64GB of RAM, only 16GB is used by default. To change this setting, modify the DREMIO_MAX_DIRECT_MEMORY_SIZE_MB property in the dremio-env file and restart the executor node(s).
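
As a rough illustration of choosing a value for DREMIO_MAX_DIRECT_MEMORY_SIZE_MB, the sketch below subtracts the heap and an OS reserve from the total RAM. This sizing rule is an assumption made for the example, not an official formula. The helper name direct_memory_mb is hypothetical, and the 8 GB heap and 2 GB OS reserve are taken from the recommendations elsewhere on this page.

```shell
# Candidate direct-memory size in MB: total RAM minus heap minus an OS reserve.
# All three arguments are whole gigabytes.
direct_memory_mb() {
  total_gb="$1"; heap_gb="$2"; os_reserve_gb="$3"
  direct_gb=$(( total_gb - heap_gb - os_reserve_gb ))
  if [ "$direct_gb" -le 0 ]; then
    echo "not enough RAM for the requested heap and OS reserve" >&2
    return 1
  fi
  echo $(( direct_gb * 1024 ))
}

# Example: 64 GB node, 8 GB heap, 2 GB left for the OS.
# The two printed lines use the property names from the dremio-env file.
echo "DREMIO_MAX_HEAP_MEMORY_SIZE_MB=$(( 8 * 1024 ))"
echo "DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=$(direct_memory_mb 64 8 2)"
```

For these inputs the script prints a heap of 8192 MB and a direct memory size of 55296 MB. Treat the result as a starting point and restart the executor node(s) after changing dremio-env.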

    Network

    There should be a low-latency, high-bandwidth network connection between Dremio and the data sources.

    The following ports must be open:

    Purpose                                     | Port      | From                                            | To
    --------------------------------------------|-----------|-------------------------------------------------|---------------------------
    UI (HTTPS)                                  | 9047      | Corporate network (end users)                   | Coordinators
    Arrow Flight                                | 32010     | Corporate network (end users)                   | Coordinators
    ODBC/JDBC clients (e.g., Tableau, Power BI) | 31010     | Corporate network (end users)                   | Coordinators
    ZooKeeper (internal)                        | 2181      | Other Dremio nodes (coordinators and executors) | Coordinators
    Inter-node communication                    | 45678     | Other Dremio nodes                              | All Dremio nodes
    Conduit                                     | ephemeral | Coordinators and Executors                      | Coordinators and Executors
    Data source reads                           | Varies    | All Dremio nodes                                | Data source nodes
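
One way to confirm that the client-facing ports in the table above are reachable from an end-user machine is a short connectivity probe. The sketch below is illustrative only. It assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available. The function names and the hostname coordinator.example.com are hypothetical.

```shell
#!/usr/bin/env bash
# Fixed ports from the table above. Conduit is omitted because it is
# ephemeral unless a static port is configured.
dremio_port() {
  case "$1" in
    ui)        echo 9047 ;;
    flight)    echo 32010 ;;
    odbc-jdbc) echo 31010 ;;
    zookeeper) echo 2181 ;;
    internode) echo 45678 ;;
    *)         return 1 ;;
  esac
}

# Succeed if a TCP connection to host:port opens within 3 seconds.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Probe the ports that end users need on a coordinator.
check_coordinator() {
  local host="$1" name port
  for name in ui flight odbc-jdbc; do
    port="$(dremio_port "$name")"
    if port_open "$host" "$port"; then
      echo "$name ($port): reachable"
    else
      echo "$name ($port): NOT reachable"
    fi
  done
}

# Usage (hypothetical hostname):
#   check_coordinator coordinator.example.com
```

A successful TCP connection only shows that the port is open along the network path. It does not verify that the Dremio service behind it is healthy.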

    Configuring the Conduit Port

    Dremio uses an ephemeral network port allocated by the operating system for inter-node communication between coordinators and executors. To assign a static port number to the conduit port, configure services.conduit.port in the Dremio configuration file. If TLS is enabled on your deployment, Dremio applies the same configuration to communications using the conduit port. To use a different configuration or to enable TLS for only the new conduit port, specify all values for services.conduit.ssl in dremio.conf.
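
After assigning a static conduit port, you may want to confirm which value a configuration file actually contains, for example before opening the port in a firewall. The sketch below is a simplified check under stated assumptions: it only recognizes the single-line dotted form `services.conduit.port: <number>` (or with `=`), not nested blocks. The helper name conduit_port and the port number 45679 are hypothetical.

```shell
# Print the static conduit port found in a dremio.conf-style file.
# Fails (non-zero exit) if no single-line services.conduit.port entry exists.
conduit_port() {
  sed -n 's/^[[:space:]]*services\.conduit\.port[[:space:]]*[:=][[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" \
    | head -n 1 | grep .
}

# Demonstration against a temporary file with a hypothetical port number.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
services.conduit.port: 45679
EOF

if port="$(conduit_port "$conf")"; then
  echo "static conduit port: $port"
else
  echo "no static conduit port set; the OS will allocate an ephemeral port"
fi
rm -f "$conf"
```

Whatever port you choose must also be open between coordinators and executors, as listed for Conduit in the port table.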

    Web Sockets

    Dremio uses WebSockets. If you encounter the error message "Your Internet connection may be offline, or WebSockets to Dremio are being blocked.", ensure that your environment allows WebSocket communication.

    Performance

    A 10 GbE network is recommended when connecting to large data sources that hold terabytes or petabytes of data.

    In particular, for maximum performance, use a 10 GbE network between coordinators and executors, among executors, and between executors and data sources.

    Privileges

    To install Dremio, the following access privileges are required:

    • ssh and scp access
    • root or sudo privileges

    Best Practices

    • For Unix/Linux operating systems, increase the open file limit for users to 65536. This limit applies to the Dremio processes.

    • If you have a machine with a large amount of RAM (for example, 64GB), increase Dremio’s default direct memory setting or heap setting. A recommended heap value is 8GB.

      To increase Dremio’s RAM setting:

      1. In the dremio-env file, modify either the DREMIO_MAX_DIRECT_MEMORY_SIZE_MB or DREMIO_MAX_HEAP_MEMORY_SIZE_MB property. See Configuring Memory for more information.
      2. Restart the executor node(s).

    warning:

    For the DREMIO_MAX_DIRECT_MEMORY_SIZE_MB allocation, be sure to leave at least 1-2 GB of memory for the OS.

    • To verify the memory assigned to the nodes (heap and direct), run the following query: SELECT * FROM sys.memory
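
The open file limit recommendation in the first bullet can be checked from a shell session started as the user that runs Dremio. The following is a minimal sketch: the function name nofile_ok is illustrative, and how the limit is raised persistently (for example through /etc/security/limits.conf or a service manager setting) depends on your distribution and is not covered here.

```shell
# Recommended open file limit from this page.
REQUIRED_NOFILE=65536

# Succeed if the given limit value meets the recommendation.
nofile_ok() {
  [ "$1" = "unlimited" ] && return 0
  [ "$1" -ge "$REQUIRED_NOFILE" ]
}

# ulimit -n reports the soft limit for the current shell and its children.
current="$(ulimit -n)"
if nofile_ok "$current"; then
  echo "open file limit $current: OK"
else
  echo "open file limit $current: below recommended $REQUIRED_NOFILE"
fi
```

Run the check in the same environment that launches Dremio, since limits can differ between an interactive login and a service started at boot.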