Version: 24.3.x

System Requirements

This topic covers system requirements for standalone clusters including general requirements for Hadoop on YARN and MapR on YARN deployments.

Hadoop Distributions

Dremio supports the following Hadoop distributions:

  • Apache Hadoop 2.7.2+ and 3+
  • Hortonworks HDP 2.6.5 to HDP 3.0.x
  • Cloudera CDH 5 and 6
  • Cloudera Data Platform 7.1+
  • MapR 6.2.0
note

Dremio 23.0.0+ supports only MapR 6.2.0. If you are running MapR 5.2.x or 6.1.x, you must upgrade to MapR 6.2.0 before upgrading to Dremio 23. Dremio releases up to and including 22.x support only MapR 5.2.x and 6.1.x; they do not support MapR 6.2.0.

JDK: MapR 6.2.0 supports only JDK 11.
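
The compatibility rule in this note can be written as a small lookup, which may help when scripting upgrade checks. This is only a sketch of the rule as stated here: the function names `supported_mapr_versions` and `mapr_is_supported` are hypothetical, and the code does not query a real cluster.

```python
def supported_mapr_versions(dremio_major):
    """Return the MapR releases this note lists for a Dremio major version."""
    if dremio_major >= 23:
        return ["6.2.0"]
    # Releases up to and including 22.x
    return ["5.2.x", "6.1.x"]


def mapr_is_supported(mapr_version, dremio_major):
    """Check a concrete MapR version such as "6.1.0" against the note's rule."""
    if mapr_version == "6.2.0":
        release = "6.2.0"
    else:
        # Collapse "5.2.2" to the release line "5.2.x" used in the note.
        major, minor = mapr_version.split(".")[:2]
        release = f"{major}.{minor}.x"
    return release in supported_mapr_versions(dremio_major)
```

For example, `mapr_is_supported("6.1.0", 23)` evaluates to `False`, which corresponds to the case where MapR must be upgraded to 6.2.0 before moving to Dremio 23.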

Operating Systems

Dremio supports the following distributions and versions of Linux:

  • RHEL 6.7+, 7.3+, and 8.3+ (RPM and tarball)
  • CentOS 6.7+ and 7.3+ (RPM and tarball)
  • SLES 12 SP2+ (tarball)
  • Ubuntu 14.04+ (tarball)
  • Debian 7+ (tarball)

Java Development Kit

For versions 20.x through 24.x, Dremio requires Java SE 8 or 11 (JDK 8 or 11). Dremio supports all major OpenJDK distributions, including those from Adoptium (Eclipse Temurin), Fedora, Red Hat, and Oracle. Non-OpenJDK Java distributions are not supported.

note

OpenJDK requires the glibc (GNU C Library) implementation of libc. Alpine Linux may use musl instead, in which case glibc must be installed separately.

Dremio utilizes the Java compiler (javac) for runtime code generation. You can check to see if your operating system has Java installed (and which version) with this command:

Check Java version
java -version
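
If you want to automate this check across many nodes, the output of the command can be parsed in a script. The following is a minimal sketch, not an official Dremio tool: the function names are hypothetical, and the parsing assumes the usual OpenJDK banner format (for example `openjdk version "11.0.19"` or `java version "1.8.0_372"`).

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_372").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if Java is missing."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # `java -version` normally writes its banner to stderr, so read both streams.
    return parse_java_major(result.stderr + result.stdout)


def java_version_accepted(major):
    """True for the versions this page lists for Dremio 20.x through 24.x."""
    return major in (8, 11)
```

Calling `java_version_accepted(installed_java_major())` on a node gives a simple pass/fail result. Note that this only confirms the Java runtime version; it does not verify that `javac` is present.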

Browsers

The Dremio UI works best with the following browsers:

  • Google Chrome 54+
  • Apple Safari 11+
  • Mozilla Firefox 50+
  • Microsoft Edge 14+

Server or Instance Hardware

Dremio typically requires a minimum of 16 CPU cores and 128 GB RAM per node.
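
A node can be compared against this minimum with a short script. The sketch below makes two assumptions that are not stated on this page: it reads "128 GB" as decimal gigabytes (the more lenient interpretation), and it counts logical CPUs as reported by the operating system, which may differ from physical cores. The function names are hypothetical.

```python
import os

MIN_CPU_CORES = 16
# Assumption: "128 GB" is read as decimal gigabytes (10**9 bytes each).
MIN_RAM_BYTES = 128 * 10**9


def meets_minimum(cpu_cores, ram_bytes):
    """True when both the core count and the RAM size reach the stated minimum."""
    return cpu_cores >= MIN_CPU_CORES and ram_bytes >= MIN_RAM_BYTES


def detect_resources():
    """Return (logical CPU count, physical RAM in bytes) on a Linux node."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, ram_bytes
```

On a Linux node, `meets_minimum(*detect_resources())` reports whether the machine reaches the typical minimum. Actual sizing still depends on your workloads.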

When you are onboarding, we will assist you in determining the number of nodes, as well as the number of coordinators, engines, and executors to place on those nodes.

After that initial setup, you must periodically ensure that the resources for your coordinators, engines, and executors continue to be appropriate for your query workloads. For best practices to help you do this, see Pillar 2 - Performance Efficiency of Dremio's Well-Architected Framework.

Network

There should be a low-latency, high-bandwidth network connection between Dremio and the data sources.

The following ports must be open:

| Purpose | Port | From | To |
| --- | --- | --- | --- |
| UI (HTTPS) | 9047 | Corporate network (end users) | Coordinators |
| Arrow Flight | 32010 | Corporate network (end users) | Coordinators |
| ODBC/JDBC clients (e.g., Tableau, Power BI) | 31010 | Corporate network (end users) | Coordinators |
| ZooKeeper (internal) | 2181 | Other Dremio nodes (coordinators and executors) | Coordinators |
| Inter-node communication | 45678 | Other Dremio nodes | All Dremio nodes |
| Conduit | ephemeral | Coordinators and executors | Coordinators and executors |
| Data source reads | Varies | All Dremio nodes | Data source nodes |
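
To confirm that the fixed ports in the table above are reachable from a given machine, a plain TCP connection test is usually enough. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the conduit and data source ports are left out because they are not fixed.

```python
import socket

# Fixed ports from the table above, keyed by purpose.
DREMIO_PORTS = {
    "UI (HTTPS)": 9047,
    "Arrow Flight": 32010,
    "ODBC/JDBC clients": 31010,
    "ZooKeeper (internal)": 2181,
    "Inter-node communication": 45678,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=DREMIO_PORTS):
    """Map each purpose to True/False depending on whether its port is reachable."""
    return {purpose: is_port_open(host, port) for purpose, port in ports.items()}
```

For example, `check_host("coordinator.example.com")` (a placeholder hostname) returns a dictionary of results per purpose. Run the check from the network segment listed in the From column, since a port can be open from one segment and blocked from another.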

Configuring the Conduit Port

Dremio uses an ephemeral network port, allocated by the operating system, for conduit communication between coordinators and executors. To assign a static port number to the conduit port, set services.conduit.port in the Dremio configuration file (dremio.conf). If TLS is enabled on your deployment, Dremio applies the same TLS configuration to communications using the conduit port. To use a different configuration, or to enable TLS for the conduit port only, specify all values for services.conduit.ssl in dremio.conf.

WebSockets

Dremio uses WebSockets. If you encounter the error message "Your Internet connection may be offline, or WebSockets to Dremio are being blocked.", ensure that your environment (including any proxies and firewalls between the browser and Dremio) allows WebSocket communication.

Performance

A 10 GbE network is recommended when connecting to large data sources that hold terabytes or petabytes of data.

In particular, for maximum performance, use a 10 GbE network between coordinators and executors, between executors, and between executors and data sources.

Privileges

To install Dremio, the following access privileges are required:

  • ssh and scp access
  • root or sudo privileges

Additional Configuration

  • For Unix/Linux operating systems, increase the open file limit for users to 65536. This limit applies to the Dremio processes.

  • Dremio software automatically determines the memory available on the system and allocates it between heap and direct memory based on the Dremio node type. If you believe that you need to adjust these levels, consult with Dremio. Afterward, you can implement recommendations by following the steps in Configuring Memory.

  • To verify the memory assigned to the nodes (heap and direct), run the following query: SELECT * FROM sys.memory
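
The open file limit from the first item above can be checked from a script on Linux. This is a minimal sketch using Python's standard `resource` module. The function names are hypothetical, and the result reflects the limit of the process that runs the script, so it should be run as the same user that runs Dremio.

```python
import resource

REQUIRED_NOFILE = 65536


def nofile_ok(soft_limit, required=REQUIRED_NOFILE):
    """True when the soft open-file limit is unlimited or at least `required`."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


def current_nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling `nofile_ok(current_nofile_limits()[0])` reports whether the soft limit already meets the recommended value. The soft limit is the one enforced on the process; the hard limit is the ceiling to which it can be raised.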