Version: 24.3.x

Implementing Garbage-First (G1) Garbage Collection

This topic describes how to implement Garbage-First (G1) garbage collection. G1 is a server-style garbage collector for multi-processor machines with large memories.

Scenario: Dremio executors are not deployed on YARN

To implement G1 garbage collection and heap dump flags when Dremio executors are not deployed on YARN,
add the following property to the dremio-env file on all Dremio coordinator and executor nodes.

Property to add to dremio-env file
DREMIO_JAVA_SERVER_EXTRA_OPTS="-XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=25"
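For readability, the same option string can be assembled one flag at a time, with a comment on what each standard HotSpot flag controls. This is only a sketch: G1_OPTS is a hypothetical helper variable, not something Dremio defines, and the resulting value is identical to the single-line property above.

```shell
# Sketch: build the option string flag by flag (G1_OPTS is a hypothetical helper name).
G1_OPTS="-XX:+UseG1GC"                                    # select the G1 collector
G1_OPTS="$G1_OPTS -XX:G1HeapRegionSize=32M"               # size of each G1 heap region
G1_OPTS="$G1_OPTS -XX:MaxGCPauseMillis=500"               # pause-time goal in milliseconds (a target, not a guarantee)
G1_OPTS="$G1_OPTS -XX:InitiatingHeapOccupancyPercent=25"  # start a concurrent marking cycle at 25% heap occupancy

# Same value as the single-line property shown above.
DREMIO_JAVA_SERVER_EXTRA_OPTS="$G1_OPTS"
```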

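By default, GC logs and heap dumps are written to the Dremio log directory. The following example process listing shows the resulting JVM options, including the GC log and heap dump paths (-Xloggc and -XX:HeapDumpPath):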

501  3479     1   0 Mon01PM ttys001   36:38.94 /usr/bin/java 
-Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler
-Djava.library.path=/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/lib
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps -Xloggc:/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/log/server.gc
-Ddremio.log.path=/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/log
-Xmx4096m
-XX:MaxDirectMemorySize=8192m
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/log
-Dio.netty.maxDirectMemory=0
-DMAPR_IMPALA_RA_THROTTLE
-DMAPR_MAX_RA_STREAMS=400
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:MaxGCPauseMillis=500
-XX:InitiatingHeapOccupancyPercent=25
-cp /Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/conf:/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/jars/*:/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/jars/ext/*:/Users/dremio/Downloads/dremio-enterprise-3.2.0-201905102005330382-0598733/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon
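A listing like the one above can be checked mechanically. The following is a minimal sketch, assuming the JVM command line is supplied on standard input; check_g1_flags is a hypothetical helper, not part of Dremio, and the flag list simply mirrors the property shown earlier.

```shell
# check_g1_flags (hypothetical helper): read a JVM command line on stdin and
# report any of the expected G1 flags that are missing.
# Returns 0 when all flags are present, 1 otherwise.
check_g1_flags() {
  # Collapse newlines and tabs to spaces so wrapped listings still match.
  cmdline=$(tr '\n\t' '  ')
  missing=0
  for flag in -XX:+UseG1GC -XX:G1HeapRegionSize=32M \
              -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=25; do
    case " $cmdline " in
      *" $flag "*) ;;                          # flag found as a whole word
      *) echo "missing: $flag"; missing=1 ;;
    esac
  done
  return $missing
}

# Possible usage on a node (assumes ps prints the full argument list):
#   ps -ef | grep '[D]remioDaemon' | check_g1_flags && echo "G1 flags present"
```

The check matches whole, space-delimited flags, so a flag with a different value (for example, -XX:MaxGCPauseMillis=200) is reported as missing rather than accepted.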

Scenario: Dremio executors are deployed on YARN

To implement G1 garbage collection and heap dump flags when Dremio executors are deployed on YARN,
add the following properties to the data source via the Dremio UI > Advanced Options > Connection Properties.

Property Type | Name | Value
JAVA | provisioning.yarn.heapsize | 8192
SYSTEM | -Xloggc:/<path-to-gc-logs>/gc.log-date +'%Y%m%d%H%M' |
SYSTEM | -XX:+UseG1GC |
SYSTEM | -XX:+HeapDumpOnOutOfMemoryError |
SYSTEM | -XX:HeapDumpPath | <path>
SYSTEM | -XX:+PrintGCDetails |
SYSTEM | -XX:+PrintGCTimeStamps |
SYSTEM | -XX:+PrintGCDateStamps |
SYSTEM | -XX:ErrorFile | <path>/hs_err_pid%p.log
SYSTEM | -XX:G1HeapRegionSize | 32M
SYSTEM | -XX:MaxGCPauseMillis | 500
SYSTEM | -XX:InitiatingHeapOccupancyPercent | 25
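The date +'%Y%m%d%H%M' fragment in the -Xloggc entry looks like a shell date format string. As a sketch of what that format produces when run in a shell (whether and where Dremio expands it is not stated here, and the variable names below are illustrative only):

```shell
# Sketch: what the date format in the -Xloggc entry evaluates to in a shell.
# %Y%m%d%H%M = 4-digit year, month, day, hour (24h), minute: 12 digits in total.
suffix=$(date +'%Y%m%d%H%M')

# Illustrative file name only; the directory placeholder from the table is left out.
gc_log_name="gc.log-$suffix"
echo "$gc_log_name"
```

A per-minute timestamp in the file name would give each JVM start its own GC log instead of overwriting the previous one.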

Note:
For Kubernetes-based deployments, garbage collection is configured via the Helm charts.