    An R program can connect to a Dremio cluster through ODBC or JDBC, so you can load a Dremio dataset or the result of a SQL query directly into an R data frame.

    Install the Dremio Connector and Dremio JDBC Driver

    Install the Dremio Connector (ODBC) or the Dremio JDBC Driver.

    Using the RODBC Package

    The RODBC package lets R programs use ODBC-compliant drivers such as Dremio’s ODBC driver. In the connection string, set HOST to the hostname or IP address of one of the coordinator nodes in your cluster.
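
    RODBC's odbcDriverConnect function takes a single connection string made of KEY=value pairs separated by semicolons. As a minimal sketch, the string can be assembled from its parts with a small helper. The function name build_dremio_conn_string, the host dremio.example.com, and the credentials are hypothetical placeholders, not part of RODBC or Dremio, and the helper assumes that no value contains a semicolon.

```r
# Hypothetical helper (not part of RODBC or Dremio): joins named settings
# into a "KEY=value;KEY=value" ODBC connection string.
# Assumption: no value contains a semicolon, so no escaping is attempted.
build_dremio_conn_string <- function(host, port = "31010", uid, pwd,
                                     driver = "Dremio Connector") {
  parts <- c(
    DRIVER = driver,
    HOST = host,
    PORT = as.character(port),
    UID = uid,
    PWD = pwd,
    AUTHENTICATIONTYPE = "Basic Authentication",
    CONNECTIONTYPE = "Direct"
  )
  # sprintf is vectorized over the setting names and values
  paste(sprintf("%s=%s", names(parts), parts), collapse = ";")
}

# Placeholder host and credentials, for illustration only
conn_string <- build_dremio_conn_string("dremio.example.com",
                                        uid = "alice", pwd = "secret")
```

    The resulting string can be passed to odbcDriverConnect in place of an inline sprintf call, which keeps the connection settings in one place.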

    The following R program loads the dataset foo.bar.baz into a data frame and prints some basic statistics about the data using R’s summary function:

    if (!require(RODBC)) { install.packages("RODBC"); library(RODBC) } # install RODBC on first use
    dremio_host <- "<hostname or IP address>"
    dremio_port <- "31010"
    dremio_uid <- "<username>"
    dremio_pwd <- "<password>"
    channel <- odbcDriverConnect(sprintf("DRIVER=Dremio Connector;HOST=%s;PORT=%s;UID=%s;PWD=%s;AUTHENTICATIONTYPE=Basic Authentication;CONNECTIONTYPE=Direct", dremio_host, dremio_port, dremio_uid, dremio_pwd))
    df <- sqlQuery(channel, "SELECT * FROM foo.bar.baz")
    if (is.character(df)) { close(channel); stop(paste(df, collapse = "\n")) } # stop if query failed
    print(nrow(df)) # print the number of records returned
    df <- df[, !vapply(df, inherits, logical(1), what = "ODBC_binary"), drop = FALSE] # remove binary columns (otherwise summary won't work)
    print(summary(df)) # print statistics
    close(channel) # release the ODBC connection
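
    The binary-column filter in the program above can be tried without a Dremio connection. The sketch below builds a small synthetic data frame whose payload column carries the ODBC_binary class that RODBC assigns to binary results, then drops that column. The column names and values are invented for illustration, and the inherits-based check is one possible way to write the filter, not the only one.

```r
# Synthetic stand-in for a query result; names and values are made up.
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

# RODBC returns binary columns as a list of raw vectors with class "ODBC_binary".
payload <- list(as.raw(1), as.raw(2), as.raw(3))
class(payload) <- "ODBC_binary"
df$payload <- payload

# Flag columns that inherit from ODBC_binary; vapply guarantees a logical vector.
is_binary <- vapply(df, inherits, logical(1), what = "ODBC_binary")

# drop = FALSE keeps the result a data frame even if only one column remains.
clean <- df[, !is_binary, drop = FALSE]
```

    Using inherits instead of comparing class names directly also works for columns that carry more than one class, such as date-time columns.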