    Source Type Configuration

    This topic lists the source types that Dremio supports and the configuration fields specific to each.

    The following source types are available:

    Source                          Source Type
    Amazon Redshift                 REDSHIFT
    Amazon S3                       S3
    Azure Data Lake Storage Gen1    ADLS
    Azure Storage                   AZURE_STORAGE
    Elasticsearch                   ELASTIC
    HDFS                            HDFS
    Hive                            HIVE
    MapR-FS                         MAPRFS
    Microsoft SQL Server            MSSQL
    MongoDB                         MONGO
    MySQL                           MYSQL
    NAS                             NAS
    Oracle                          ORACLE
    PostgreSQL                      POSTGRES
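
    As a rough sketch of how client code might use the type strings in the table, the lookup below maps each source name to its type string and rejects anything not listed. The names SOURCE_TYPES and require_known_type are hypothetical helpers for illustration and are not part of Dremio.

```python
# Source names and type strings copied from the table above.
SOURCE_TYPES = {
    "Amazon Redshift": "REDSHIFT",
    "Amazon S3": "S3",
    "Azure Data Lake Storage Gen1": "ADLS",
    "Azure Storage": "AZURE_STORAGE",
    "Elasticsearch": "ELASTIC",
    "HDFS": "HDFS",
    "Hive": "HIVE",
    "MapR-FS": "MAPRFS",
    "Microsoft SQL Server": "MSSQL",
    "MongoDB": "MONGO",
    "MySQL": "MYSQL",
    "NAS": "NAS",
    "Oracle": "ORACLE",
    "PostgreSQL": "POSTGRES",
}


def require_known_type(source_type: str) -> str:
    """Return source_type unchanged, or raise if it is not in the table."""
    if source_type not in SOURCE_TYPES.values():
        raise ValueError(f"unsupported source type: {source_type!r}")
    return source_type


print(require_known_type(SOURCE_TYPES["MongoDB"]))
```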

    Amazon Redshift

    Redshift source
    {
      "username": String,
      "password": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "connectionString": String
    }
    
    Name Type Description
    username String Redshift user name.
    password String Redshift password.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize Number Record fetch size; use 0 to have Dremio decide automatically.
    connectionString String Connection string.
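
    To make the template concrete, here is a minimal sketch that fills in the Redshift fields and checks authenticationType against the two documented values. The function name redshift_config and every value passed to it (user, password, host, database) are placeholders assumed for illustration.

```python
import json

ALLOWED_AUTH_TYPES = ("ANONYMOUS", "MASTER")


def redshift_config(username, password, connection_string,
                    authentication_type="MASTER", fetch_size=0):
    """Build a Redshift config dict from the fields documented above."""
    if authentication_type not in ALLOWED_AUTH_TYPES:
        raise ValueError(
            f"authenticationType must be one of {ALLOWED_AUTH_TYPES}, "
            f"got {authentication_type!r}"
        )
    return {
        "username": username,
        "password": password,
        "authenticationType": authentication_type,
        "fetchSize": fetch_size,  # 0 lets Dremio decide
        "connectionString": connection_string,
    }


# All values below are placeholders, not real credentials or hosts.
config = redshift_config(
    username="example_user",
    password="example_password",
    connection_string="jdbc:redshift://example-host:5439/example_db",
)
print(json.dumps(config, indent=2))
```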

    Amazon S3

    S3 source
    {
      "accessKey": String,
      "accessSecret": String,
      "secure": Boolean,
      "externalBucketList": [...String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name Type Description
    accessKey String AWS access key.
    accessSecret String AWS access secret.
    secure Boolean Whether to enable SSL encryption.
    externalBucketList Array A list of external buckets.
    propertyList Array An array of name/value pairs.
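
    The propertyList field takes an array of name/value objects rather than a plain map. The sketch below converts a Python dict into that shape. The helper to_property_list is hypothetical, the credentials and bucket name are placeholders, and the property fs.s3a.connection.maximum is used only as an illustrative name; whether a given property has any effect depends on the deployment.

```python
import json


def to_property_list(properties: dict) -> list:
    """Convert {"name": value} pairs into [{"name": ..., "value": ...}]."""
    # Values are stringified because the template declares "value" as String.
    return [{"name": name, "value": str(value)}
            for name, value in properties.items()]


# Placeholder values only; none of these are real credentials or buckets.
s3_config = {
    "accessKey": "EXAMPLE_ACCESS_KEY",
    "accessSecret": "EXAMPLE_ACCESS_SECRET",
    "secure": True,
    "externalBucketList": ["example-external-bucket"],
    "propertyList": to_property_list({"fs.s3a.connection.maximum": 100}),
}
print(json.dumps(s3_config, indent=2))
```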

    Azure Data Lake Store (Gen1)

    Azure Data Lake Store Gen1 source
    {
      "mode": "CLIENT_KEY",
      "accountName": String,
      "clientId": String,
      "clientKeyRefreshUrl": String,
      "clientKeyPassword": String
    }
    
    Name Type Description
    mode String Must be set to CLIENT_KEY.
    accountName String Name of the Azure Data Lake Store resource.
    clientId String Application ID of the registered application under Azure Active Directory.
    clientKeyRefreshUrl String Azure Active Directory OAuth 2.0 Token Endpoint for registered applications.
    clientKeyPassword String Generated password value for the registered application.

    Azure Storage

    Shared Access Key

    Azure Storage V2 or V1 source with shared access key
    "config": {
      "accountKind": String,
      "accountName": String,
      "accessKey": String,
      "enableSSL" : Boolean,
      "rootPath": String,
      "containers" : [... String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name Type Description
    accountKind String Set to either “STORAGE_V1” or “STORAGE_V2”.
    accountName String Name of the Azure storage account.
    accessKey String Shared access key of the Azure storage account.
    enableSSL Boolean Whether to enable SSL encryption.
    rootPath String Set to “/” to view all accessible containers. Alternatively, the path to a container can be specified instead of “/” to set it as the root, e.g. “/container-name”.
    containers Array A list of containers to be accessed. When specified, only containers listed will be available for access.
    propertyList Array An array of name/value pairs.
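
    The rootPath rule described above ("/" for all containers, or "/container-name" for a single one) can be sketched as a small helper. The names root_path_for and azure_shared_key_config are hypothetical, the account values are placeholders, and leaving out containers when none are given is an assumption rather than documented behavior.

```python
ACCOUNT_KINDS = ("STORAGE_V1", "STORAGE_V2")


def root_path_for(container=None):
    """Return "/" for all containers, or "/<container>" for a single one."""
    if container is None:
        return "/"
    return "/" + container.strip("/")


def azure_shared_key_config(account_name, access_key,
                            account_kind="STORAGE_V2",
                            root_container=None, containers=()):
    """Build the inner config object for a shared access key source."""
    if account_kind not in ACCOUNT_KINDS:
        raise ValueError(f"accountKind must be one of {ACCOUNT_KINDS}")
    config = {
        "accountKind": account_kind,
        "accountName": account_name,
        "accessKey": access_key,
        "enableSSL": True,
        "rootPath": root_path_for(root_container),
    }
    # Assumption: "containers" is only sent when a restriction is wanted.
    if containers:
        config["containers"] = list(containers)
    return config


# Placeholder account name and key.
config = azure_shared_key_config(
    "exampleaccount", "EXAMPLE_KEY", containers=["raw", "curated"]
)
print(config["rootPath"], config["containers"])
```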

    OAuth 2.0 Authentication

    Azure Storage V2 or V1 source with OAuth 2.0 authentication:

    Azure Storage V2 or V1 source with OAuth 2.0
    "config": {
      "accountKind": String,
      "accountName": String,
      "credentialsType": String,
      "clientId": String,
      "tokenEndpoint": String,
      "clientSecret": String,
      "enableSSL" : Boolean,
      "rootPath": String,
      "containers" : [... String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name Type Description
    accountKind String Set to either “STORAGE_V1” or “STORAGE_V2”.
    accountName String Name of the Azure storage account.
    credentialsType String Must be set to “AZURE_ACTIVE_DIRECTORY” when using OAuth 2.0.
    clientId String Application (Client) ID.
    tokenEndpoint String OAuth 2.0 token endpoint (v1.0).
    clientSecret String Secret key generated in Azure Active Directory application.
    enableSSL Boolean Must be set to true when using OAuth 2.0.
    rootPath String Set to “/” to view all accessible containers. Alternatively, the path to a container can be specified instead of “/” to set it as the root, e.g. “/container-name”.
    containers Array A list of containers to be accessed. When specified, only containers listed will be available for access.
    propertyList Array An array of name/value pairs.
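
    The table above fixes two values when OAuth 2.0 is used: credentialsType must be AZURE_ACTIVE_DIRECTORY and enableSSL must be true. A minimal sketch of a pre-flight check for those constraints follows. The function oauth_config_problems is a hypothetical helper, and the account, client ID, secret, and tenant placeholder in the token endpoint are all made-up values.

```python
def oauth_config_problems(config: dict) -> list:
    """Return a list of violations of the constraints in the table above."""
    problems = []
    if config.get("accountKind") not in ("STORAGE_V1", "STORAGE_V2"):
        problems.append("accountKind must be STORAGE_V1 or STORAGE_V2")
    if config.get("credentialsType") != "AZURE_ACTIVE_DIRECTORY":
        problems.append("credentialsType must be AZURE_ACTIVE_DIRECTORY")
    if config.get("enableSSL") is not True:
        problems.append("enableSSL must be true when using OAuth 2.0")
    return problems


# Placeholder values; <tenant-id> is intentionally left unfilled.
example = {
    "accountKind": "STORAGE_V2",
    "accountName": "exampleaccount",
    "credentialsType": "AZURE_ACTIVE_DIRECTORY",
    "clientId": "00000000-0000-0000-0000-000000000000",
    "tokenEndpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    "clientSecret": "EXAMPLE_SECRET",
    "enableSSL": True,
    "rootPath": "/",
}

print(oauth_config_problems(example))
print(oauth_config_problems(dict(example, enableSSL=False)))
```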

    Elasticsearch

    Elasticsearch source
    {
      "username": String,
      "password": String,
      "hostList": [
         {"hostname": String, "port": Number},
         ...
      ],
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "scriptsEnabled": Boolean [optional],
      "showHiddenIndices": Boolean [optional],
      "sslEnabled": Boolean [optional],
      "showIdColumn": Boolean [optional],
      "readTimeoutMillis": Number,
      "scrollTimeoutMillis": Number,
      "usePainless": Boolean [optional],
      "useWhitelist": Boolean [optional],
      "scrollSize": Number [optional]
    }
    
    Name Type Description
    username String Elasticsearch user name.
    password String Elasticsearch password.
    hostList Array A list of Elasticsearch hosts.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    scriptsEnabled Boolean Whether scripts are enabled in Elasticsearch, optional.
    showHiddenIndices Boolean Whether to show hidden indices, optional.
    sslEnabled Boolean Whether to use SSL connections, optional.
    showIdColumn Boolean Whether to show the ID column, optional.
    readTimeoutMillis Number Read timeout in milliseconds.
    scrollTimeoutMillis Number Scroll timeout in milliseconds.
    usePainless Boolean Whether to use the Painless scripting language when connecting to Elasticsearch 5.0+ (experimental), optional.
    useWhitelist Boolean Whether to only query the specified hosts in hostList, optional.
    scrollSize Number Elasticsearch scroll size, optional.
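
    Several Elasticsearch fields are marked optional. One way to handle them, sketched below, is to accept them as keyword arguments and leave out any that are unset. The function elastic_config is a hypothetical helper, and the host names, credentials, and timeout values are placeholders rather than recommended settings.

```python
# Fields marked [optional] in the template above.
OPTIONAL_FIELDS = (
    "scriptsEnabled", "showHiddenIndices", "sslEnabled", "showIdColumn",
    "usePainless", "useWhitelist", "scrollSize",
)


def elastic_config(hosts, username, password,
                   read_timeout_millis, scroll_timeout_millis,
                   authentication_type="MASTER", **optional):
    """Build an Elasticsearch config; hosts is a list of (hostname, port)."""
    unknown = sorted(set(optional) - set(OPTIONAL_FIELDS))
    if unknown:
        raise ValueError(f"not a documented optional field: {unknown}")
    config = {
        "username": username,
        "password": password,
        "hostList": [{"hostname": h, "port": p} for h, p in hosts],
        "authenticationType": authentication_type,
        "readTimeoutMillis": read_timeout_millis,
        "scrollTimeoutMillis": scroll_timeout_millis,
    }
    # Optional fields set to None are omitted from the result.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config


config = elastic_config(
    hosts=[("es-1.example.com", 9200), ("es-2.example.com", 9200)],
    username="example_user",
    password="example_password",
    read_timeout_millis=60000,
    scroll_timeout_millis=300000,
    sslEnabled=True,
    scrollSize=None,
)
print(sorted(config))
```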

    HDFS

    HDFS source
    {
      "enableImpersonation": Boolean,
      "hostname": String,
      "port": Number,
      "rootPath": String,
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name Type Description
    enableImpersonation Boolean Enable impersonation.
    hostname String HDFS server host name.
    port Number HDFS server port number.
    rootPath String Root path for the HDFS source.
    propertyList Array An array of name/value pairs.
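
    Field names are case-sensitive strings, so a misspelled key is easy to miss. The sketch below compares a config against the HDFS field names listed above and reports keys that are absent or not documented. The helper field_diff is hypothetical, the host and port are placeholders, and since this page does not say which fields are required, "absent" here only means "not present in the dict".

```python
# Field names copied from the HDFS table above.
HDFS_FIELDS = frozenset(
    {"enableImpersonation", "hostname", "port", "rootPath", "propertyList"}
)


def field_diff(config, documented=HDFS_FIELDS):
    """Return (absent, unknown) key lists relative to the documented names."""
    keys = set(config)
    absent = sorted(documented - keys)
    unknown = sorted(keys - documented)
    return absent, unknown


hdfs_config = {
    "enableImpersonation": False,
    "hostName": "namenode.example.com",  # deliberate typo for "hostname"
    "port": 8020,
    "rootPath": "/",
    "propertyList": [],
}

absent, unknown = field_diff(hdfs_config)
print("absent:", absent)
print("unknown:", unknown)
```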

    Hive

    Hive example
    {
      "hostname": String,
      "port": String,
      "kerberosPrincipal": String,
      "enableSasl": Boolean [optional],
      "propertyList": [
        {"name": String, "value": String},
        ...
      ]
    }
    
    Name Type Description
    hostname String Hive host name.
    port Number Hive port number.
    kerberosPrincipal String Kerberos principal.
    enableSasl Boolean Enable SASL, optional.
    propertyList Array An array of name/value pairs.

    MapR-FS

    MapR-FS source example
    {
      "clusterName": String,
      "enableImpersonation": Boolean,
      "secure": Boolean,
      "rootPath": String,
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name Type Description
    clusterName String Cluster name.
    enableImpersonation Boolean Enable impersonation.
    secure Boolean Whether the cluster is secure or not.
    rootPath String Root path for the MapR-FS source.
    propertyList Array An array of name/value pairs.

    Microsoft SQL Server

    Microsoft SQL Server source example
    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "database": String [optional],
      "showOnlyConnectiondatabase": Boolean [optional]
    }
    
    Name Type Description
    username String SQL Server user name.
    password String SQL Server password.
    hostname String SQL Server host name.
    port Number SQL Server port number.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize Number Record fetch size; use 0 to have Dremio decide automatically.
    database String Database name, optional.
    showOnlyConnectiondatabase Boolean Whether to show only the initial database used for connecting, optional.

    MongoDB

    MongoDB source example
    {
      "username": String,
      "password": String,
      "hostList": [
         {"hostname": String, "port": Number},
         ...
      ],
      "useSsl": Boolean,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "authDatabase": String,
      "authenticationTimeoutMillis": Number,
      "secondaryReadsOnly": Boolean,
      "subpartitionSize": Number,
      "propertyList": [
        {"name": String, "value": String},
        ...
      ]
    }
    
    Name Type Description
    username String Mongo user name.
    password String Mongo password.
    hostList Array A list of Mongo hosts.
    useSsl Boolean Force SSL connection.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    authDatabase String Authentication database.
    authenticationTimeoutMillis Number Authentication timeout in milliseconds.
    secondaryReadsOnly Boolean Read from secondaries only.
    subpartitionSize Number Number of records to be read by query fragments.
    propertyList Array An array of name/value pairs.
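
    Replica set members are often written as "host:port" strings, while hostList expects objects with separate hostname and port fields. A minimal sketch of that conversion follows. The helper parse_hosts and the host names are hypothetical, and falling back to port 27017 when none is given is an assumption made for this example.

```python
def parse_hosts(addresses, default_port=27017):
    """Turn "host" or "host:port" strings into hostList entries."""
    host_list = []
    for address in addresses:
        hostname, sep, port = address.partition(":")
        host_list.append({
            "hostname": hostname,
            # Use the explicit port when present, else the assumed default.
            "port": int(port) if sep else default_port,
        })
    return host_list


host_list = parse_hosts(["mongo-1.example.com:27018", "mongo-2.example.com"])
print(host_list)
```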

    MySQL

    MySQL source example
    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number
    }
    
    Name Type Description
    username String MySQL user name.
    password String MySQL password.
    hostname String MySQL server host name.
    port Number MySQL server port number.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize Number Record fetch size; use 0 to have Dremio decide automatically.

    NAS

    NAS source example
    {
      "path": String
    }
    
    Name Type Description
    path String Path on the filesystem to use as the root for the source.

    Oracle

    Oracle source example
    {
      "username": String,
      "password": String,
      "instance": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number
    }
    
    Name Type Description
    username String Oracle user name.
    password String Oracle password.
    instance String Oracle server SID.
    hostname String Oracle server host name.
    port Number Oracle server port number.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize Number Record fetch size; use 0 to have Dremio decide automatically.

    PostgreSQL

    PostgreSQL source example
    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "databaseName": String
    }
    
    Name Type Description
    username String Postgres user name.
    password String Postgres password.
    hostname String Postgres host name.
    port Number Postgres port number.
    authenticationType String Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize Number Record fetch size; use 0 to have Dremio decide automatically.
    databaseName String Database name.

    Deprecated Sources

    The following data source types are deprecated and no longer supported.

    • HBase

    • IBM DB2