    Source Type Configuration

    This topic lists the source types that Dremio supports. Each source type has its own configuration.

    The following source types are available:

    Source                          Source Type
    Amazon Redshift                 REDSHIFT
    Amazon S3                       S3
    Azure Data Lake Storage Gen1    ADL
    Azure Storage                   AZURE_STORAGE
    Elasticsearch                   ELASTIC
    HDFS                            HDFS
    Hive                            HIVE
    MapR-FS                         MAPRFS
    Microsoft SQL Server            MSSQL
    MongoDB                         MONGO
    MySQL                           MYSQL
    NAS                             NAS
    Oracle                          ORACLE
    PostgreSQL                      POSTGRES
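
As a rough illustration of how these identifiers might be used, the sketch below keeps the list as a Python mapping and wraps a configuration in a request body. The `name`/`type`/`config` envelope and the `build_source` helper are assumptions made for this example; this page itself only shows a `"config"` key in the Azure Storage schemas.

```python
# Source type identifiers as listed on this page.
# "ADL" follows the identifier shown under the Azure Data Lake Store (Gen1) heading.
SOURCE_TYPES = {
    "Amazon Redshift": "REDSHIFT",
    "Amazon S3": "S3",
    "Azure Data Lake Storage Gen1": "ADL",
    "Azure Storage": "AZURE_STORAGE",
    "Elasticsearch": "ELASTIC",
    "HDFS": "HDFS",
    "Hive": "HIVE",
    "MapR-FS": "MAPRFS",
    "Microsoft SQL Server": "MSSQL",
    "MongoDB": "MONGO",
    "MySQL": "MYSQL",
    "NAS": "NAS",
    "Oracle": "ORACLE",
    "PostgreSQL": "POSTGRES",
}


def build_source(name, source_type, config):
    """Wrap a config in a request envelope (assumed shape, not defined on this page)."""
    if source_type not in SOURCE_TYPES.values():
        raise ValueError(f"unknown source type: {source_type!r}")
    return {"name": name, "type": source_type, "config": dict(config)}
```

For example, `build_source("shared-files", "NAS", {"path": "/mnt/shared"})` returns a dictionary whose `type` is `NAS`, while an identifier that is not in the mapping raises `ValueError`.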

    Amazon Redshift

    REDSHIFT

    {
      "username": String,
      "password": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "connectionString": String
    }
    
    Name                  Type      Description
    username              String    Redshift user name.
    password              String    Redshift password.
    authenticationType    String    Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize             Number    Record fetch size; use 0 to have Dremio decide automatically.
    connectionString      String    Connection string.
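
A minimal sketch of assembling a REDSHIFT configuration from the five fields above. The `redshift_config` helper is hypothetical, and rejecting negative fetch sizes is an assumption of this example rather than a documented rule.

```python
VALID_AUTH_TYPES = ("ANONYMOUS", "MASTER")


def redshift_config(connection_string, username, password,
                    authentication_type="MASTER", fetch_size=0):
    """Build a REDSHIFT config dict using the five documented fields."""
    if authentication_type not in VALID_AUTH_TYPES:
        raise ValueError(f"authenticationType must be one of {VALID_AUTH_TYPES}")
    # Assumption: negative fetch sizes are not meaningful; 0 lets Dremio decide.
    if fetch_size < 0:
        raise ValueError("fetchSize must be 0 or a positive number")
    return {
        "username": username,
        "password": password,
        "authenticationType": authentication_type,
        "fetchSize": fetch_size,
        "connectionString": connection_string,
    }
```

A call such as `redshift_config("jdbc:redshift://redshift.example.com:5439/dev", "analyst", "secret")` uses a placeholder host and credentials; substitute the values for your own cluster.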

    Amazon S3

    S3

    {
      "accessKey": String,
      "accessSecret": String,
      "secure": Boolean,
      "externalBucketList": [...String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name                  Type       Description
    accessKey             String     AWS access key.
    accessSecret          String     AWS access secret.
    secure                Boolean    Whether to enable SSL encryption.
    externalBucketList    Array      A list of external buckets.
    propertyList          Array      An array of name/value pairs.
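
The `propertyList` field is an array of `{"name", "value"}` objects, which is awkward to write by hand. The sketch below converts an ordinary dictionary into that shape and uses it in an S3 configuration. Both helper names and the property name in the usage note are hypothetical.

```python
def to_property_list(properties):
    """Convert a plain dict into the name/value pair array used by propertyList."""
    return [{"name": str(k), "value": str(v)} for k, v in properties.items()]


def s3_config(access_key, access_secret, secure=True,
              external_buckets=(), properties=None):
    """Build an S3 config dict using the fields documented above."""
    return {
        "accessKey": access_key,
        "accessSecret": access_secret,
        "secure": bool(secure),
        "externalBucketList": list(external_buckets),
        "propertyList": to_property_list(properties or {}),
    }
```

For instance, `to_property_list({"example.property": 10})` yields a single pair whose value is the string `"10"`, since the schema declares both `name` and `value` as strings.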

    Azure Data Lake Store (Gen1)

    ADL

    {
      "mode": "CLIENT_KEY",
      "accountName": String,
      "clientId": String,
      "clientKeyRefreshUrl": String,
      "clientKeyPassword": String
    }
    
    Name                   Type      Description
    mode                   String    Must be set to CLIENT_KEY
    accountName            String    Name of the Azure Data Lake Store resource
    clientId               String    Application ID of the registered application under Azure Active Directory
    clientKeyRefreshUrl    String    Azure Active Directory OAuth 2.0 Token Endpoint for registered applications
    clientKeyPassword      String    Generated password value for the registered application
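
Because `mode` has a single allowed value, a helper can pin it so callers only supply the remaining four fields. This is a sketch under assumptions: the `adls_gen1_config` name is hypothetical, and the HTTPS check on the token endpoint is a sanity check added for this example, not a documented requirement.

```python
def adls_gen1_config(account_name, client_id, token_endpoint, client_key_password):
    """Build an Azure Data Lake Store (Gen1) config with mode fixed to CLIENT_KEY."""
    # Assumption: the OAuth 2.0 token endpoint is always served over HTTPS.
    if not token_endpoint.startswith("https://"):
        raise ValueError("clientKeyRefreshUrl should be an https:// URL")
    return {
        "mode": "CLIENT_KEY",
        "accountName": account_name,
        "clientId": client_id,
        "clientKeyRefreshUrl": token_endpoint,
        "clientKeyPassword": client_key_password,
    }
```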

    Azure Storage

    AZURE_STORAGE

    Shared Access Key

    Azure Storage V2 or V1 source with Shared Access Key:

    "config": {
      "accountKind": String,
      "accountName": String,
      "accessKey": String,
      "enableSSL": Boolean,
      "rootPath": String,
      "containers": [...String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name            Type       Description
    accountKind     String     Set to either “STORAGE_V1” or “STORAGE_V2”.
    accountName     String     Name of the Azure storage account.
    accessKey       String     Shared access key of the Azure storage account.
    enableSSL       Boolean    Whether to enable SSL encryption.
    rootPath        String     Set to “/” to view all accessible containers. Alternatively, the path to a container can be specified instead of “/” to set it as the root, e.g. “/container-name”.
    containers      Array      A list of containers to be accessed. When specified, only containers listed will be available for access.
    propertyList    Array      An array of name/value pairs.
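
The shared access key variant can be sketched as a small builder that enforces the two documented constraints: `accountKind` must be one of two values, and `rootPath` is either `/` or a container path beginning with `/`. The helper name and its defaults are assumptions for this example.

```python
ACCOUNT_KINDS = ("STORAGE_V1", "STORAGE_V2")


def azure_shared_key_config(account_name, access_key, account_kind="STORAGE_V2",
                            root_path="/", containers=(), enable_ssl=True):
    """Build an AZURE_STORAGE config that authenticates with a shared access key."""
    if account_kind not in ACCOUNT_KINDS:
        raise ValueError(f"accountKind must be one of {ACCOUNT_KINDS}")
    if not root_path.startswith("/"):
        raise ValueError('rootPath must be "/" or a path such as "/container-name"')
    return {
        "accountKind": account_kind,
        "accountName": account_name,
        "accessKey": access_key,
        "enableSSL": bool(enable_ssl),
        "rootPath": root_path,
        "containers": list(containers),
        "propertyList": [],
    }
```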

    OAuth 2.0 Authentication

    Azure Storage V2 or V1 source with OAuth 2.0 authentication:

    "config": {
      "accountKind": String,
      "accountName": String,
      "credentialsType": String,
      "clientId": String,
      "tokenEndpoint": String,
      "clientSecret": String,
      "enableSSL": Boolean,
      "rootPath": String,
      "containers": [...String],
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name               Type       Description
    accountKind        String     Set to either “STORAGE_V1” or “STORAGE_V2”.
    accountName        String     Name of the Azure storage account.
    credentialsType    String     Must be set to “AZURE_ACTIVE_DIRECTORY” when using OAuth 2.0.
    clientId           String     Application (Client) ID.
    tokenEndpoint      String     OAuth 2.0 token endpoint (v1.0).
    clientSecret       String     Secret key generated in Azure Active Directory application.
    enableSSL          Boolean    Must be set to true when using OAuth 2.0.
    rootPath           String     Set to “/” to view all accessible containers. Alternatively, the path to a container can be specified instead of “/” to set it as the root, e.g. “/container-name”.
    containers         Array      A list of containers to be accessed. When specified, only containers listed will be available for access.
    propertyList       Array      An array of name/value pairs.
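
The OAuth 2.0 variant adds two hard requirements: `credentialsType` must be `AZURE_ACTIVE_DIRECTORY` and `enableSSL` must be true. The sketch below checks an existing config dictionary against those rules and reports every problem found. The function name is hypothetical, and treating empty client fields as errors is an assumption.

```python
def check_azure_oauth_config(config):
    """Return a list of problems found in an Azure Storage OAuth 2.0 config."""
    problems = []
    if config.get("credentialsType") != "AZURE_ACTIVE_DIRECTORY":
        problems.append("credentialsType must be AZURE_ACTIVE_DIRECTORY")
    if config.get("enableSSL") is not True:
        problems.append("enableSSL must be true when using OAuth 2.0")
    # Assumption: the three client fields must be present and non-empty.
    for field in ("clientId", "tokenEndpoint", "clientSecret"):
        if not config.get(field):
            problems.append(f"{field} is missing")
    return problems
```

An empty list means the dictionary passed these checks; it does not guarantee that the service will accept the credentials.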

    Elasticsearch

    ELASTIC

    {
      "username": String,
      "password": String,
      "hostList": [
         {"hostname": String, "port": Number},
         ...
      ],
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "scriptsEnabled": Boolean [optional],
      "showHiddenIndices": Boolean [optional],
      "sslEnabled": Boolean [optional],
      "showIdColumn": Boolean [optional],
      "readTimeoutMillis": Number,
      "scrollTimeoutMillis": Number,
      "usePainless": Boolean [optional],
      "useWhitelist": Boolean [optional],
      "scrollSize": Number [optional]
    }
    
    Name                   Type       Description
    username               String     Elasticsearch user name.
    password               String     Elasticsearch password.
    hostList               Array      A list of Elasticsearch hosts.
    authenticationType     String     Authentication type to use; must be either ANONYMOUS or MASTER.
    scriptsEnabled         Boolean    Whether scripts are enabled in Elasticsearch, optional.
    showHiddenIndices      Boolean    Whether to show hidden indices, optional.
    sslEnabled             Boolean    Whether to use SSL connections, optional.
    showIdColumn           Boolean    Whether to show the ID column, optional.
    readTimeoutMillis      Number     Read timeout in milliseconds.
    scrollTimeoutMillis    Number     Scroll timeout in milliseconds.
    usePainless            Boolean    Whether to use the Painless scripting language when connecting to Elasticsearch 5.0+ (experimental), optional.
    useWhitelist           Boolean    Whether to only query the specified hosts in hostList, optional.
    scrollSize             Number     Elasticsearch scroll size, optional.
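
Since seven of the Elasticsearch fields are optional, one way to build the configuration is to send only the optional fields that were actually supplied. The sketch below does that and rejects optional names that are not documented above. The `elastic_config` helper and its argument order are assumptions.

```python
OPTIONAL_FIELDS = {
    "scriptsEnabled", "showHiddenIndices", "sslEnabled", "showIdColumn",
    "usePainless", "useWhitelist", "scrollSize",
}


def elastic_config(hosts, username, password, read_timeout_millis,
                   scroll_timeout_millis, authentication_type="MASTER", **optional):
    """Build an ELASTIC config; hosts is an iterable of (hostname, port) pairs."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown optional fields: {sorted(unknown)}")
    config = {
        "username": username,
        "password": password,
        "hostList": [{"hostname": h, "port": p} for h, p in hosts],
        "authenticationType": authentication_type,
        "readTimeoutMillis": read_timeout_millis,
        "scrollTimeoutMillis": scroll_timeout_millis,
    }
    # Leave out optional fields that were passed as None.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config
```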

    HDFS

    HDFS

    {
      "enableImpersonation": Boolean,
      "hostname": String,
      "port": Number,
      "rootPath": String,
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name                   Type       Description
    enableImpersonation    Boolean    Enable impersonation.
    hostname               String     HDFS server host name.
    port                   Number     HDFS server port number.
    rootPath               String     Root path for the HDFS source.
    propertyList           Array      An array of name/value pairs.

    Hive

    HIVE

    {
      "hostname": String,
      "port": String,
      "kerberosPrincipal": String,
      "enableSasl": Boolean [optional],
      "propertyList": [
        {"name": String, "value": String},
        ...
      ]
    }
    
    Name                 Type       Description
    hostname             String     Hive host name.
    port                 Number     Hive port number.
    kerberosPrincipal    String     Kerberos principal.
    enableSasl           Boolean    Enable SASL, optional.
    propertyList         Array      An array of name/value pairs.

    MapR-FS

    MAPRFS

    {
      "clusterName": String,
      "enableImpersonation": Boolean,
      "secure": Boolean,
      "rootPath": String,
      "propertyList": [
         {"name": String, "value": String},
         ...
      ]
    }
    
    Name                   Type       Description
    clusterName            String     Cluster name.
    enableImpersonation    Boolean    Enable impersonation.
    secure                 Boolean    Whether the cluster is secure or not.
    rootPath               String     Root path for the MapR-FS source.
    propertyList           Array      An array of name/value pairs.

    Microsoft SQL Server

    MSSQL

    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "database": String [optional],
      "showOnlyConnectiondatabase": Boolean [optional]
    }
    
    Name                          Type       Description
    username                      String     SQL Server user name.
    password                      String     SQL Server password.
    hostname                      String     SQL Server host name.
    port                          Number     SQL Server port number.
    authenticationType            String     Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize                     Number     Record fetch size; use 0 to have Dremio decide automatically.
    database                      String     Database name, optional.
    showOnlyConnectiondatabase    Boolean    Show only the initial database used for connecting, optional.

    MongoDB

    MONGO

    {
      "username": String,
      "password": String,
      "hostList": [
         {"hostname": String, "port": Number},
         ...
      ],
      "useSsl": Boolean,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "authDatabase": String,
      "authenticationTimeoutMillis": Number,
      "secondaryReadsOnly": Boolean,
      "subpartitionSize": Number,
      "propertyList": [
        {"name": String, "value": String},
        ...
      ]
    }
    
    Name                           Type       Description
    username                       String     Mongo user name.
    password                       String     Mongo password.
    hostList                       Array      A list of Mongo hosts.
    useSsl                         Boolean    Force SSL connection.
    authenticationType             String     Authentication type to use; must be either ANONYMOUS or MASTER.
    authDatabase                   String     Authentication database.
    authenticationTimeoutMillis    Number     Authentication timeout in milliseconds.
    secondaryReadsOnly             Boolean    Read from secondaries only.
    subpartitionSize               Number     Number of records to be read by query fragments.
    propertyList                   Array      An array of name/value pairs.
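
Mongo hosts are often written as `host:port` strings, while `hostList` expects objects with separate `hostname` and `port` fields. The sketch below converts one form into the other. The helper names are hypothetical, the fallback to port 27017 (MongoDB's default) is an assumption of this example, and IPv6 literals are not handled.

```python
def parse_host(entry, default_port=27017):
    """Turn "host" or "host:port" into a hostList entry."""
    host, sep, port = entry.rpartition(":")
    if not sep:
        # No colon present, so the whole string is the host name.
        return {"hostname": entry, "port": default_port}
    return {"hostname": host, "port": int(port)}


def mongo_host_list(entries):
    """Convert an iterable of host strings into the hostList array."""
    return [parse_host(e) for e in entries]
```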

    MySQL

    MYSQL

    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number
    }
    
    Name                  Type      Description
    username              String    MySQL user name.
    password              String    MySQL password.
    hostname              String    MySQL server host name.
    port                  Number    MySQL server port number.
    authenticationType    String    Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize             Number    Record fetch size; use 0 to have Dremio decide automatically.

    NAS

    NAS

    {
      "path": String
    }
    
    Name    Type      Description
    path    String    Path on the filesystem to use as the root for the source.

    Oracle

    ORACLE

    {
      "username": String,
      "password": String,
      "instance": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number
    }
    
    Name                  Type      Description
    username              String    Oracle user name.
    password              String    Oracle password.
    instance              String    Oracle server SID.
    hostname              String    Oracle server host name.
    port                  Number    Oracle server port number.
    authenticationType    String    Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize             Number    Record fetch size; use 0 to have Dremio decide automatically.

    PostgreSQL

    POSTGRES

    {
      "username": String,
      "password": String,
      "hostname": String,
      "port": String,
      "authenticationType": String ["ANONYMOUS", "MASTER"],
      "fetchSize": Number,
      "databaseName": String
    }
    
    Name                  Type      Description
    username              String    Postgres user name.
    password              String    Postgres password.
    hostname              String    Postgres host name.
    port                  Number    Postgres port number.
    authenticationType    String    Authentication type to use; must be either ANONYMOUS or MASTER.
    fetchSize             Number    Record fetch size; use 0 to have Dremio decide automatically.
    databaseName          String    Database name.
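
To show how such a configuration might be serialized, the sketch below builds a POSTGRES config and renders it as JSON with the standard `json` module. The helper names are hypothetical. The schema above declares `port` as String while the field table describes it as a Number; this example follows the schema and stringifies the port, which is an assumption.

```python
import json


def postgres_config(hostname, port, database_name, username, password,
                    authentication_type="MASTER", fetch_size=0):
    """Build a POSTGRES config dict using the seven documented fields."""
    return {
        "username": username,
        "password": password,
        "hostname": hostname,
        # Assumption: follow the schema, which lists port as String.
        "port": str(port),
        "authenticationType": authentication_type,
        "fetchSize": fetch_size,
        "databaseName": database_name,
    }


def to_json(config):
    """Serialize a config dict as indented JSON text."""
    return json.dumps(config, indent=2)
```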

    Deprecated Sources

    The following data source types are deprecated and no longer supported.

    • HBase

    • IBM DB2