    MongoDB

    Supported Versions

    • Dremio supports MongoDB 3.6 through 4.2.

    Limitations

    Queries that un-nest nested fields are not allowed because they would produce incorrect schemas. This limitation can usually be worked around by pushing filters into the subquery or by not referencing the alias.

    Dremio Configuration

    General

    Connection

    Name     Description
    Hosts    A list of Mongo hosts. If MongoDB is sharded, enter the mongos hosts. Otherwise, enter the mongod host.
    Port     A list of Mongo port numbers. Defaults to 27017.
    • Encrypt connection – Forces an encrypted connection over SSL.
    • Read from secondaries only – Disables reading from primaries. Might degrade performance.

    Authentication

    • No authentication method
    • Master Authentication method (default)
      • Username – MongoDB user name
      • Password – MongoDB password
    • Authentication database – Database to authenticate against.
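
The connection and authentication settings above correspond closely to options in a standard MongoDB connection string. The sketch below shows one way those values could be combined into such a string. The function `build_mongo_uri` and its parameter names are hypothetical, and the mapping of "Encrypt connection" to `ssl=true`, "Read from secondaries only" to `readPreference=secondary`, and "Authentication database" to `authSource` is an assumption for illustration; the source does not describe how Dremio builds its connection internally.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(hosts, username=None, password=None, auth_db=None,
                    encrypt=False, secondaries_only=False, default_port=27017):
    """Assemble a MongoDB connection string from source-form-style values.

    `hosts` is a list of (host, port) tuples; a port of None falls back
    to the default of 27017. This is an illustrative helper, not Dremio code.
    """
    host_list = ",".join(f"{h}:{p or default_port}" for h, p in hosts)

    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    credentials = ""
    if username is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password or '')}@"

    options = {}
    if encrypt:
        options["ssl"] = "true"  # assumed mapping for "Encrypt connection"
    if secondaries_only:
        options["readPreference"] = "secondary"  # assumed mapping
    if username is not None and auth_db:
        options["authSource"] = auth_db  # assumed mapping

    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{credentials}{host_list}/{query}"


if __name__ == "__main__":
    # Two mongos hosts for a sharded cluster; host names are placeholders.
    print(build_mongo_uri(
        [("mongos1.example.com", None), ("mongos2.example.com", 27018)],
        username="dremio", password="p@ss", auth_db="admin",
        encrypt=True, secondaries_only=True,
    ))
```

A string like this can be handy for checking the same hosts and credentials with a MongoDB client before entering them in the source form.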

    Advanced Options

    • Subpartition size – Number of records to be read by query fragments. This option can be used to increase query parallelism.
    • Auth Timeout (millis) – Authentication timeout in milliseconds.
    • Connection Properties – A list of additional MongoDB connection parameters.
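
To see why a smaller subpartition size can increase parallelism, a rough estimate helps. The sketch below assumes each query fragment reads at most one subpartition's worth of records, so the number of fragments is the record count divided by the subpartition size, rounded up. This is a simplification for illustration only; the function name is hypothetical and the actual split logic is not described here.

```python
def estimated_fragments(record_count, subpartition_size):
    """Estimate how many read fragments a collection might be split into.

    Assumes each fragment reads at most `subpartition_size` records.
    """
    if subpartition_size <= 0:
        raise ValueError("subpartition_size must be positive")
    # Ceiling division using integers only.
    return -(-record_count // subpartition_size)


if __name__ == "__main__":
    for size in (500_000, 250_000, 100_000):
        print(size, estimated_fragments(1_000_000, size))
```

Under this assumption, halving the subpartition size roughly doubles the number of fragments that can read in parallel.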

    Reflection Refresh

    • Never refresh – Select to prevent refreshes; otherwise, specify how often to refresh in hours, days, or weeks.
    • Never expire – Select to prevent expiration; otherwise, specify how often to expire in hours, days, or weeks.

    Metadata

    Dataset Handling

    • Remove dataset definitions if underlying data is unavailable (Default).
      If this box is not checked and the underlying files under a folder are removed or the folder/source is not accessible, Dremio does not remove the dataset definitions. This option is useful in cases when files are temporarily deleted and put back in place with new sets of files.

    Metadata Refresh

    • Dataset Discovery – Refresh interval for top-level source object names such as names of DBs and tables.
      • Fetch every – Specify fetch time based on minutes, hours, days, or weeks. Default: 1 hour
    • Dataset Details – The metadata that Dremio needs for query planning such as information needed for fields, types, shards, statistics, and locality.
      • Fetch mode – Specify either Only Queried Datasets, All Datasets, or As Needed. Default: Only Queried Datasets
        • Only Queried Datasets – Dremio updates details for previously queried objects in a source.
          This mode increases query performance because less work is needed at query time for these datasets.
        • All Datasets – Dremio updates details for all datasets in a source. This mode increases query performance because less work is needed at query time.
        • As Needed – Dremio updates details for a dataset at query time. This mode minimizes metadata queries on a source when not used, but might lead to longer planning times.
      • Fetch every – Specify fetch time based on minutes, hours, days, or weeks. Default: 1 hour
      • Expire after – Specify expiration time based on minutes, hours, days, or weeks. Default: 3 hours
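
The relationship between the "Fetch every" and "Expire after" defaults can be illustrated with a small timing sketch. The state names (`fresh`, `due for refresh`, `expired`) and the treatment of exact boundaries are assumptions made for illustration, not a description of how Dremio schedules refreshes.

```python
from datetime import datetime, timedelta

FETCH_EVERY = timedelta(hours=1)   # default from the list above
EXPIRE_AFTER = timedelta(hours=3)  # default from the list above


def metadata_state(last_fetched, now,
                   fetch_every=FETCH_EVERY, expire_after=EXPIRE_AFTER):
    """Classify cached dataset details by age (hypothetical helper)."""
    age = now - last_fetched
    if age >= expire_after:
        return "expired"          # too old to use; must be fetched again
    if age >= fetch_every:
        return "due for refresh"  # still usable, but a refresh is owed
    return "fresh"


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 12, 0)
    for minutes in (30, 90, 180):
        print(minutes, metadata_state(last, last + timedelta(minutes=minutes)))
```

With the defaults, details that miss two consecutive hourly refreshes are still usable, and they expire only once a full three hours have passed without a successful fetch.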

    Sharing

    You can specify which users can edit. Options include:

    • All users can edit.
    • Specific users can edit.

    For More Information