Catalogs

A catalog is a metadata repository that maintains information about datasets and their structure. Catalogs enable consistent discovery, organization, and query execution by storing metadata independently of the underlying data.

A catalog typically includes:

  • Tables and views – definitions, schemas, columns, and data types
  • Data locations – where files are stored in object storage
  • Metadata – partitioning, file formats, and statistics
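As a rough illustration of the list above, the sketch below models a catalog entry as plain Python data classes. The class and field names (`Column`, `TableEntry`, `Catalog`) and the sample table are hypothetical and are not Dremio's internal representation; the point is only that schema, location, and layout metadata can be read without opening any data files.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class TableEntry:
    """One catalog entry: what an engine needs to know before reading data files."""
    name: str
    columns: list[Column]
    location: str                      # where the files live in object storage
    file_format: str = "parquet"
    partition_by: list[str] = field(default_factory=list)
    statistics: dict[str, int] = field(default_factory=dict)


@dataclass
class Catalog:
    tables: dict[str, TableEntry] = field(default_factory=dict)

    def register(self, table: TableEntry) -> None:
        self.tables[table.name] = table

    def describe(self, name: str) -> list[tuple[str, str]]:
        """Return (column, type) pairs from metadata only; no data is read."""
        return [(c.name, c.data_type) for c in self.tables[name].columns]


# Hypothetical table; the bucket and names are placeholders.
catalog = Catalog()
catalog.register(
    TableEntry(
        name="sales.orders",
        columns=[Column("order_id", "bigint"), Column("order_date", "date")],
        location="s3://example-bucket/warehouse/sales/orders",
        partition_by=["order_date"],
        statistics={"row_count": 1000},
    )
)
schema = catalog.describe("sales.orders")
```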

Dremio can connect to external catalogs to provide a unified metadata layer across platforms. This allows users to query existing datasets in place, without data movement, while preserving a single source of truth for metadata management.

Catalog Comparison

Catalog                | Provider   | Read | Write | Vended Credentials | Best For
Dremio's Open Catalog  | Dremio     |      |       |                    | Open lakehouses
Iceberg REST Catalog   | Various    |      |       | Varies             | Third-party Iceberg catalogs
Snowflake Open Catalog | Snowflake  |      | ✔*    |                    | Snowflake-managed Iceberg tables
Unity Catalog          | Databricks |      |       |                    | Databricks Delta Lake with UniForm
AWS Glue Data Catalog  | AWS        |      |       |                    | AWS-native Iceberg environments

*Write supported for external catalogs only

Dremio's Open Catalog

Every Dremio project includes a built-in Open Catalog for managing your Iceberg tables. You can also connect to catalogs from other projects in your organization for cross-project collaboration.

Key features:

  • Open and standards-based catalog for Apache Iceberg
  • Automatic table maintenance with compaction and vacuuming
  • Built-in access controls enforced at the catalog level
  • Multi-engine compatibility via the Iceberg REST API

Best for: Teams working with Apache Iceberg who want automated maintenance and multi-engine access without vendor lock-in.

AWS Glue Data Catalog

Connect to AWS Glue's managed metadata catalog for accessing Iceberg tables stored in Amazon S3.

Key features:

  • Native integration with the AWS ecosystem
  • Managed metadata storage and schema management
  • Support for both read and write operations
  • Integration with AWS Lake Formation for fine-grained access control

Best for: AWS-native environments that use Glue for metadata management and want to query Iceberg tables with Dremio.
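Glue stores many table types, so a client usually needs to tell Iceberg tables apart from the rest. The sketch below assumes the common convention in which Iceberg tables registered in Glue carry `table_type=ICEBERG` and a `metadata_location` entry in the table's `Parameters` map. The helper name and the sample entries are hypothetical, and this is not a description of how Dremio itself inspects Glue.

```python
from typing import Optional


def iceberg_metadata_location(glue_table: dict) -> Optional[str]:
    """Return the Iceberg metadata file location, or None for non-Iceberg tables."""
    params = glue_table.get("Parameters") or {}
    if params.get("table_type", "").upper() != "ICEBERG":
        return None
    return params.get("metadata_location")


# Hypothetical entries shaped like the "Table" object of a Glue GetTable response.
# With boto3 and valid AWS credentials, such an object would come from:
#   boto3.client("glue").get_table(DatabaseName="sales", Name="orders")["Table"]
iceberg_table = {
    "Name": "orders",
    "DatabaseName": "sales",
    "Parameters": {
        "table_type": "ICEBERG",
        "metadata_location": "s3://example-bucket/sales/orders/metadata/00001.metadata.json",
    },
}
hive_table = {
    "Name": "legacy_orders",
    "DatabaseName": "sales",
    "Parameters": {"classification": "parquet"},
}

iceberg_location = iceberg_metadata_location(iceberg_table)
hive_location = iceberg_metadata_location(hive_table)
```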

Iceberg REST Catalog

Connect to any Iceberg catalog that implements the Iceberg REST API specification, including Apache Polaris, AWS Glue Data Catalog, Snowflake Open Catalog, Amazon S3 Tables, and Confluent Tableflow.

Key features:

  • Universal compatibility with REST-compliant Iceberg catalogs
  • Support for multiple authentication mechanisms
  • Flexible storage credential management
  • Connect to on-premises Dremio clusters for hybrid cloud analytics

Best for: Connecting to Iceberg catalogs from other vendors or on-premises systems.
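What makes these catalogs interchangeable is that they expose the same HTTP routes. The sketch below builds a few of the routes defined by the Iceberg REST specification: catalog configuration, namespace listing, table listing, and table loading. The base URI, prefix, and names are placeholders, and the helper `rest_paths` is hypothetical; a real client would also add authentication headers and read the prefix from the configuration response.

```python
from urllib.parse import quote


def rest_paths(base_uri: str, namespace: list[str], table: str, prefix: str = "") -> dict[str, str]:
    """Build common Iceberg REST catalog URLs for a namespace and table."""
    root = base_uri.rstrip("/") + "/v1"
    # Some catalogs scope every route under a prefix returned by /v1/config.
    scoped = f"{root}/{prefix.strip('/')}" if prefix else root
    # Multi-level namespaces are joined with the unit separator (0x1F), URL-encoded.
    ns = quote("\x1f".join(namespace), safe="")
    return {
        "config": f"{root}/config",
        "namespaces": f"{scoped}/namespaces",
        "tables": f"{scoped}/namespaces/{ns}/tables",
        "table": f"{scoped}/namespaces/{ns}/tables/{quote(table, safe='')}",
    }


# Placeholder endpoint and names, for illustration only.
paths = rest_paths(
    "https://catalog.example.com/api/",
    namespace=["sales", "eu"],
    table="orders",
    prefix="main",
)
```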

Snowflake Open Catalog

Connect to Snowflake's managed service for Apache Polaris to read and write Iceberg tables across Snowflake and other compatible engines.

Key features:

  • Read from internal and external Snowflake Open Catalogs
  • Write to external Snowflake Open Catalogs
  • Credential vending for secure storage access
  • Support for AWS, Azure, and GCS storage

Best for: Organizations using Snowflake that want to query Iceberg tables with Dremio while leveraging Snowflake's catalog management.
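With credential vending, the client asks the catalog for short-lived storage credentials instead of holding long-lived storage keys. As a minimal sketch of what that looks like from a generic Iceberg client (not Dremio's own source configuration), the function below assembles REST catalog properties in the form PyIceberg accepts, including the `X-Iceberg-Access-Delegation` header that requests vended credentials. The URI, warehouse name, client ID, secret, and default scope are placeholder assumptions; check your catalog's documentation for the real values.

```python
def rest_catalog_properties(
    uri: str,
    warehouse: str,
    client_id: str,
    client_secret: str,
    scope: str = "PRINCIPAL_ROLE:ALL",  # assumed Polaris-style scope; adjust as needed
) -> dict[str, str]:
    """Assemble PyIceberg-style properties for a REST catalog with vended credentials."""
    return {
        "type": "rest",
        "uri": uri,
        "warehouse": warehouse,
        # OAuth2 client credentials in "id:secret" form.
        "credential": f"{client_id}:{client_secret}",
        "scope": scope,
        # Ask the catalog to return temporary storage credentials with each table.
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    }


# Placeholder values, for illustration only.
props = rest_catalog_properties(
    uri="https://catalog.example.com/api/catalog",
    warehouse="example_catalog",
    client_id="example-id",
    client_secret="example-secret",
)

# With PyIceberg installed and a reachable catalog, these properties could be used as:
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("open_catalog", **props)
```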

Unity Catalog

Connect to Databricks Unity Catalog to query Delta Lake tables through the UniForm Iceberg compatibility layer.

Key features:

  • Read Delta Lake tables via the UniForm Iceberg metadata layer
  • Integration with Databricks governance and security
  • Support for AWS, Azure, and GCS storage
  • Credential vending for secure access

Best for: Databricks users who want to query Delta Lake tables with Dremio using the UniForm compatibility layer.