Data Branching and Versioning

In addition to providing core data management capabilities, the Arctic catalog that is deployed with each Sonar project lets you manage your data with the same best practices that Git brings to software development. For example, you can:

  • Create branches to make physical changes to data without disrupting production workloads and without requiring separate dev/test environments
  • Merge changes from a development branch into production only when data quality has been validated
  • Immediately undo changes and recover from mistakes
  • Reproduce models and analyses with catalog-level time travel
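
The four capabilities above can be illustrated with a small conceptual sketch. This is a toy in-memory model written for illustration, not the Arctic API: the `ToyCatalog` class, its method names, and the `orders-v1`/`orders-v2` state labels are all hypothetical. It assumes a catalog is a chain of immutable commits, each mapping table names to a table state, and that a branch is just a named pointer to one commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """An immutable catalog snapshot: table name -> label of a table state."""
    id: int
    parent: Optional[int]
    tables: Dict[str, str]


class ToyCatalog:
    """Hypothetical in-memory model of a Git-like catalog (not the Arctic API)."""

    def __init__(self) -> None:
        self.commits: List[Commit] = [Commit(0, None, {})]
        self.branches: Dict[str, int] = {"main": 0}

    def create_branch(self, name: str, source: str = "main") -> None:
        # A branch is only a named pointer, so no data is copied.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, table: str, state: str) -> int:
        # Record a new snapshot on top of the branch head and advance the branch.
        head = self.commits[self.branches[branch]]
        new = Commit(len(self.commits), head.id, {**head.tables, table: state})
        self.commits.append(new)
        self.branches[branch] = new.id
        return new.id

    def merge(self, source: str, target: str = "main") -> None:
        # Fast-forward only: the target head must be an ancestor of the source head.
        cursor: Optional[int] = self.branches[source]
        while cursor is not None and cursor != self.branches[target]:
            cursor = self.commits[cursor].parent
        if cursor is None:
            raise ValueError("branches have diverged; this toy model cannot merge them")
        self.branches[target] = self.branches[source]

    def rollback(self, branch: str, commit_id: int) -> None:
        # Undo by moving the branch pointer back; old commits are never rewritten.
        self.branches[branch] = commit_id

    def read(self, table: str, branch: str = "main",
             at: Optional[int] = None) -> Optional[str]:
        # Time travel: read from a specific commit instead of the branch head.
        commit_id = self.branches[branch] if at is None else at
        return self.commits[commit_id].tables.get(table)


catalog = ToyCatalog()
v1 = catalog.commit("main", "orders", "orders-v1")
catalog.create_branch("etl")                    # isolate development work
catalog.commit("etl", "orders", "orders-v2")    # main still reads orders-v1
catalog.merge("etl")                            # publish once validated
catalog.rollback("main", v1)                    # undo the merge
```

Because every operation in this sketch only moves a pointer, branching, merging, and rolling back cost the same regardless of data size. That is the property the list above relies on, under the assumptions stated.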

This enables you to eliminate the infrastructure costs of duplicated environments and pipelines, give line-of-business users immediate access to fresh data, and roll back mistakes immediately without data downtime. Data versioning also enables data analysts and data scientists to run experiments and models on their entire lakehouse without disturbing production workloads.

To learn more about data branching, including core concepts, how it works, and sample use cases, visit the Arctic documentation.