Build Your First Agentic Lakehouse Use Case
This guide will help you turn a business question into a working, governed data product using your own data, all within your 30-day Dremio Cloud trial.
In the Getting Started guide, you saw how Dremio's AI Agent can take you from question to insight within minutes using sample data. In this guide, you'll connect your own data, prepare and transform it, build reusable views with semantics, and deliver insights using Dremio's AI Agent or your preferred tool. The end goal is flexible: you might produce an aggregated view that analysts and AI agents query regularly, or a dashboard. Either way, you’ll experience the full value of Dremio Cloud as an agentic lakehouse: open, governed, and self-optimizing.
Prerequisites
Before you begin, ensure that you have the following:
- A Dremio Cloud account: You'll need an active Dremio Cloud trial account. If you haven't already, sign up at dremio.com/get-started for a 30-day trial with $400 in free credits.
- Access to data: Identify at least one data source you can connect to, such as object storage or a database. If you don't have access to one, you will need a set of local files that you can upload to Dremio Cloud.
- (Optional) Access to your BI tool: If your use case requires a dashboard, you will need access to your BI tool. This is optional, as you can use Dremio's AI Agent to generate basic charts to visualize trends directly within the Dremio console.
Step 1: Identify a Business Use Case
Begin by identifying the business use case that you will implement with this guide. Choose one with clear value and measurable results. It does not need to be a massive data project; it should be a business question that matters.
How to Do It
Pick a concrete business question, such as:
- How are customer support metrics trending this quarter?
- Which product lines are driving margin growth?
- What are our top churn risks by region?
We recommend that you select a business question that can be answered using a few datasets spanning at most two sources.
Step 2: Add Your Data
To implement the data model that answers the business question you identified in the Identify a Business Use Case section, you will first add your data to the project in your Dremio Cloud account.
How to Do It
You can add data to your project in one of three ways:
Load Data into the Open Catalog: Dremio provides a default Open Catalog, powered by Apache Polaris. You can load data directly into this catalog as an Iceberg table in your silver layer using your tool of choice, such as Fivetran, dbt, or Airbyte. From there, Dremio manages all Iceberg table metadata and governance while keeping your data in an open format. For instructions on how to load data into your Open Catalog, see Load Data into Tables.
Connect an Existing Source: Connect your object stores, catalogs, or databases so Dremio can query data in place. This is your bronze layer of data. For a list of supported sources and step-by-step connection instructions, see Connect to Your Data.
Upload Local Files: Upload local files (CSV, JSON, or Parquet) for quick exploration if you don't have direct access to your data sources. Dremio will write the uploaded data into an Iceberg table in your project's Open Catalog. For step-by-step instructions on how to upload files, see Upload Local Files.
Whichever method you choose, Dremio provides live, federated access to all of your data. This flexibility allows you to move from data connection to analysis in minutes.
Step 3: Clean and Transform Data
Dremio lets you prepare data from different sources without having to move it. You can generate SQL from natural language with Dremio's AI Agent or write the SQL yourself. Your data preparation steps can be represented as views; no additional pipelines are required.
How to Do It
Use SQL and AI Functions: Prepare and transform data using SQL Functions. You can also turn unstructured data, such as images or PDFs, into a structured, governed Iceberg table using AI Functions.
Use Dremio's AI Agent: Ask the built-in AI Agent to identify issues with the data and generate SQL to prepare and transform it. For example, you can ask the AI Agent to:
- Generate SQL to remove null values in the revenue column.
- Generate SQL to join orders and customers on customerID.
- Add a column for gross margin = revenue - cost.
Each transformation can be saved as a view in your silver layer, giving you reusable building blocks. Because a view is a saved query rather than a copy of the data, its results reflect new data as it arrives, with no additional changes required from you. This approach replaces complex ETL pipelines with a simple workflow that keeps your data fresh, governed, and easy to iterate on. For instructions on how to create views, see Create a View.
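The three example requests above (remove null revenue values, join orders and customers on customerID, add a gross margin column) can be sketched as a single view. This is a minimal illustration that uses Python's built-in sqlite3 module as a local stand-in, not Dremio itself; the table names, column names other than customerID, revenue, and cost, the view name orders_clean, and the sample rows are all assumptions made for the example. The SQL that Dremio's AI Agent generates for your data may differ.

```python
import sqlite3

# Local in-memory database used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample tables and rows (assumptions, not from the guide).
CREATE TABLE customers (customerID INTEGER, region TEXT);
CREATE TABLE orders (orderID INTEGER, customerID INTEGER, revenue REAL, cost REAL);
INSERT INTO customers VALUES (1, 'EMEA'), (2, 'APAC');
INSERT INTO orders VALUES
    (100, 1, 250.0, 100.0),
    (101, 2, NULL, 40.0),
    (102, 2, 80.0, 30.0);

-- Silver-layer style view: drop rows with NULL revenue, join orders to
-- customers on customerID, and add gross margin = revenue - cost.
CREATE VIEW orders_clean AS
SELECT o.orderID, o.customerID, c.region, o.revenue, o.cost,
       o.revenue - o.cost AS gross_margin
FROM orders AS o
JOIN customers AS c ON o.customerID = c.customerID
WHERE o.revenue IS NOT NULL;
""")

rows = conn.execute(
    "SELECT orderID, region, gross_margin FROM orders_clean ORDER BY orderID"
).fetchall()
print(rows)

# The view is a saved query, so a newly inserted order shows up
# without redefining the view.
conn.execute("INSERT INTO orders VALUES (103, 1, 60.0, 45.0)")
row_count = conn.execute("SELECT COUNT(*) FROM orders_clean").fetchone()[0]
print(row_count)
```

Keeping the cleanup, join, and derived column in one view means downstream views can build on orders_clean without repeating that logic.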
Step 4: Build Views for Aggregations and Metrics
Once you've created your silver layer by cleansing and transforming your data, you can create your gold layer of views. These views will capture aggregations and metrics and will be ready for exploration, ad-hoc analysis, or dashboards.
How to Do It
Use SQL Functions: Aggregate and build out metrics using SQL Functions.
Use Dremio's AI Agent: Ask the built-in AI Agent to generate the SQL for your view. For example, you can ask the agent: "Give me the SQL for views that summarize the average response time by call center employees and the customer sentiment by region."
Aggregations and metrics are saved as governed views in your Open Catalog. For instructions on how to create views, see Create a View.
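As a rough sketch of what the gold-layer views from the example prompt might look like, the snippet below defines one view for average response time per call center employee and one for customer sentiment per region. It again uses Python's sqlite3 module as a local stand-in rather than Dremio; the support_calls table, its columns, the numeric sentiment_score, the view names, and the sample rows are hypothetical.

```python
import sqlite3

# Local in-memory database used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical cleaned (silver-layer) table; names and values are assumptions.
CREATE TABLE support_calls (
    employee TEXT, region TEXT, response_seconds REAL, sentiment_score REAL
);
INSERT INTO support_calls VALUES
    ('ana', 'EMEA', 30.0, 0.75),
    ('ana', 'EMEA', 50.0, 0.25),
    ('raj', 'APAC', 20.0, 0.5);

-- Gold-layer style view: average response time per employee.
CREATE VIEW avg_response_by_employee AS
SELECT employee, AVG(response_seconds) AS avg_response_seconds
FROM support_calls
GROUP BY employee;

-- Gold-layer style view: average sentiment and call volume per region.
CREATE VIEW sentiment_by_region AS
SELECT region, AVG(sentiment_score) AS avg_sentiment, COUNT(*) AS calls
FROM support_calls
GROUP BY region;
""")

by_employee = conn.execute(
    "SELECT employee, avg_response_seconds "
    "FROM avg_response_by_employee ORDER BY employee"
).fetchall()
by_region = conn.execute(
    "SELECT region, avg_sentiment, calls FROM sentiment_by_region ORDER BY region"
).fetchall()
print(by_employee)
print(by_region)
```

Splitting the two metrics into separate views keeps each one at a single grain (per employee, per region), which tends to make them easier to reuse in dashboards and ad-hoc questions.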
Step 5: Add Semantics to Views
Data only becomes valuable when everyone can interpret it in the same way. The AI Semantic Layer gives your datasets shared meaning, so when an analyst or AI Agent is looking at "fiscal Q2" or "positive sentiment", they're applying the same business logic every time.
How to Do It
Enrich Your Data with Semantics: Generate wikis and labels on your views to reduce the time spent on manual tasks. For more information on generating semantics, see Generate Wikis and Labels.
You can also add context such as usage notes, definitions specific to your industry, and common queries. These definitions and classifications are stored with the data, guiding natural language queries, SQL generation, and manual exploration.
Step 6: Deliver Insights
Now that you have connected, curated, aggregated, and enriched your data, you can deliver the outcome for the business question you defined in Step 1. The outcome may be the aggregated view you created in Step 4 that teams and agents will use directly, or it may be a dashboard that tracks metrics over time. With Dremio Cloud, you can deliver either one.
How to Do It
Use Dremio's AI Agent for Actionable Insights: Dremio's AI Agent can analyze patterns and trends directly from views. You and your users can ask the business question you identified in Step 1, along with other questions. The AI Agent will use the semantics and samples of the data to generate the appropriate SQL queries that provide you with insights and visualizations of the data. For example, on sales data, you can ask the AI Agent: "Create a chart to show the trends in sales across regions over the last year and provide an analysis on the changes."
Create a Dashboard Using Your Tool of Choice: If you want to update an existing dashboard or report, or create a new one to present the insights from your data, you can connect to Dremio from tools like Tableau, Microsoft Power BI, and others using Flight SQL JDBC/ODBC connections. For a list of supported tools and step-by-step instructions on connecting, see Connect Client Applications.
Step 7: Operationalize the Use Case
A use case is operationalized when it's governed, monitored, and shareable.
How to Do It
Access Control Policies: Create and implement access control policies, ranging from role-based access to more granular row-level and column-level policies. For more information, see Privileges and Row-Access and Column-Masking Policies.
Monitor Query Volumes and Performance: Track performance and usage of the data by monitoring queries and their performance in Dremio. Dremio's Autonomous Management capability automatically handles data management and ensures reliable and fast query performance.
Cost Management: Review the consumption and spend of this use case within the Dremio console. The usage dashboards show how much compute and storage each workload consumes, helping you plan budgets, optimize workloads, and estimate spend before moving to production. For more information, see Usage.
Operationalizing your first use case ensures it remains reliable, governed, and cost-effective. You gain insight into both performance and consumption trends, enabling you to scale confidently while maintaining control of your budget.
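To make the row-level and column-level policies mentioned above more concrete, here is a conceptual sketch in plain Python of what such policies do: a row-access rule decides which rows a user may see, and a column-masking rule decides how a sensitive value is shown. This is not Dremio's policy syntax or API; the User class, the role names, the region rule, and the masking format are all hypothetical. See Row-Access and Column-Masking Policies for how policies are actually defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user model for illustration only.
    name: str
    roles: frozenset
    region: str


def row_visible(user, row):
    # Row-access rule (assumed): admins see every row,
    # everyone else sees only rows for their own region.
    return "admin" in user.roles or row["region"] == user.region


def mask_email(user, email):
    # Column-masking rule (assumed): only users with the "pii_reader"
    # role see the raw value; others see a partially masked address.
    if "pii_reader" in user.roles:
        return email
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def query(user, rows):
    # Apply the row filter first, then mask the sensitive column.
    return [
        {**row, "email": mask_email(user, row["email"])}
        for row in rows
        if row_visible(user, row)
    ]


rows = [
    {"customer": "Acme", "region": "EMEA", "email": "ops@acme.example"},
    {"customer": "Globex", "region": "APAC", "email": "it@globex.example"},
]
analyst = User("dana", frozenset({"analyst"}), "EMEA")
admin = User("lee", frozenset({"admin", "pii_reader"}), "APAC")

analyst_view = query(analyst, rows)
admin_view = query(admin, rows)
print(analyst_view)
print(admin_view)
```

The same underlying rows produce different results for the two users, which is the point of such policies: one governed view can serve several audiences without maintaining separate copies of the data.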
Wrap Up and Next Steps
You've now implemented your first use case on Dremio Cloud by:
- Defining a valuable business use case
- Adding your own data by loading it into your Open Catalog or connecting existing data sources
- Cleaning and transforming the data
- Creating reusable views with semantics
- Delivering insights via AI or dashboards
- Operationalizing the data through governance and monitoring
Next, extend your use case with additional business questions or another business domain.
Related Topics
- Dremio MCP Server - Use Dremio's hosted MCP server to customize your agentic workflow.
- Visual Studio Code - Use the Visual Studio (VS) Code extension for Dremio for development and analysis.
- Optimize Performance - Learn how Dremio autonomously optimizes performance.