Wikis and Labels
Wikis and labels help users document, organize, and discover datasets within the Open Catalog. This page explains how to manage wikis and labels, as well as how Dremio’s Generative AI features can assist in generating wikis and labels for you.
Wikis
Wikis let you document and describe datasets within the Open Catalog, adding the context and details that other users need to understand the datasets you manage. The rich text editor supports GitHub Flavored Markdown, so you can format content for readability. Keeping documentation alongside the dataset makes it accessible and structured, which helps teams understand the datasets and how to work with them.
Manage Wikis
Ensure you have sufficient Role-Based Access Control (RBAC) privileges to view or edit wikis.
To view or edit the wiki for a dataset in the Dremio console:
- On the Datasets page, navigate to the folder where your dataset is stored.
- Hover over your dataset, and on the right-hand side, click the icon.
- Click Open Details Panel.
- You can edit the dataset wiki by clicking Edit Wiki, writing your wiki content, and clicking Save.
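The console steps above can also be scripted if your deployment exposes a REST endpoint for dataset wikis. The following is a minimal sketch, not a definitive client: the endpoint path, the `text` and `version` field names, the bearer-token header, the base URL, and the dataset ID are all assumptions to check against the API reference for your deployment. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_wiki_request(base_url, dataset_id, text, token, version=None):
    """Build (without sending) a request that sets a dataset's wiki text."""
    # Assumed endpoint shape; confirm it in the API reference for your deployment.
    url = f"{base_url.rstrip('/')}/catalog/{dataset_id}/collaboration/wiki"
    payload = {"text": text}
    if version is not None:
        # Assumed optimistic-concurrency field used when updating an existing wiki.
        payload["version"] = version
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_wiki_request(
    "https://dremio.example.com/api/v3",
    "1234-abcd",
    "## Customers\n\nOne row per customer. Contains PII.",
    token="YOUR_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`. As in the console, the call would be expected to succeed only for a user with sufficient privileges on the dataset.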
Labels
Labels help you organize and retrieve datasets within a data catalog. By creating and assigning labels to datasets, users can search and filter through large collections of related datasets. Labels also improve search: clicking a label starts a search that returns all datasets linked to that label, which makes it faster to find relevant data.
The following image shows a dataset in the catalog with several labels and a brief wiki. In this example, the label "pii-data" was used in the search field to narrow the results to a customer dataset that contains Personally Identifiable Information (PII).
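Conceptually, searching by a label returns every dataset that carries that label. The sketch below illustrates the idea with an in-memory list. It is not Dremio's search implementation, and the `Dataset` class, the dataset names, and the exact-match rule are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a catalog entry."""
    name: str
    labels: set = field(default_factory=set)


def with_label(datasets, label):
    """Return the datasets that carry the given label (exact match assumed)."""
    return [d for d in datasets if label in d.labels]


# Hypothetical catalog contents for illustration only.
catalog = [
    Dataset("customers", {"pii-data", "sales"}),
    Dataset("orders", {"sales"}),
    Dataset("web_logs"),
]

for dataset in with_label(catalog, "pii-data"):
    print(dataset.name)
```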
Manage Labels
Ensure you have sufficient Role-Based Access Control (RBAC) privileges to view or edit labels.
To view or edit the labels for a dataset in the Dremio console:
- On the Datasets page, navigate to the folder where your dataset is stored.
- Hover over your dataset, and on the right-hand side, click the icon.
- Click Open Details Panel.
- You can add a label by clicking on the icon, typing a label name (e.g., PII), and pressing Enter.
Generate Labels and Wikis (Preview)
To help eliminate the need for manual profiling and cataloging, you can use Generative AI to generate labels and wikis for your datasets.
If you haven't opted into the Generative AI features, see Dremio Preferences for the steps to enable them.
Generate Labels
To generate a label, Generative AI bases its understanding on your schema, taking into account labels that were previously generated and labels that other users have created.
To generate labels:
- Navigate to either the Details page or Details Panel of a dataset.
- In the Dataset Overview on the right, click the icon to generate labels.
- In the Generating labels dialog, review the labels generated for the dataset and decide which to save. If multiple labels have been generated, you can save some, all, or none of them. To remove a label, click the x on the label.
- Complete one of the following actions:
  - If these are the only labels for your dataset, click Save.
  - If you already have labels for the dataset and want to add these generated labels, click Append.
  - If you already have labels for the dataset and want to replace them with these generated labels, click Overwrite.

The labels for the dataset will appear in the Dataset Overview.
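The difference between Save, Append, and Overwrite can be sketched as a small function over label lists. This is an illustrative model of the behavior described above, not Dremio's implementation. The function name, the lowercase action strings, and the de-duplication of repeated labels are assumptions.

```python
def apply_generated_labels(existing, generated, action):
    """Return the dataset's labels after applying the chosen dialog action."""
    if action == "append":
        # Keep the current labels and add the generated ones after them.
        combined = [*existing, *generated]
    elif action in ("save", "overwrite"):
        # Save applies when the dataset has no labels yet; Overwrite discards
        # the current labels. Either way only the generated labels remain.
        combined = list(generated)
    else:
        raise ValueError(f"unknown action: {action!r}")
    # Assumed: a label appears at most once; first occurrence wins.
    return list(dict.fromkeys(combined))


# Hypothetical labels for illustration only.
current = ["pii-data"]
suggested = ["customers", "pii-data"]
print(apply_generated_labels(current, suggested, "append"))
print(apply_generated_labels(current, suggested, "overwrite"))
```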
Generate Wikis
To generate a wiki, Generative AI uses your schema and data to determine how the columns within the dataset relate to each other and to the dataset as a whole, and then produces a description of the dataset based on that understanding.
You can generate wikis only if you are the dataset owner or have ALTER privileges on the dataset.
To generate a wiki:
- Navigate to either the Details page or Details Panel of a dataset.
- In the Wiki section, click Generate wiki. A dialog opens, and a preview of the wiki content is generated on the right of the dialog. If you would like to regenerate the preview, click the icon.
- Click the icon to copy the generated wiki content on the right of the dialog.
- Click within the text box on the left and paste the wiki content.
- (Optional) Use the toolbar to make edits to the wiki content. If you would like to regenerate, click the icon in the toolbar to regenerate the wiki content in the preview.
- Click Save.
The wiki for the dataset will appear in the Wiki section.
Related Topics
- Search for Dremio Objects and Entities - Explore Dremio's semantic search capabilities.
- Data Privacy - Learn more about Dremio's data privacy practices.