Managing Your Data
The Datasets page provides several components for managing your data. The largest is the Data panel, which you use to explore the spaces and sources in your data catalog, as shown in this image:
|1||By default, you have a home space that you can further organize by creating a hierarchy of folders, and you can create additional spaces.|
|2||A space is a directory in which views are saved. Spaces provide a way to group datasets by common themes such as a project, purpose, department, or geographic region.|
|3||A source is a data lake or external source (such as a relational database) that you can connect to Dremio Sonar. Sources include Nessie repositories, data lakes, and external sources.|
|4||The title indicates that the Samples data lake is open and lists the contents of the sample source. A source also consists of layers, so if you expand a data source, you will find datasets and data types within the datasets.|
|5||The datasets in the Samples data lake are listed here. A dataset is a collection of data. Datasets stored in files can be in many different formats; to run SQL queries against them, you can create tables and views. A table contains the data from your source, formatted as rows and columns. A view is a virtual table, created by running SQL statements or functions on a table or another view.|
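The difference between a table and a view can be sketched with a small, generic SQL example. The snippet below is only an illustration: it uses Python's built-in SQLite module as a stand-in rather than Dremio Sonar, and the names `trips`, `fares_by_city`, and `high_fare_cities` are hypothetical. It shows that a table holds rows, that a view stores only a query, and that a view can be built on another view.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for illustration;
# the table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A table holds the data itself, formatted as rows and columns.
cur.execute("CREATE TABLE trips (city TEXT, fare REAL)")
cur.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Boston", 12.5), ("Boston", 7.5), ("Denver", 20.0)],
)

# A view stores only the query; it is evaluated each time it is read.
cur.execute(
    "CREATE VIEW fares_by_city AS "
    "SELECT city, SUM(fare) AS total_fare FROM trips GROUP BY city"
)

# A view can also be defined on another view.
cur.execute(
    "CREATE VIEW high_fare_cities AS "
    "SELECT city FROM fares_by_city WHERE total_fare >= 20"
)

before = cur.execute("SELECT city FROM high_fare_cities ORDER BY city").fetchall()

# New rows in the table appear in both views without recreating them.
cur.execute("INSERT INTO trips VALUES ('Austin', 25.0)")
after = cur.execute("SELECT city FROM high_fare_cities ORDER BY city").fetchall()

print(before)
print(after)
```

Because the views hold no data of their own, the second query picks up the new row automatically. SQL syntax and behavior in Dremio may differ in detail from SQLite.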
Adding Data Objects
In the SQL Runner, you can add data objects directly from the Data panel to the SQL editor.
To add a data object, locate the object that you want to use in your query in the Data panel. Then click the + button or drag and drop the object into the SQL editor.
Starring Data Objects
When using the SQL Runner on the Datasets page, you can star spaces, sources, datasets, and other objects in your data catalog, which adds the object to your Starred list for easier access. The Starred list can hold up to 25 entities at a time, and each starred item remains on the list even if you open a new browser or clear the cache.
To star a data object:
- In the Data panel, locate the data object that you want to star. In this example, a folder in a source is being starred.
- Click the (Star) icon that appears next to the data object. The data object will appear on your Starred list.
To unstar a data object, click the Star icon again.
- The starring option is not available for datasets in the Nessie repository.
- Starring is different from pinning. You can pin only spaces and sources on the Datasets page, and pinned items are not saved if you open a new browser or clear the cache.
The Scratch Directory
Scratch is a directory in your Amazon S3 bucket in which tables are saved. Scratch lets you create tables that are stored as Parquet files, which can then be formatted as tables. For more information, see Scratch Directory.