Create a Table & Add Dataset Info

For this module, you will be working with the NYC-taxi-trips.csv file (330+ million rows) in an Amazon S3 bucket.

You will be creating a folder and a table in your catalog. In each of the steps, replace catalog_name with the name of the catalog you created in the previous step.

Step 1. Create a Folder

To create a folder where the table you will create will reside:

Click the SQL Runner icon in the side navigation bar.
Copy and paste the SQL below and replace catalog_name with the name of your catalog. Click Run.
Create folder in catalog
```
CREATE FOLDER "catalog_name"."my_folder";
```

Step 2. Create a Table

To create a table in your catalog:

Click the SQL Runner icon in the side navigation bar.

Copy and paste the SQL below and click Run.

Create table in folder

CREATE TABLE "catalog_name"."my_folder"."nyc_trips" (
  pickup_datetime TIMESTAMP,
  passenger_count INT,
  trip_distance_mi FLOAT,
  fare_amount FLOAT,
  tip_amount FLOAT,
  total_amount FLOAT
);

To populate the nyc_trips table with sample data, run the SQL below.

Populate table with data

COPY INTO "catalog_name"."my_folder"."nyc_trips"
FROM '@Samples/samples.dremio.com/' FILES('NYC-taxi-trips.csv');

To query the populated data, run the SQL below.
Query table
```
SELECT * FROM "catalog_name"."my_folder"."nyc_trips";
```

Now that you've created your first table in the catalog, you can accelerate your queries on this table using Reflections.

Step 3. Add Dataset Info to Enhance Discoverability

For users who will be querying and building on the table that you created, add a markdown description and label to help them understand the data.

Browse to the Datasets page and click on my_folder in the upper left corner.
In my_folder, hover over the nyc_trips table that you created and click on the right. Dremio takes you to the Edit Dataset page for nyc_trips, which is indicated by the name and path in the top left.
In the Details tab, on the Dataset Overview panel on the right, click to add a label that will help users identify the dataset, type public-data, and press Enter.

In the Details tab, in the Wiki section, click Edit wiki to add markdown to describe the table and provide helpful content for the user to get started with the data (i.e., examples, usage notes, point of contact for questions or issues). Copy and paste the example below or write your own and click Save.

Example wiki

This table contains taxi trips data from NYC. It can be used to determine the taxi trends i.e. frequency of trips, distance traveled, and average fares or tips.

Questions? Contact gnarly@dremio.com

### Columns
`pickup_datetime`: Time at which the passenger was picked up.
`passenger_count`: The number of passengers that were picked up for the taxi ride.
`trip_distance_mi`: The total distance of the trip from pickup to drop off locations in miles.
* Max: 351.0
`fare_amount`: The dollar amount that the ride cost.
* Max: 158995.81
`tip_amount`: The dollar amount that the customer tipped in addition to the fare_amount.
* Max: 888.19
`total_amount`: The total dollar amount that the customer paid (fare + tip).

### Examples
`SELECT  DATE_PART('DAY', "catalog_name"."my_folder"."nyc_trips"."pickup_datetime") AS "pickup_day",
      SUM("catalog_name"."my_folder"."nyc_trips"."passenger_count") AS "passenger_count"
FROM    "catalog_name"."my_folder"."nyc_trips"
GROUP BY pickup_day
ORDER BY pickup_day;`

Now your users can easily understand and query the my_folder.nyc_trips table you created.

Step 1. Create a Folder​

Step 2. Create a Table​

Step 3. Add Dataset Info to Enhance Discoverability​

Next: Accelerate Queries Using Reflections​

Step 1. Create a Folder

Step 2. Create a Table

Step 3. Add Dataset Info to Enhance Discoverability

Next: Accelerate Queries Using Reflections