Data Types

Supported Data Types

Data Type UI Symbol
BOOLEAN
VARBINARY
DATE
FLOAT
DECIMAL*
DOUBLE
INTERVAL (day to seconds)
INTERVAL (years to months)
INT
BIGINT
TIME
TIMESTAMP
VARCHAR**
MAP
LIST

* Partially supported
** VARCHAR type only supports UTF-8 encoded values

Decimal Support

As of Dremio 4.0, decimal to decimal mappings are supported for relational database sources. That is, when creating a new relational database data source, the default is to map decimal to decimal. If you do not want the default decimal-to-decimal mapping, you can enable decimal-to-double mappings via the data source Advanced Options tab.

[info] Upgrade Impact

When you upgrade a relational data source to Dremio 4.0 with decimal usage, the existing (pre-4.0) behaviour of mapping decimals to doubles is retained.

See Installing and Upgrade for more information.

Dremio now fully supports Decimal data type for Parquet, Hive(Parquet/ORC) sources. Text/Json sources can cast strings/integers to decimals for decimal precision and scale. Other sources treat decimals as doubles same as prior to 3.3.5.

Return Type Inference

The return type inference provides the return type for functions and operators that natively handle decimal types. Functions and operators that downcast instead return the data type in the existing function (double/float/int).

Dremio supports the following return type inference rules:

Arithmetic Operators
Function/Operator Native Decimal Support
+ (Addition)
- (Subtraction)
s = max(s1, s2)
p = max(s1, s2) + max(p1 - s1, p2 - s2) + 1

if (p > 38) { minS = min(s, 6) s = s - (p - 38) s = max(minS, s) P = 38 }
* (Multiplication) s = s1 + s2
p = p1 + p2 + 1

if (p > 38) { minS = min(s, 6) s = s - (p - 38) s = max(minS, s) p = 38 }
/ (Division) s = max(6, s1 + p2 + 1)
p = p1 -s1 + s2 + s

if (p > 38) { minS = min(s, 6) s = s - (p - 38) s = max(minS, s) p = 38 }
% (Modulus) s = max (s1, s2)
p = min (p1 - s1, p2 - s2) + s

if (p > 38) { minS = min(s, 6) s = s - (p - 38) s = max(minS, s) p = 38 }
Mixed Types
Function/Operator Native Decimal Support
Mixed Types s = max(s1, s2)
p = max(s1, s2) + max(p1 - s1, p2 - s2)

if (p > 38) { Error out and reject. }
SQL Clauses
Function/Operator Native Decimal Support
ORDER BY Same as input precision and scale
GROUP BY Same as input precision and scale
Math Functions
Function/Operator Native Decimal Support
ABS() Same as input precision and scale
CEILING() Same as input precision and scale
FLOOR() Same as input precision and scale
MOD() Same as Modulus above
ROUND() Same as input precision and scale
TRUNC() Same as input precision and scale
Aggregate Functions
Function/Operator Native Decimal Support
AVG() Decimal(38,s) / Decimal(18,0)
MAX() Same as input precision and scale
MIN() Same as input precision and scale
SUM() Decimal(38, input Scale)
Analytic Functions
Function/Operator Native Decimal Support
LAG() Same as input precision and scale
LEAD() Same as input precision and scale
Type Conversion Functions
Function/Operator Native Decimal Support
CAST() Provided in cast type
CONVERT_FROM()) Provided in cast type
CONVERT_TO() Provided in cast type
Set Operators
Function/Operator Native Decimal Support
UNION() TBD
UNION ALL() TBD
INTERSET() TBD
MINUS() TBD

Mixed Decimal Types

The following decimal union types specifies when mixed decimal types are supported:

  • Parquet -- Mixed decimal types in a data set are not supported. For example, the same column cannot have differing precision/scale across files in the same dataset.
  • ORC -- Mixed decimal types in a data set are supported.

Decimal Limitations

The following are decimal limitations:

  • Decimal numeric literals in SQL queries cannot be bigger than maximum possible long value.
  • Queries that do arithmetic on a column and literal have issues.

    For example, select cast(12345 as double) * CAST(A as double) fails. The work around is to use a string value like select cast('12345' as double) * CAST(A as double).

  • Queries casting numeric literals to decimal should use specific precision. Precision of literal cannot be lowered.

    For example, cast(123.23 as decimal(2,0)) returns the same number as output since given number cannot be represented using precision 2.

  • In general, decimal overflow is not handled; output when decimal arithmetic overflows is undefined. This is the same as other data types in Dremio.

Timestamp Queries

When Dremio retrieves the timestamp value entry in a table, the returned timestamp entry is displayed as it was written to the file system. That is, Dremio does not apply any conversions to the timestamp entry.

For example:

  • If the timestamp value entry is written to the file system in UTC, then Dremio displays the entry in UTC.
  • With Hive, all timestamp value entries are converted and stored as UTC in the file system.
    Subsequently, when Dremio returns the timestamp value entry, it is displayed in UTC.
  • With some other application, if the timestamp value entry is written and stored as PST (local time) in the file system, then when Dremio returns the timestamp value entry, it is displayed in PST.

Source Data Type Mappings

The following topics provide Dremio data type mappings to source database types.


results matching ""

    No results matching ""