Data Types
Supported Data Types
Data Type | UI Symbol |
---|---|
BOOLEAN | |
VARBINARY | |
DATE | |
FLOAT | |
DECIMAL* | |
DOUBLE | |
INTERVAL (day to seconds) | |
INTERVAL (years to months) | |
INT | |
BIGINT | |
TIME | |
TIMESTAMP | |
VARCHAR** | |
MAP | |
LIST |
* Partially supported
** VARCHAR
type only supports UTF-8 encoded values
Decimal Support
As of Dremio 4.0, decimal to decimal mappings are supported for relational database sources. That is, when creating a new relational database data source, the default is to map decimal to decimal. If you do not want the default decimal-to-decimal mapping, you can enable decimal-to-double mappings via the data source Advanced Options tab.
[info] Upgrade Impact
When you upgrade a relational data source to Dremio 4.0 with decimal usage, the existing (pre-4.0) behaviour of mapping decimals to doubles is retained.
See Installing and Upgrade for more information.
Dremio now fully supports Decimal data type for Parquet, Hive(Parquet/ORC) sources. Text/Json sources can cast strings/integers to decimals for decimal precision and scale. Other sources treat decimals as doubles same as prior to 3.3.5.
Return Type Inference
The return type inference provides the return type for functions and operators that natively handle decimal types. Functions and operators that downcast instead return the data type in the existing function (double/float/int).
Dremio supports the following return type inference rules:
Arithmetic Operators
Function/Operator | Native Decimal Support |
---|---|
+ (Addition) - (Subtraction) |
s = max(s1, s2) p = max(s1, s2) + max(p1 - s1, p2 - s2) + 1 if (p > 38) {
minS = min(s, 6)
s = s - (p - 38)
s = max(minS, s)
P = 38
}
|
* (Multiplication) |
s = s1 + s2 p = p1 + p2 + 1 if (p > 38) {
minS = min(s, 6)
s = s - (p - 38)
s = max(minS, s)
p = 38
}
|
/ (Division) | s = max(6, s1 + p2 + 1) p = p1 -s1 + s2 + s if (p > 38) {
minS = min(s, 6)
s = s - (p - 38)
s = max(minS, s)
p = 38
}
|
% (Modulus) | s = max (s1, s2) p = min (p1 - s1, p2 - s2) + s if (p > 38) {
minS = min(s, 6)
s = s - (p - 38)
s = max(minS, s)
p = 38
}
|
Mixed Types
Function/Operator | Native Decimal Support |
---|---|
Mixed Types | s = max(s1, s2) p = max(s1, s2) + max(p1 - s1, p2 - s2) if (p > 38) {
Error out and reject.
} |
SQL Clauses
Function/Operator | Native Decimal Support |
---|---|
ORDER BY | Same as input precision and scale |
GROUP BY | Same as input precision and scale |
Math Functions
Function/Operator | Native Decimal Support |
---|---|
ABS() | Same as input precision and scale |
CEILING() | Same as input precision and scale |
FLOOR() | Same as input precision and scale |
MOD() | Same as Modulus above |
ROUND() | Same as input precision and scale |
TRUNC() | Same as input precision and scale |
Aggregate Functions
Function/Operator | Native Decimal Support |
---|---|
AVG() | Decimal(38,s) / Decimal(18,0) |
MAX() | Same as input precision and scale |
MIN() | Same as input precision and scale |
SUM() | Decimal(38, input Scale) |
Analytic Functions
Function/Operator | Native Decimal Support |
---|---|
LAG() | Same as input precision and scale |
LEAD() | Same as input precision and scale |
Type Conversion Functions
Function/Operator | Native Decimal Support |
---|---|
CAST() | Provided in cast type |
CONVERT_FROM()) | Provided in cast type |
CONVERT_TO() | Provided in cast type |
Set Operators
Function/Operator | Native Decimal Support |
---|---|
UNION() | TBD |
UNION ALL() | TBD |
INTERSET() | TBD |
MINUS() | TBD |
Mixed Decimal Types
The following decimal union types specifies when mixed decimal types are supported:
- Parquet -- Mixed decimal types in a data set are not supported. For example, the same column cannot have differing precision/scale across files in the same dataset.
- ORC -- Mixed decimal types in a data set are supported.
Decimal Limitations
The following are decimal limitations:
- Decimal numeric literals in SQL queries cannot be bigger than maximum possible long value.
Queries that do arithmetic on a column and literal have issues.
For example,
select cast(12345 as double) * CAST(A as double)
fails. The work around is to use a string value likeselect cast('12345' as double) * CAST(A as double)
.Queries casting numeric literals to decimal should use specific precision. Precision of literal cannot be lowered.
For example,
cast(123.23 as decimal(2,0))
returns the same number as output since given number cannot be represented using precision 2.In general, decimal overflow is not handled; output when decimal arithmetic overflows is undefined. This is the same as other data types in Dremio.
Timestamp Queries
When Dremio retrieves the timestamp value entry in a table, the returned timestamp entry is displayed as it was written to the file system. That is, Dremio does not apply any conversions to the timestamp entry.
For example:
- If the timestamp value entry is written to the file system in UTC, then Dremio displays the entry in UTC.
- With Hive, all timestamp value entries are converted and stored as UTC in the file system.
Subsequently, when Dremio returns the timestamp value entry, it is displayed in UTC. - With some other application, if the timestamp value entry is written and stored as PST (local time) in the file system, then when Dremio returns the timestamp value entry, it is displayed in PST.
Source Data Type Mappings
The following topics provide Dremio data type mappings to source database types.