Percentile Functions

Dremio can compute k-th percentile on a column by using the PERCENTILE_DISC or PERCENTILE_CONT SQL functions.

Note:

This function may only be used with numeric data types in Dremio.

Percentile Functions

Function Direct Argument Type Aggregated Argument Type Return Type Description
PERCENTILE_CONT double precision double precision double precision Continuous Percentile: returns a value corresponding to the specified fraction in the ordering, interpolatring between adjacent input items if needed.
PERCENTILE_DISC double precision double precision double precision Discrete Percentile: returns the first input value whose position in the ordering equals or exceeds the specified fraction.

PERCENTILE_DISC

This function computes a specific percentile for sorted values in a column. For each percentile value, PERCENTILE_DISC sorts the values using the ORDER BY clause. The function then returns the value with the smallest CUME_DIST value given that is greater or equal to the percentile value.

For example, PERCENTILE_DISC (0.5) computes the 50th percentile (i.e., the median) of an expression. The result is equal to a specific column value.

This information was originally provided via Microsoft’s Transact-SQL Reference Guide.

Syntax

PERCENTILE_DISC ( fraction )   
    WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] )

PERCENTILE_CONT

This function computes a percentile based on a continuous distribution of the column value. The result is interpolated and thus may not be equal to any of the actual values in the column.

This information was originally provided via Microsoft’s Transact-SQL Reference Guide.

Syntax

PERCENTILE_CONT ( fraction )   
    WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] )

Arguments

fraction

The fraction/percentile value to compute. The value for this must be a numeric literal in the range of 0 to 1 inclusive and represents a percentage.

WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC )

Specifies the values to sort and compute the percentile. You can only provide one expression in the ORDER BY clause. By default, the sort order is ascending (ASC).

Additional Information

Null values in the data set are ignored. However any null values in the command (null fraction or null order_by_expression) will throw an exception in Dremio.