
I'm working with DuckDB and have several client-provided SQL expressions that use DECIMAL(38,10) columns (fixed precision with 10 digits after the decimal point).

For example:

SELECT S1__AMOUNT * S1__PRICE * S1__UNITS * 1000

All columns like S1__AMOUNT, S1__PRICE, etc. are DECIMAL(38,10).

When multiplying several of these columns (especially three or more) and then multiplying by a constant (e.g. * 1000), I get:

duckdb.duckdb.OutOfRangeException: Out of Range Error: 
Overflow in multiplication of DECIMAL(38) (194586756000000000000000000000000000 * 1000).
You might want to add an explicit cast to a decimal with a smaller scale.
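As far as I understand, the scale grows with every multiplication until almost no integral digits are left. A quick check with typeof() seems to confirm this (results from a recent DuckDB version; a sketch, not authoritative):

import duckdb

# Each DECIMAL multiply adds the operands' scales (10 + 10, then + 10),
# while the width stays capped at 38, squeezing out integral digits
duckdb.sql("""
SELECT typeof(a * a)     AS two_factors,   -- DECIMAL(38,20)
       typeof(a * a * a) AS three_factors  -- DECIMAL(38,30)
FROM (SELECT CAST(1 AS DECIMAL(38,10)) AS a)
""").show()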

Limitations:

I cannot cast to DOUBLE, but after each operation it is allowed to cast the result back to DECIMAL(38,10).

I know that I can manually rewrite SQL clauses to cast after each step like:

CAST(
  CAST(
    CAST(S1__AMOUNT * S1__PRICE AS DECIMAL(38,10))
    * S1__UNITS AS DECIMAL(38,10)
  )
  * 1000 AS DECIMAL(38,10)
)

But since the SQL expressions are written by the client and can be fairly arbitrary, I do not want to use this clumsy and bug-prone approach.

The question:

Is there any way to configure DuckDB so that it:

  • Automatically reduces intermediate decimal scale/precision if needed to fit inside DECIMAL(38,10), without throwing an OutOfRangeException, or
  • Automatically casts intermediate arithmetic results back to a safe DECIMAL(38,10), or
  • Provides an expression/function to safely multiply with overflow-safe decimal promotion?

If not, is the only reliable approach to rewrite all expressions and insert explicit casts after every multiplication/division?
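If rewriting turns out to be the only option, I could probably do it mechanically rather than by hand. Here is a sketch of that idea using the third-party sqlglot parser (nothing DuckDB-specific; the helper name is my own), though I would still prefer a native DuckDB setting:

import sqlglot
from sqlglot import exp

def cast_each_step(sql: str) -> str:
    """Wrap every * and / in CAST(... AS DECIMAL(38,10)), innermost first."""
    tree = sqlglot.parse_one(sql, read="duckdb")

    def add_cast(node):
        # transform() visits nodes bottom-up, so inner products get wrapped first
        if isinstance(node, (exp.Mul, exp.Div)):
            return exp.Cast(this=node, to=exp.DataType.build("DECIMAL(38,10)"))
        return node

    return tree.transform(add_cast).sql(dialect="duckdb")

print(cast_each_step("SELECT S1__AMOUNT * S1__PRICE * S1__UNITS * 1000"))
# SELECT CAST(CAST(CAST(S1__AMOUNT * S1__PRICE AS DECIMAL(38, 10))
#   * S1__UNITS AS DECIMAL(38, 10)) * 1000 AS DECIMAL(38, 10))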

Exact code to reproduce the problem:

import duckdb
import polars as pl

# A one-row frame; every column is cast to Decimal with scale 10,
# which DuckDB sees as DECIMAL(38,10)
df = pl.DataFrame({"AMOUNT": 8760, "PRICE": 22.2131, "RATE": 1})
df = df.cast(pl.Decimal(scale=10))

# Raises duckdb.duckdb.OutOfRangeException: Out of Range Error:
# Overflow in multiplication of DECIMAL(38)
result = duckdb.sql("""
FROM df
SELECT AMOUNT * PRICE * RATE * 1000
""").pl()

print(result)
  • How about fixing the real problem: too big a data value / too small a scale? A double is smaller than a decimal; it only seems to work because it starts discarding the least significant digits. The value you're using, 194586756000000000000000000000000000, has 36 digits, while the type you specified accepts 28. Had you used e.g. decimal(38) you'd have no problem with that, but * 1000 would still exceed it. Commented Oct 29 at 11:34
  • Actually the value before multiplying by 1000 is 194586.75600000000000000000000… and has type decimal(38,10); DuckDB prints it the wrong way in the exception (without the decimal separator). Commented Oct 29 at 11:42
  • Edit the question and add enough code so others can reproduce the problem, without having access to your database. Commented Oct 29 at 11:59
  • As your example already used .pl(), I just inlined an input Polars dataframe instead of a parquet file. It produces the same exception for me, but you may want to double-check and amend if necessary. Commented Oct 29 at 12:39
  • This is the behaviour required by the SQL standard (specifically ISO/IEC 9075-2:2023 section 6.30 <numeric value expression>, syntax rule 1(c)(iii): "The precision of the result of multiplication is implementation-defined (IV136), and the scale is S1 + S2."). Commented Nov 1 at 11:44

1 Answer

As originally posted, the question didn't show anything that reproduces the problem without access to the poster's computer and data.

This does, though the reason isn't obvious from the error message. This is a DuckDB quirk that doesn't happen in other databases:

create table x1(amount decimal(38,10), price decimal(38,10), units decimal(38,10));
insert into x1 values (194586.7560, 1000, 1000);

-- fails: Out of Range Error: overflow in multiplication of DECIMAL(38)
select *, amount*price*units as total from x1;

This works though:

create or replace table x1(amount decimal(38,8), price decimal(38,8), units decimal(38,8));
insert into x1 values (194586.7560, 1000, 1000);

select *, amount*price*units as total from x1;

BUT total's type is DECIMAL(38,24).

In math, multiplying two decimals produces a new one whose fractional digit count is the sum of the originals' fractional digits, so multiplying three DECIMAL(x,8) values produces a DECIMAL(x,24). The decimal type must maintain those fractional digits. With the original DECIMAL(38,10) columns the product has to keep 30 fractional digits, and DECIMAL(38,30) leaves only 8 integral digits, not enough to hold the 12-digit integral part 194586756000.

SQL is essentially a strongly typed language, and the actual result types are determined when the SQL text is compiled into an execution plan. At that point, all the database knows is that it has to maintain 30 fractional digits; it hasn't seen a single value yet.
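You can ask the planner directly, without touching any data, with typeof(). A small check (the table from the snippet above, re-created here so the block runs standalone):

import duckdb

con = duckdb.connect()  # throwaway in-memory database
con.sql("create table x1(amount decimal(38,8), price decimal(38,8), units decimal(38,8))")
con.sql("insert into x1 values (194586.7560, 1000, 1000)")

# The result type is fixed at bind time: scale 8 + 8 + 8 = 24, width capped at 38
con.sql("select typeof(amount * price * units) as total_type from x1").show()
# total_type = DECIMAL(38,24)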

This can be fixed by using sensible types for each column. The price may be in Bitcoin, but does units really need 8 or 10 decimals? Does amount really need DECIMAL(38,10)?
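For example, with narrower per-column types the combined scale stays small and the same values multiply fine. The widths below are made up; treat them as placeholders for whatever actually fits your data:

import duckdb

con = duckdb.connect()
# Hypothetical widths: the combined scale is only 4 + 6 + 0 = 10
con.sql("create table x2(amount decimal(18,4), price decimal(18,6), units decimal(9,0))")
con.sql("insert into x2 values (194586.7560, 1000, 1000)")

con.sql("""
select typeof(amount * price * units) as t,
       amount * price * units         as total
from x2
""").show()
# t = DECIMAL(38,10), total = 194586756000.0000000000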

Casting to DOUBLE is not a solution, because it can introduce rounding errors even in the integral digits.

This:

select format('{:f}',cast(194586756111111111111111111111111111 as double));

due to rounding in the DOUBLE representation, produces this:

194586756111111117667940568708153344.000000

Other databases deal with this differently; SQL Server, for example, will return a DECIMAL(38,6) result. This:

exec sp_describe_first_result_set N'select *,amount *price *units *1000  as total  from #x1;'

returns

name      system_type_name
amount    decimal(38,10)
price     decimal(38,10)
units     decimal(38,10)
total     decimal(38,6)

4 Comments

"In math multiplying two decimals produces a new one whose fractional digit count is the sum of the originals' fractional digits.", formally, it is because that is what the SQL standard requires it for exact numeric multiplication (ISO/IEC 905-2:2023 section 6.30); sure it is grounded in common rules in math, but the standard could have applied a different rule if they had wanted to.
BTW, this isn't a quirk of DuckDB. The problem is that the resulting number exceeds the storage of DECIMAL(38, 30). For example with Firebird you get the same: dbfiddle.uk/Ed8wIBAL (numeric overflow).
So that's a quirk of Firebird too. I already showed that this doesn't happen in SQL Server. As for the SQL standard... those quirks are legendary, and the standard almost always trails the actual implementations. At one point IBM even tried to force 5=x because it was easier for their parser. No database follows the standard beyond the basic level, and DuckDB... let's say that unnest in SELECT is very, very quirky.
No, I really mean it is not a quirk. Your example calculation by the standard must yield a DECIMAL(p, 30), where p is implementation-defined. In case of both DuckDB and Firebird, p = 38 (the maximum precision available for exact numerics in those database systems). The values you use in your example produce a result that is too big to fit, and thus results in a numeric overflow. That is entirely logical behaviour within those constraints, and thus not a "quirk".
