Skip to content

fix: coerce operand types in Interval mul/div/intersect/union/contains#22027

Open
adriangb wants to merge 2 commits intoapache:mainfrom
adriangb:fix-interval-decimal-mismatch
Open

fix: coerce operand types in Interval mul/div/intersect/union/contains#22027
adriangb wants to merge 2 commits intoapache:mainfrom
adriangb:fix-interval-decimal-mismatch

Conversation

@adriangb
Copy link
Copy Markdown
Contributor

@adriangb adriangb commented May 5, 2026

Which issue does this PR close?

  • Closes #.

Rationale for this change

Interval::mul, Interval::div, Interval::intersect, Interval::union, and Interval::contains all asserted that both operands had identical data types. This causes internal errors during interval propagation for ordinary SQL queries that mix Decimal128 precisions/scales — for example dividing numeric (which DataFusion maps to Decimal128(38, 10)) by a BIGINT column or count(*) (coerced to Decimal128(20, 0)):

Internal error: Assertion failed: dt.clone() == rhs_type.clone()
  (left: Decimal128(38, 10), right: Decimal128(20, 0)):
  Intervals must have the same data type for division

Interval::add and Interval::sub already used BinaryTypeCoercer to find a common arithmetic type; this PR brings mul and div in line. Once mul/div coerce, the result of an arithmetic op fed into intersect (e.g. by the CP solver in cp_solver.rs) may have a different type than the child interval, so intersect/union/contains are also relaxed to coerce via comparison_coercion (matching the existing pattern in Interval::contains_value).

Reproducer (datafusion-cli, before this PR)

CREATE TABLE t(c1 BIGINT) AS VALUES (1::bigint), (2::bigint), (10::bigint);

SELECT
  c1,
  CASE WHEN c1 = 0 THEN 100.0
       ELSE ROUND((1.0 - (c1::numeric / c1)) * 100, 2)
  END AS rate
FROM t
WHERE (1.0 - (c1::numeric / c1)) * 100 < 95.0;

fails with the internal-error message above. After this PR the query returns rows.

What changes are included in this PR?

  • Replace the type-equality asserts in Interval::mul and Interval::div with a new coerce_operands helper that uses BinaryTypeCoercer::get_result_type and casts both intervals to the common type when they differ.
  • Replace the type-equality asserts in Interval::intersect, Interval::union, and Interval::contains with a new coerce_for_comparison helper that uses comparison_coercion and casts both intervals to the common type when they differ.
  • Update the doc comments on the affected methods to document the new coercion behavior.

The same-type fast path is preserved (no allocation/cast when both intervals already share a type), so this should be a no-op for existing call sites.

Are these changes tested?

Yes:

  • New unit test test_mul_div_mismatched_operand_types in interval_arithmetic.rs exercising Decimal128(38, 10) / Decimal128(20, 0) and Decimal128 × Int64 for both mul and div.
  • New regression query in decimal.slt covering the numeric / bigint shape from the reproducer above through the optimizer's interval-propagation path. Without this fix the SLT query fails with an internal error; with it, it returns rows.
  • Existing datafusion-expr-common, datafusion-physical-expr, datafusion-physical-plan, and full sqllogictests suites all pass.

Are there any user-facing changes?

Queries that previously hit Internal error: Intervals must have the same data type ... (or the equivalent for intersect/union/contains) during interval/stats propagation will now succeed. No public API changes — these methods kept their signatures; the documented preconditions are relaxed.

🤖 Generated with Claude Code

@adriangb
Copy link
Copy Markdown
Contributor Author

adriangb commented May 5, 2026

@pepijnve @neilconway any interest in reviewing this fix?

Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adriangb
I did not find any blocking issues. I left one small suggestion around expanding the direct interval arithmetic coverage.

/// If the two intervals have different data types, both are coerced to a
/// common comparison type via [`comparison_coercion`] before checking
/// containment.
pub fn contains<T: Borrow<Self>>(&self, other: T) -> Result<Self> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement overall. Since this PR explicitly relaxes intersect / union / contains, it could be helpful to add a couple of focused unit tests that assert the actual bounds and boolean outcomes for mismatched Decimal128 precision/scale intervals.

Right now the new direct unit test only exercises mul / div and checks result types. Adding small cases for contains returning TRUE / FALSE / TRUE_OR_FALSE would help lock in the comparison coercion behavior at the API layer too, not just through the SQL regression tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in b0e6ae0

adriangb and others added 2 commits May 6, 2026 10:40
…`/`contains`

Previously these methods asserted that both intervals shared an identical
data type, causing internal errors during interval propagation for queries
like `numeric / count(*)` where the operands end up as different
`Decimal128` precisions/scales (e.g. `Decimal128(38, 10) / Decimal128(20, 0)`).

`Interval::add` and `Interval::sub` already used `BinaryTypeCoercer` to find
a common arithmetic type. This change brings `mul`/`div` in line, and
similarly relaxes `intersect`/`union`/`contains` to coerce via
`comparison_coercion`, since CP-solver propagation feeds the result of an
arithmetic op into `intersect` with a child interval whose type may differ.

Adds a unit test in `interval_arithmetic.rs` for mismatched `Decimal128`
mul/div, and an end-to-end `decimal.slt` regression covering the
`numeric / bigint` shape from the failing customer query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h mismatched Decimal128 types

Per PR review feedback, lock in the comparison-coercion behavior at the
API layer with focused tests asserting exact bounds and the TRUE /
FALSE / TRUE_OR_FALSE outcomes for `contains`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adriangb adriangb force-pushed the fix-interval-decimal-mismatch branch from 097259d to b0e6ae0 Compare May 6, 2026 16:34
Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@pepijnve
Copy link
Copy Markdown
Contributor

pepijnve commented May 7, 2026

the documented preconditions are relaxed

I think the change itself makes sense. There is also a change in postconditions of the functions though. Before the output type of the interval was guaranteed to match the input type. After that change that postcondition is less strict. I'm wondering if that could cause problems for existing code. Relaxing the preconditions will cause existing code to follow its success code path instead of its error code path. The change in postconditions may result in an assumption violation for those success code paths.

I don't know if this is mostly a theoretical issue, or something that requires actual consideration.

@adriangb
Copy link
Copy Markdown
Contributor Author

adriangb commented May 8, 2026

@pepijnve good question. The honest answer is that I don't like relaxing the precondition, and I'm not familiar enough with this part of the code to be 100% sure it will not cause any issues. My only argument (which I do think is a strong one) for this being mostly a theoretical concern is that the existing add and subtract operations already Interval::add and Interval::sub already handled this via BinaryTypeCoercer and thus would have suffered from the same type mismatch problem. Since there have been no issues reported about this afaik maybe it's safe to assume it's a theoretical problem? In other words: this PR is just making the behavior uniform, not introducing new behavior.

I do worry that this statistics / constraint solving should not be crashing queries, we should fail gracefully and proceed without this info. I opened #22028 to track that, but it's beyond the scope of this fix I think.

@pepijnve
Copy link
Copy Markdown
Contributor

pepijnve commented May 8, 2026

My only argument (which I do think is a strong one) for this being mostly a theoretical concern

I agree this is most likely a non-issue and the fact that Interval was not internally consistent is probably worse. Would it be worth adding an upgrade note?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants