
refac: Harmonize linopy operations with breaking convention#591

Draft
FBumann wants to merge 6 commits into harmonize-linopy-operations from harmonize-linopy-operations-mixed

Conversation

@FBumann
Collaborator

@FBumann FBumann commented Feb 20, 2026

This PR contains a sketch of a new, stricter convention for arithmetic and constraint creation in linopy.

It's documented in the notebook.
The notebook itself showcases the behaviour, but should be improved to guide through both the behaviour and how to achieve certain things (and how to migrate).

Tests were updated to pass, but are not well structured.
I'll continue as soon as I can.

About backwards compatibility:

I envision enabling this behaviour via opt-in to start with,
`linopy.Model(join='new')`, and warning users with a FutureWarning if they don't opt in, but behaviour would not change.

This is quite sensible.
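
A minimal sketch of the opt-in described above, assuming a constructor flag and a FutureWarning when the flag is omitted. `SketchModel` and the value `"legacy"` are hypothetical names used for illustration; this is not linopy's actual `Model`.

```python
import warnings


class SketchModel:
    """Hypothetical stand-in for linopy.Model; only the opt-in logic is shown."""

    def __init__(self, join=None):
        if join is None:
            # No choice made: keep the current behaviour, but announce the change.
            warnings.warn(
                "The default alignment convention will change in a future "
                "release; pass join='new' to opt in now.",
                FutureWarning,
                stacklevel=2,
            )
            join = "legacy"  # assumed name for the current behaviour
        if join not in ("legacy", "new"):
            raise ValueError(f"Unknown join convention: {join!r}")
        self.join = join


opted_in = SketchModel(join="new")  # no warning, new convention
```

Users who do not pass the flag keep today's results and only see the warning, which matches the "behaviour would not change" requirement.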

FBumann and others added 5 commits February 20, 2026 09:01
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join
for * and / (intersection), and "exact" for constraint DataArray RHS.
Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq())
accept explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and
  _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FBumann
Collaborator Author

FBumann commented Feb 20, 2026

@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
The tests still need some work, though, and so does migration.
Looking forward to your opinion on the convention.

…ords. Here's what changed:

  - test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
  - test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
  - test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
  - test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
  - test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
  - test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
  - test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
  - test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
  - test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
  - test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
  - test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
  - test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10
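
The `assign_coords` replacements listed above all follow one pattern: relabel one operand with the other's coordinates so that values pair up by position. A small sketch of that pattern, using plain xarray DataArrays as stand-ins for linopy objects (the names `a`, `b` and `dim_0` are illustrative):

```python
import xarray as xr

a = xr.DataArray([1.0, 2.0, 3.0], coords=[("dim_0", [0, 1, 2])])
b = xr.DataArray([10.0, 20.0, 30.0], coords=[("dim_0", [5, 6, 7])])

# Relabel b with a's coordinate values, so the addition pairs values by position.
b_on_a = b.assign_coords(dim_0=a.coords["dim_0"].values)
result = a + b_on_a

# Without relabelling, xarray's default inner join finds no common labels.
unaligned = a + b
```

Here `result` carries the labels of `a`, while `unaligned` ends up empty because the two label sets are disjoint.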
@FBumann
Collaborator Author

FBumann commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Why "exact" instead of "inner" for * and /

"exact" still broadcasts freely over dimensions that only exist on one side — it only enforces strict matching on shared dimensions. So the common scaling pattern works fine:

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])

cost * capacity  # ✓ tech matches exactly, region broadcasts freely

"inner" is dangerous: if coords on a shared dimension don't match due to a typo or upstream change, it silently drops values. The explicit and safe way to subset before multiplying is:

capacity.sel(tech=["wind", "solar"]) * renewable_cost
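
The difference between the two joins can be reproduced with plain xarray, as a rough illustration. This uses xarray's `arithmetic_join` option on DataArrays, not linopy's own operators; `cost_typo` is a made-up array with a misspelled label.

```python
import xarray as xr

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity = xr.DataArray(
    [[1.0, 2.0], [3.0, 4.0]],
    coords=[("tech", ["wind", "solar"]), ("region", ["A", "B"])],
)
# Same data as cost, but with a typo in one label.
cost_typo = cost.assign_coords(tech=["wind", "solr"])

with xr.set_options(arithmetic_join="exact"):
    ok = cost * capacity  # tech matches exactly, region broadcasts
    try:
        cost_typo * capacity
        raised = False
    except ValueError:
        raised = True  # the mismatch on the shared dim is reported

with xr.set_options(arithmetic_join="inner"):
    dropped = cost_typo * capacity  # only the matching label survives
```

Under "exact" the typo surfaces as an error, while under "inner" the result simply loses the "solar" row without any message.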

No operation should introduce new dimensions

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to + and - as to * and / — new dimensions silently expand the optimization problem in unintended ways:

cost_expr      # dims: (tech, time)
regional_expr  # dims: (tech, time, region)

cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk      # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)

An explicit pre-check on all operations:

asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")
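
As a standalone sketch, the pre-check can be written against plain tuples of dimension names. `check_no_new_dims` and `is_allowed` are hypothetical helper names, not existing linopy functions.

```python
def check_no_new_dims(self_dims, other_dims):
    """Raise if either operand has a dimension the other lacks."""
    asymmetric_dims = set(other_dims).symmetric_difference(self_dims)
    if asymmetric_dims:
        raise ValueError(
            f"Operation introduces new dimensions: {sorted(asymmetric_dims)}"
        )


def is_allowed(self_dims, other_dims):
    """Convenience wrapper that turns the check into a boolean."""
    try:
        check_no_new_dims(self_dims, other_dims)
    except ValueError:
        return False
    return True


same_dims = is_allowed(("tech", "time"), ("time", "tech"))  # order is irrelevant
expanding = is_allowed(("tech", "time"), ("tech", "time", "region"))
scaling = is_allowed(("tech",), ("tech", "region"))
```

Because the check is symmetric, `scaling` is rejected as well: a `(tech,)` constant times a `(tech, region)` variable has `region` on one side only, so this strict form also rules out free broadcasting.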

Summary

| Operation | Convention |
|---|---|
| +, -, *, / | "exact" on shared dims; neither side may introduce dims the other doesn't have |

@coroa
Member

coroa commented Feb 27, 2026

> The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have, also for all operations.

Let's clearly differentiate between dimensions and labels.

labels

I agree with "exact" for labels by default, but we need an easy way to get inner or outer join behaviour. I found the pyoframe conventions strange at the beginning, but they grew on me:

x + y.keep_extras() to say that an outer join is in order and mismatches should be filled with 0.

x + y.drop_extras() to say that you want an inner join.
x.drop_extras() + y does the same, though.

In a different project I have used | 0 to indicate keep_extras, i.e. (x + y | 0).

dimensions

I am actually fond of the ability to auto-broadcast over different dimensions, and would want to keep that (it is actually my main problem with pyoframe).

Your first example actually implicitly assumes broadcasting.

@FBumann
Collaborator Author

FBumann commented Feb 28, 2026

Dimensions and broadcasting

I agree that auto broadcasting is helpful in some cases.
I'm happy with allowing broadcasting of constants. We could allow this always...?
But I would enforce that the constant never has more dims than the variable/expression.
Or is there a use case for this?

So the full convention requires two separate things:
1. "exact" join: shared dims must have matching coords (xarray handles this)
2. Subset dim check: the constant side's dims must be a subset of the variable/expression's dims (custom pre-check needed)
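
The two steps could be combined roughly as follows. This is a sketch on plain xarray DataArrays standing in for the expression and the constant; `align_constant` is a hypothetical helper, not linopy's internal `_align_constant`.

```python
import xarray as xr


def align_constant(expr, const):
    """Apply the subset dim check, then an exact join on shared dims."""
    extra = set(const.dims) - set(expr.dims)
    if extra:
        raise ValueError(f"Constant introduces new dimensions: {sorted(extra)}")
    # join="exact" raises ValueError if labels on a shared dim differ.
    return xr.align(expr, const, join="exact")


capacity = xr.DataArray(
    [[1.0, 2.0], [3.0, 4.0]],
    coords=[("tech", ["wind", "solar"]), ("region", ["A", "B"])],
)
cost = xr.DataArray([10.0, 20.0], coords=[("tech", ["wind", "solar"])])
risk = xr.DataArray(
    [[0.1], [0.2]],
    coords=[("tech", ["wind", "solar"]), ("scenario", ["s1"])],
)

capacity_aligned, cost_aligned = align_constant(capacity, cost)  # passes
```

With this asymmetric check, `cost` may broadcast over `region`, while `risk` would be rejected because `scenario` does not exist on the expression side.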

labels

I'm not sure I like this approach, as it needs careful state management of the flags on expressions. The flag (keep or drop extras) needs to be handled in every operation.
I would rather require users to reindex or fill data to the correct index.
I think aligning is the correct approach:

import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned

Combining disjoint expressions would then still need the explicit methods, though.
I'm interested in your take on this.

@FBumann
Collaborator Author

FBumann commented Feb 28, 2026

The proposed convention for all arithmetic operations in linopy:
1. "exact" join by default: shared coords must match exactly, raises on mismatch
2. Subset dim check: constants may not introduce dimensions the variable/expression doesn't have
3. No implicit inner join: use .sel() explicitly instead
4. Outer join with fill: use x + (y | 0) or .add(join="outer", fill_value=0)
The escape hatches, in order of preference: .sel() for subsetting, | 0 for inline fill, and the named method .add(join=...) for everything else. No context manager needed.

I'm not sure how to implement the | operator yet. It might need some sort of flag/state for deferred indexing.
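
One possible shape for such a deferred flag, as a toy sketch only. `ToyExpr` and `WithFill` are hypothetical classes that model an expression as a mapping from label to number; the choice to fill gaps on both operands with the same value is an assumption, and none of this reflects linopy internals.

```python
class WithFill:
    """Marks an operand as 'fill missing labels with this value'."""

    def __init__(self, expr, fill_value):
        self.expr = expr
        self.fill_value = fill_value


class ToyExpr:
    """Hypothetical stand-in for an expression: a mapping label -> number."""

    def __init__(self, data):
        self.data = dict(data)

    def __or__(self, fill_value):
        # Defer the fill: nothing is computed until the next operation.
        return WithFill(self, fill_value)

    def __add__(self, other):
        if isinstance(other, WithFill):
            # Outer join: union of labels, gaps replaced by the fill value.
            fill = other.fill_value
            labels = sorted(set(self.data) | set(other.expr.data))
            return ToyExpr(
                {k: self.data.get(k, fill) + other.expr.data.get(k, fill)
                 for k in labels}
            )
        if not isinstance(other, ToyExpr):
            return NotImplemented
        if set(self.data) != set(other.data):
            raise ValueError("Labels do not match exactly")
        return ToyExpr({k: self.data[k] + other.data[k] for k in self.data})


x = ToyExpr({"a": 1, "b": 2})
y = ToyExpr({"b": 10, "c": 20})
total = x + (y | 0)  # outer join with fill
```

The parentheses matter: Python gives `+` higher precedence than `|`, so `x + y | 0` is evaluated as `(x + y) | 0` and the exact-join check fails before the fill is ever seen.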

@FBumann
Collaborator Author

FBumann commented Feb 28, 2026

I thought about the pipe operator:
I think it should only work with linopy's internal types (variables/expressions), not constants (scalars, numpy, pandas, DataArray), as supporting those would require a lot of monkey patching and would be hard to make stable.

Would this be an issue for you?


2 participants