fix(engine): actionable error when a Jinja field is missing/None/empty#633
fix(engine): actionable error when a Jinja field is missing/None/empty#633nabinchha wants to merge 3 commits into
Conversation
Empty-render and missing-attribute failures used to surface as the generic "User provided prompt generation template is invalid." either because `sanitize_user_exceptions` stripped the detail or because Jinja's raw `UndefinedError` leaked through. Both now raise a new `EmptyTemplateRenderError` carrying a row-level diagnostic that names the offending chain and includes copy-pasteable Jinja conditional and SkipConfig fix patterns. Closes #629.
PR #633 Review —
|
Greptile SummaryThis PR fixes two distinct failure modes in Jinja template error reporting — both previously collapsed into a generic "template is invalid" message — by introducing a new
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/processing/ginja/exceptions.py | Adds EmptyTemplateRenderError as a UserTemplateError subclass; minimal change, correctly structured for the sanitizer's exception hierarchy ordering. |
| packages/data-designer-engine/src/data_designer/engine/processing/ginja/ast.py | New ast_extract_access_chains and resolve_access_chain helpers; logic for top-level chain extraction, dynamic-subscript skipping, and partial-resolution tracking is correct and well-tested. |
| packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py | UndefinedError handling and empty-render diagnostic wired into both sandbox environments; gate_accessors logic correctly handles the root-absent case; sanitizer ordering comment added for future maintainers. |
| packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py | Comprehensive regression tests for both bugs, all culprit-classification branches, sanitizer bypass, loop-variable scoping, empty-collection reporting, and safe gate generation including the root-absent regression case. |
| packages/data-designer-engine/tests/engine/processing/ginja/test_ast.py | Parametrized tests for ast_extract_access_chains and resolve_access_chain cover plain text, nested attrs, subscripts, dynamic subscripts, loop/conditional contexts, and failure/type-mismatch cases. |
| packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py | Updated match string from 'invalid' to 'empty' to reflect the improved error message; correctly retains UserTemplateError as the expected type (EmptyTemplateRenderError is a subclass). |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["safe_render / render_template called"] --> B["template.render(record)"]
B --> C{Exception?}
C -- "UndefinedError" --> D["_build_empty_render_message(template, record, parse)"]
C -- "SecurityError / OverflowError" --> E["UserTemplateError (generic)"]
C -- No exception --> F["validate_rendered_text(rendered_text, template, record)"]
F --> G{rendered empty?}
G -- Yes --> D
G -- No --> H["return rendered_text ✓"]
D --> I["ast_extract_access_chains(ast)"]
I --> J["filter by meta.find_undeclared_variables"]
J --> K["_collect_culprit_chains(chains, record)"]
K --> L["resolve_access_chain per chain"]
L --> M{Culprits found?}
M -- Yes --> N["Build actionable message\nwith gate expression"]
M -- No --> O["Generic fallback message"]
N --> P["raise EmptyTemplateRenderError"]
O --> P
P --> Q["sanitize_user_exceptions wrapper"]
Q --> R{Exception type?}
R -- "EmptyTemplateRenderError\nor UnsupportedFiltersError" --> S["re-raise as-is ✓"]
R -- "UserTemplateError / TemplateSyntaxError" --> T["wrap in generic message"]
Reviews (3): Last reviewed commit: "Merge branch 'main' into fix/template-re..." | Re-trigger Greptile
| gate_accessors = sample_accessors[: len(sample_prefix) + 1] | ||
| gate_chain = _format_access_chain(sample_name, gate_accessors) |
There was a problem hiding this comment.
Gate expression is one step too deep when the root variable is absent from the record
resolve_access_chain returns (False, None, []) both when the root name is missing from record (name not in record) and when the first accessor fails — both yield prefix = []. The gate computation sample_accessors[:len(sample_prefix)+1] therefore produces sample_accessors[:1] in either scenario. For {{ person.address.street }} with record = {}, that gives gate_chain = "person.address", so the suggestion becomes {{ person.address.street if person.address else 'N/A' }}. But in Jinja2, person is Undefined when it is absent from the context, and Undefined.__getattr__ raises UndefinedError immediately — the suggested template itself fails with the same kind of error the user is trying to fix.
The gate should fall back to the root name alone (gate_accessors = []) when sample_name is not present in record.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py
Line: 736-737
Comment:
**Gate expression is one step too deep when the root variable is absent from the record**
`resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix.
The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in 6d258f1. When sample_name is not in record, the gate now falls back to the root name alone (gate_accessors = []) instead of slicing one accessor past an empty prefix. The resulting suggestion ({{ person.address.street if person else 'N/A' }} / SkipConfig(when="{{ not person }}")) is safe to evaluate because Jinja's default Undefined is falsy and stringifies to "", whereas Undefined.__getattr__ raises.
Added two regression tests covering this exact scenario:
test_undefined_root_variable_produces_safe_remediation_template— asserts the suggested template uses the root-name-only gate (and does not include the broken.addressaccessor).test_undefined_root_variable_remediation_template_is_renderable— actually renders the suggested template against{}and verifies it returns'N/A'instead of raising, which would have caught the original bug.
| # contain its own variable references. | ||
| visit(node.arg, in_chain=False) | ||
| return | ||
| if isinstance(node, j_nodes.Name): |
There was a problem hiding this comment.
I think this walker needs to account for Jinja-scoped names. For {% for person in people %}{{ person.name }}{% endfor %} with an empty people list, the new diagnostic says person / person.name are missing from the record and suggests SkipConfig(when="{{ not person }}"), but person is only a loop variable. Maybe filter chains to roots from meta.find_undeclared_variables(ast), or track For target bindings before collecting culprit chains.
There was a problem hiding this comment.
Good catch — fixed in 6d258f1. Rather than hand-rolling scope tracking inside ast_extract_access_chains, I kept the walker pure and filtered the chains in _build_empty_render_message via meta.find_undeclared_variables, which is Jinja's canonical mechanism for distinguishing external references from loop targets (and any future scoping constructs):
chains = ast_extract_access_chains(ast)
undeclared = meta.find_undeclared_variables(ast)
chains = [(name, accessors) for name, accessors in chains if name in undeclared]For {% for person in people %}{{ person.name }}{% endfor %} with people=[], this leaves only ("people", []) as a candidate culprit. Combined with the empty-collection classifier (see your follow-up comment), the diagnostic now correctly points at people. Locked in by test_loop_variable_is_not_reported_as_missing_culprit.
| resolved, value, prefix = resolve_access_chain(record, name, accessors) | ||
| if not resolved: | ||
| return (_CULPRIT_MISSING, prefix) | ||
| if value is None: |
There was a problem hiding this comment.
Claude caught one follow-up on the culprit classification: after loop-local names are filtered out, an empty loop like {% for item in items %}x{% endfor %} with items=[] still falls back to the no-culprit message because _classify_chain does not treat empty collections as empty values. Could add an empty collection reason here so the message points at items instead of losing the row-level detail.
There was a problem hiding this comment.
Fixed in 6d258f1. Added a new _CULPRIT_EMPTY_COLLECTION = "resolved to empty collection" classification in _classify_chain that matches any empty list/dict/tuple/set/frozenset:
if isinstance(value, (list, dict, tuple, set, frozenset)) and len(value) == 0:
return (_CULPRIT_EMPTY_COLLECTION, prefix)For {% for item in items %}x{% endfor %} with items=[], the diagnostic now reads:
Likely culprits in this row:
- items (resolved to empty collection)
Combined with the scope filtering from your other comment, the suggested fix correctly gates on items. Locked in by test_empty_collection_iterable_reported_as_culprit and test_empty_dict_iterable_reported_as_culprit.
Addresses the open review comments on #633: 1. (Greptile P1) Gate expression in the suggested remediation template was one accessor too deep when the root variable was entirely absent from the record, causing the suggested fix to itself raise UndefinedError. Fall back to gating on the root name alone when sample_name is not in record. 2. (andreatgretel) The AST walker reported loop-local names as missing culprits (e.g. ``person`` in ``{% for person in people %}...{% endfor %}``). Filter extracted chains through ``meta.find_undeclared_variables`` to defer to Jinja's canonical scope tracking. 3. (andreatgretel follow-up) Empty collections used as loop iterables (``items=[]``) fell through to the no-culprit fallback. Add a new ``_CULPRIT_EMPTY_COLLECTION`` classification so they're surfaced. 4. Minor: add ``from exception`` to ``safe_render``'s UndefinedError re-raise for traceback consistency with the native engine path, and add a note on the load-bearing exception ordering in ``sanitize_user_exceptions``.
Summary
Closes #629.
Empty-render and missing-attribute Jinja failures used to surface as the same unhelpful message:
Two distinct failure modes were collapsing into that generic string:
{{ person.preferred_english_name }}where the key is missing):_assert_rendered_text_not_emptyraised a descriptiveUserTemplateError, butsanitize_user_exceptionsthen replaced the message with the generic one.UndefinedError(e.g.{{ person.address.street }}whereaddressis missing): ajinja2.exceptions.UndefinedErrorleaked all the way up because the sanitizer only caughtUserTemplateError/TemplateSyntaxError.Both paths now raise a new
EmptyTemplateRenderError(UserTemplateError)subclass that the sanitizer is explicitly told to pass through. The exception carries an actionable, row-level message naming the offending chain plus copy-pasteable Jinja conditional and SkipConfig remediation patterns:The "gate" expression in the suggestions stops one accessor past the resolvable prefix so it stays safe to evaluate (avoiding a fresh
UndefinedErroron the suggested fix itself).What changed
packages/data-designer-engine/src/data_designer/engine/processing/ginja/:exceptions.py— newEmptyTemplateRenderErrorsubclass.ast.py— new public helpersast_extract_access_chainsandresolve_access_chain(plus anAccessChaintype alias).environment.py:_assert_rendered_text_not_empty/validate_rendered_textthread the template + record through and raise the new error with the per-row diagnostic.UserTemplateSandboxEnvironment.safe_rendercatchesUndefinedErrorand converts it to the sameEmptyTemplateRenderError.NativeJinjaSandboxEnvironment.render_templatemirrors the conversion for consistency.sanitize_user_exceptionsbypassesEmptyTemplateRenderError(same treatment asUserTemplateUnsupportedFiltersError).End-to-end propagation already prepends the column name via
_run_batch, so no other layers needed changes.Test plan
test_empty_render_raises_empty_template_render_error_with_culprit_chain,test_undefined_nested_attr_raises_empty_template_render_error_with_safe_gate).NativeJinjaSandboxEnvironmentalso convertsUndefinedError.ast_extract_access_chains(plain text, single name, nested attr, subscript, mixed, dynamic subscript skipped, loop/conditional context) andresolve_access_chain(resolved, partially-resolved, type-mismatch cases).match="invalid"tomatch="empty"to reflect the new (more informative) message.uv run pytest packages/data-designer-engine/tests/engine/processing/ginja/— 95 passed.uv run ruff format+uv run ruff check— clean.