fix(engine): actionable error when a Jinja field is missing/None/empty by nabinchha · Pull Request #633 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-05-11T17:30:22Z

Summary

Closes #629.

Empty-render and missing-attribute Jinja failures used to surface as the same unhelpful message:

🛑 Failed to process column 'preferred_english_name': User provided prompt generation template is invalid.

Two distinct failure modes were collapsing into that generic string:

Bug 1 — empty render (e.g. {{ person.preferred_english_name }} where the key is missing): _assert_rendered_text_not_empty raised a descriptive UserTemplateError, but sanitize_user_exceptions then replaced the message with the generic one.
Bug 3 — raw UndefinedError (e.g. {{ person.address.street }} where address is missing): a jinja2.exceptions.UndefinedError leaked all the way up because the sanitizer only caught UserTemplateError / TemplateSyntaxError.

Both paths now raise a new EmptyTemplateRenderError(UserTemplateError) subclass that the sanitizer is explicitly told to pass through. The exception carries an actionable, row-level message naming the offending chain plus copy-pasteable Jinja conditional and SkipConfig remediation patterns:

Template rendered to empty text. This usually happens when one or more
referenced fields are missing, None, or empty in this row.

Likely culprits in this row:
  - person.address.street (missing from record)

To handle missing values, you can:

  1. Provide a fallback in your template using a Jinja conditional:
       {{ person.address.street if person.address else 'N/A' }}

  2. Skip rows where required fields are missing using SkipConfig:
       skip=SkipConfig(when="{{ not person.address }}")

The "gate" expression in the suggestions stops one accessor past the resolvable prefix so it stays safe to evaluate (avoiding a fresh UndefinedError on the suggested fix itself).

What changed

packages/data-designer-engine/src/data_designer/engine/processing/ginja/:

exceptions.py — new EmptyTemplateRenderError subclass.
ast.py — new public helpers ast_extract_access_chains and resolve_access_chain (plus an AccessChain type alias).
environment.py:
- _assert_rendered_text_not_empty / validate_rendered_text thread the template + record through and raise the new error with the per-row diagnostic.
- UserTemplateSandboxEnvironment.safe_render catches UndefinedError and converts it to the same EmptyTemplateRenderError. NativeJinjaSandboxEnvironment.render_template mirrors the conversion for consistency.
- sanitize_user_exceptions bypasses EmptyTemplateRenderError (same treatment as UserTemplateUnsupportedFiltersError).

End-to-end propagation already prepends the column name via _run_batch, so no other layers needed changes.

Test plan

New regression tests for both bugs (test_empty_render_raises_empty_template_render_error_with_culprit_chain, test_undefined_nested_attr_raises_empty_template_render_error_with_safe_gate).
Coverage for the "resolved-to-None" / "resolved-to-empty-string" / multi-culprit branches.
Test that the sanitizer no longer collapses the new error to the generic message.
Test that NativeJinjaSandboxEnvironment also converts UndefinedError.
Parametrized coverage for ast_extract_access_chains (plain text, single name, nested attr, subscript, mixed, dynamic subscript skipped, loop/conditional context) and resolve_access_chain (resolved, partially-resolved, type-mismatch cases).
Happy-path render still works; one image-cell-generator test was updated from match="invalid" to match="empty" to reflect the new (more informative) message.
uv run pytest packages/data-designer-engine/tests/engine/processing/ginja/ — 95 passed.
uv run ruff format + uv run ruff check — clean.

Empty-render and missing-attribute failures used to surface as the generic "User provided prompt generation template is invalid." either because `sanitize_user_exceptions` stripped the detail or because Jinja's raw `UndefinedError` leaked through. Both now raise a new `EmptyTemplateRenderError` carrying a row-level diagnostic that names the offending chain and includes copy-pasteable Jinja conditional and SkipConfig fix patterns. Closes #629.

github-actions · 2026-05-11T17:32:57Z

PR #633 Review — `fix(engine): actionable error when a Jinja field is missing/None/empty`

Summary

Closes #629. Two distinct failure modes in Jinja rendering used to collapse into the same unhelpful "User provided prompt generation template is invalid." message:

Empty render — template like {{ person.preferred_english_name }} where the key is missing on the record.
Raw UndefinedError — template like {{ person.address.street }} where person.address is missing, leaking Jinja's internal "'dict object' has no attribute 'address'".

The PR introduces EmptyTemplateRenderError(UserTemplateError) and teaches sanitize_user_exceptions to pass it through unaltered. Both the secure (UserTemplateSandboxEnvironment.safe_render) and native (NativeJinjaSandboxEnvironment.render_template) engines now convert UndefinedError + empty-render cases to this new error, which carries a row-level diagnostic naming culprit chains and giving copy-pasteable Jinja-conditional and SkipConfig remediation patterns. Two new AST helpers (ast_extract_access_chains, resolve_access_chain) back the culprit detection.

Findings

Correctness

Exception chaining inconsistency (minor). NativeJinjaSandboxEnvironment.render_template uses raise EmptyTemplateRenderError(...) from exception (environment.py:248-251), preserving the original UndefinedError in __cause__. UserTemplateSandboxEnvironment.safe_render re-raises without from exception (environment.py:402). The messages are curated so there's no user-visible difference, but __cause__ is useful when someone is debugging from a traceback. Recommend raise EmptyTemplateRenderError(...) from exception in safe_render too.
_build_empty_render_message fallback path is robust. Wrapping parser(user_template) + ast_extract_access_chains(...) in try/except Exception and falling back to a generic message avoids masking the original render failure. Good defensive posture.
resolve_access_chain list bounds. Correctly rejects out-of-bounds indices (acc >= len(current) or acc < -len(current)). Negative literal subscripts (e.g. {{ items[-1] }}) would reach resolve_access_chain as int accessors through _build_access_chain, and Python's native negative indexing on current[acc] handles them. Good.
Dict/list/scalar branching is complete. isinstance(current, dict), isinstance(current, list), fallthrough else — the three cases cover the sanitized record value types.
Gate expression derivation is safe. The one-past-resolvable-prefix slice (sample_accessors[: len(sample_prefix) + 1]) yields a Jinja Undefined leaf on evaluation, which stringifies to "" and is falsy — doesn't re-trigger UndefinedError. This is subtle; the inline comment at environment.py explains the reasoning clearly.
culprits[0] for the remediation suggestion. Only the first culprit becomes the template for the remediation example. With multiple culprits listed in the bullets, a user might fix only the first and rerun. Arguably OK (they'll iterate), but consider noting "(showing fix for first culprit)" if user feedback reports confusion.
Sanitizer order. sanitize_user_exceptions now catches (UserTemplateUnsupportedFiltersError, EmptyTemplateRenderError) before (UserTemplateError, TemplateSyntaxError). Since Python matches except clauses in order and both are UserTemplateError subclasses, the ordering is load-bearing. Correct as written. A short comment noting "ordering matters — these are UserTemplateError subclasses" would help future maintainers.
Dynamic subscripts. _build_access_chain returns None for a[b]-style chains, and the visitor descends into the subscript expression with in_chain=False, extracting b as its own chain. Matches the docstring behavior and the parametrized test coverage.

Project conventions

from __future__ import annotations present in all new/modified files ✓
Absolute imports only ✓
Modern type syntax (list[str], str | None, AccessChain = tuple[str, list[str | int]]) ✓
Type annotations on all new public callables ✓
EmptyTemplateRenderError subclasses UserTemplateError, preserving the canonical-error-at-boundaries invariant from AGENTS.md ✓
New AST helpers land in ginja/ast.py alongside existing helpers; _build_access_chain is private — good separation.
Module-level _CULPRIT_* sentinels are fine, but a small Enum or Literal type alias would make the classifier signature self-documenting. Not worth blocking on.

Tests

95 tests in ginja/ passing per the PR description; diff shows thorough new coverage:
- test_ast.py: parametrized chain-extraction (plain, nested, subscript, mixed, dynamic-subscript skip, loop/conditional context), parametrized resolve_access_chain (fully resolved, partially resolved, type-mismatch, list bounds).
- test_environment.py: both bug regressions, culprit-classification branches (missing / None / empty string), multi-culprit listing, sanitizer-bypass assertion, native-engine parity, happy path, skip_template_validation=True path.
The image-cell-generator test was updated from match="invalid" to match="empty" with an explanatory comment pointing at issue Cryptic "User provided prompt generation template is invalid" error when a referenced template field is empty or missing #629. Good — this is exactly the kind of test churn worth annotating.
Gap (small): No test asserts the fallback message when _build_empty_render_message's parse branch fails (e.g., a monkey-patched parser that throws). The code path exists (environment.py:346-352) but is not covered. Arguably not worth chasing given how defensive it is.
Gap (small): No test for a chain with a non-identifier string key rendered via [...] bracket notation (e.g. {{ data["weird key"] }}). _format_access_chain has the branch; a parametrized case would close the loop.

Performance

Culprit detection runs only on the error path. It reparses user_template once per failure. Negligible.
ast_extract_access_chains is O(n) in AST size; visit does no backtracking.

Security

No new attack surface. The helpers operate on already-sanitized records and the existing trusted parse path.
The curated error message embeds user data (record keys in the access chain) into the exception string. This is the same blast radius as prior UserTemplateError messages — acceptable.

Documentation

Docstrings on new helpers are thorough and accurate. safe_render's docstring is updated to mention EmptyTemplateRenderError. ✓
Inline comments explain the non-obvious bits (top-level chain extraction, safe gate derivation, why the sanitizer bypasses this error class).

Suggested follow-ups (non-blocking)

Add from exception to the UndefinedError → EmptyTemplateRenderError re-raise in UserTemplateSandboxEnvironment.safe_render for traceback consistency with the native engine path.
Consider a one-line comment on the sanitize_user_exceptions except-clause noting that ordering is load-bearing.
Add a test case for non-identifier string subscripts in _format_access_chain.
If user reports come in about the multi-culprit case being ambiguous, consider labelling the remediation example ("fix for first culprit shown").

Verdict

Approve with minor nits. The change is well-scoped, well-tested, and genuinely fixes the reported bug class (both the "empty render" and "undefined nested attr" variants) without expanding blast radius. Exception taxonomy and sanitizer bypass are implemented correctly. The nits above are small-cost polish; none block merge.

greptile-apps · 2026-05-11T17:37:21Z

Greptile Summary

This PR fixes two distinct failure modes in Jinja template error reporting — both previously collapsed into a generic "template is invalid" message — by introducing a new EmptyTemplateRenderError subclass that carries a row-level diagnostic identifying the offending field chain and suggesting copy-pasteable Jinja conditional / SkipConfig remediation.

exceptions.py: Adds EmptyTemplateRenderError(UserTemplateError) so the sanitizer decorator can pass it through unchanged instead of replacing its message.
ast.py: New ast_extract_access_chains and resolve_access_chain helpers walk the template AST and record dict to identify which field chain caused the empty render, with proper handling for loop-scoped variables, dynamic subscripts, and the root-absent case.
environment.py: Both UserTemplateSandboxEnvironment and NativeJinjaSandboxEnvironment now convert raw UndefinedError to EmptyTemplateRenderError; _assert_rendered_text_not_empty / validate_rendered_text accept optional template + record to build the same diagnostic on empty-string renders; sanitize_user_exceptions is updated to pass the new subclass through.

Confidence Score: 5/5

The change is safe to merge — it replaces a generic error message with a diagnostically richer one and does not touch any data-mutation or auth paths.

Both failure modes are correctly intercepted, converted to the new subclass, and passed through the sanitizer unchanged. The gate-expression fix for the root-absent case is verified by two dedicated regression tests. The culprit-finding path wraps all its work in a broad try/except so failures there fall back gracefully to the generic message. Test coverage spans the full matrix of culprit types, scoping edge cases, and both sandbox environments.

No files require special attention.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/processing/ginja/exceptions.py	Adds EmptyTemplateRenderError as a UserTemplateError subclass; minimal change, correctly structured for the sanitizer's exception hierarchy ordering.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/ast.py	New ast_extract_access_chains and resolve_access_chain helpers; logic for top-level chain extraction, dynamic-subscript skipping, and partial-resolution tracking is correct and well-tested.
packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py	UndefinedError handling and empty-render diagnostic wired into both sandbox environments; gate_accessors logic correctly handles the root-absent case; sanitizer ordering comment added for future maintainers.
packages/data-designer-engine/tests/engine/processing/ginja/test_environment.py	Comprehensive regression tests for both bugs, all culprit-classification branches, sanitizer bypass, loop-variable scoping, empty-collection reporting, and safe gate generation including the root-absent regression case.
packages/data-designer-engine/tests/engine/processing/ginja/test_ast.py	Parametrized tests for ast_extract_access_chains and resolve_access_chain cover plain text, nested attrs, subscripts, dynamic subscripts, loop/conditional contexts, and failure/type-mismatch cases.
packages/data-designer-engine/tests/engine/column_generators/generators/test_image.py	Updated match string from 'invalid' to 'empty' to reflect the improved error message; correctly retains UserTemplateError as the expected type (EmptyTemplateRenderError is a subclass).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["safe_render / render_template called"] --> B["template.render(record)"]
    B --> C{Exception?}
    C -- "UndefinedError" --> D["_build_empty_render_message(template, record, parse)"]
    C -- "SecurityError / OverflowError" --> E["UserTemplateError (generic)"]
    C -- No exception --> F["validate_rendered_text(rendered_text, template, record)"]
    F --> G{rendered empty?}
    G -- Yes --> D
    G -- No --> H["return rendered_text ✓"]
    D --> I["ast_extract_access_chains(ast)"]
    I --> J["filter by meta.find_undeclared_variables"]
    J --> K["_collect_culprit_chains(chains, record)"]
    K --> L["resolve_access_chain per chain"]
    L --> M{Culprits found?}
    M -- Yes --> N["Build actionable message\nwith gate expression"]
    M -- No --> O["Generic fallback message"]
    N --> P["raise EmptyTemplateRenderError"]
    O --> P
    P --> Q["sanitize_user_exceptions wrapper"]
    Q --> R{Exception type?}
    R -- "EmptyTemplateRenderError\nor UnsupportedFiltersError" --> S["re-raise as-is ✓"]
    R -- "UserTemplateError / TemplateSyntaxError" --> T["wrap in generic message"]

_{Reviews (3): Last reviewed commit: "Merge branch 'main' into fix/template-re..." | Re-trigger Greptile}

greptile-apps · 2026-05-11T17:37:26Z

+    gate_accessors = sample_accessors[: len(sample_prefix) + 1]
+    gate_chain = _format_access_chain(sample_name, gate_accessors)


Gate expression is one step too deep when the root variable is absent from the record

resolve_access_chain returns (False, None, []) both when the root name is missing from record (name not in record) and when the first accessor fails — both yield prefix = []. The gate computation sample_accessors[:len(sample_prefix)+1] therefore produces sample_accessors[:1] in either scenario. For {{ person.address.street }} with record = {}, that gives gate_chain = "person.address", so the suggestion becomes {{ person.address.street if person.address else 'N/A' }}. But in Jinja2, person is Undefined when it is absent from the context, and Undefined.__getattr__ raises UndefinedError immediately — the suggested template itself fails with the same kind of error the user is trying to fix.

The gate should fall back to the root name alone (gate_accessors = []) when sample_name is not present in record.

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/data-designer-engine/src/data_designer/engine/processing/ginja/environment.py Line: 736-737 Comment: **Gate expression is one step too deep when the root variable is absent from the record** `resolve_access_chain` returns `(False, None, [])` both when the root name is missing from `record` (`name not in record`) and when the first accessor fails — both yield `prefix = []`. The gate computation `sample_accessors[:len(sample_prefix)+1]` therefore produces `sample_accessors[:1]` in either scenario. For `{{ person.address.street }}` with `record = {}`, that gives `gate_chain = "person.address"`, so the suggestion becomes `{{ person.address.street if person.address else 'N/A' }}`. But in Jinja2, `person` is `Undefined` when it is absent from the context, and `Undefined.__getattr__` raises `UndefinedError` immediately — the suggested template itself fails with the same kind of error the user is trying to fix. The gate should fall back to the root name alone (`gate_accessors = []`) when `sample_name` is not present in `record`. How can I resolve this? If you propose a fix, please make it concise.

Fixed in 6d258f1. When sample_name is not in record, the gate now falls back to the root name alone (gate_accessors = []) instead of slicing one accessor past an empty prefix. The resulting suggestion ({{ person.address.street if person else 'N/A' }} / SkipConfig(when="{{ not person }}")) is safe to evaluate because Jinja's default Undefined is falsy and stringifies to "", whereas Undefined.__getattr__ raises.

Added two regression tests covering this exact scenario:

test_undefined_root_variable_produces_safe_remediation_template — asserts the suggested template uses the root-name-only gate (and does not include the broken .address accessor).

test_undefined_root_variable_remediation_template_is_renderable — actually renders the suggested template against {} and verifies it returns 'N/A' instead of raising, which would have caught the original bug.

andreatgretel · 2026-05-12T12:07:45Z

+                # contain its own variable references.
+                visit(node.arg, in_chain=False)
+            return
+        if isinstance(node, j_nodes.Name):


I think this walker needs to account for Jinja-scoped names. For {% for person in people %}{{ person.name }}{% endfor %} with an empty people list, the new diagnostic says person / person.name are missing from the record and suggests SkipConfig(when="{{ not person }}"), but person is only a loop variable. Maybe filter chains to roots from meta.find_undeclared_variables(ast), or track For target bindings before collecting culprit chains.

Good catch — fixed in 6d258f1. Rather than hand-rolling scope tracking inside ast_extract_access_chains, I kept the walker pure and filtered the chains in _build_empty_render_message via meta.find_undeclared_variables, which is Jinja's canonical mechanism for distinguishing external references from loop targets (and any future scoping constructs):

chains = ast_extract_access_chains(ast) undeclared = meta.find_undeclared_variables(ast) chains = [(name, accessors) for name, accessors in chains if name in undeclared]

For {% for person in people %}{{ person.name }}{% endfor %} with people=[], this leaves only ("people", []) as a candidate culprit. Combined with the empty-collection classifier (see your follow-up comment), the diagnostic now correctly points at people. Locked in by test_loop_variable_is_not_reported_as_missing_culprit.

andreatgretel · 2026-05-12T12:08:07Z

+    resolved, value, prefix = resolve_access_chain(record, name, accessors)
+    if not resolved:
+        return (_CULPRIT_MISSING, prefix)
+    if value is None:


Claude caught one follow-up on the culprit classification: after loop-local names are filtered out, an empty loop like {% for item in items %}x{% endfor %} with items=[] still falls back to the no-culprit message because _classify_chain does not treat empty collections as empty values. Could add an empty collection reason here so the message points at items instead of losing the row-level detail.

Fixed in 6d258f1. Added a new _CULPRIT_EMPTY_COLLECTION = "resolved to empty collection" classification in _classify_chain that matches any empty list/dict/tuple/set/frozenset:

if isinstance(value, (list, dict, tuple, set, frozenset)) and len(value) == 0: return (_CULPRIT_EMPTY_COLLECTION, prefix)

For {% for item in items %}x{% endfor %} with items=[], the diagnostic now reads:

Likely culprits in this row: - items (resolved to empty collection)

Combined with the scope filtering from your other comment, the suggested fix correctly gates on items. Locked in by test_empty_collection_iterable_reported_as_culprit and test_empty_dict_iterable_reported_as_culprit.

Addresses the open review comments on #633: 1. (Greptile P1) Gate expression in the suggested remediation template was one accessor too deep when the root variable was entirely absent from the record, causing the suggested fix to itself raise UndefinedError. Fall back to gating on the root name alone when sample_name is not in record. 2. (andreatgretel) The AST walker reported loop-local names as missing culprits (e.g. ``person`` in ``{% for person in people %}...{% endfor %}``). Filter extracted chains through ``meta.find_undeclared_variables`` to defer to Jinja's canonical scope tracking. 3. (andreatgretel follow-up) Empty collections used as loop iterables (``items=[]``) fell through to the no-culprit fallback. Add a new ``_CULPRIT_EMPTY_COLLECTION`` classification so they're surfaced. 4. Minor: add ``from exception`` to ``safe_render``'s UndefinedError re-raise for traceback consistency with the native engine path, and add a note on the load-bearing exception ordering in ``sanitize_user_exceptions``.

nabinchha requested a review from a team as a code owner May 11, 2026 17:30

nabinchha temporarily deployed to agentic-ci May 11, 2026 17:30 — with GitHub Actions Inactive

greptile-apps Bot reviewed May 11, 2026

View reviewed changes

andreatgretel reviewed May 12, 2026

View reviewed changes

github-actions Bot mentioned this pull request May 12, 2026

Agentic CI: Issue & PR Triage Tracker #562

Open

nabinchha requested a review from andreatgretel May 18, 2026 15:45

Merge branch 'main' into fix/template-render-errors-629

5e49013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(engine): actionable error when a Jinja field is missing/None/empty#633

fix(engine): actionable error when a Jinja field is missing/None/empty#633
nabinchha wants to merge 3 commits into
mainfrom
fix/template-render-errors-629

nabinchha commented May 11, 2026

Uh oh!

github-actions Bot commented May 11, 2026

Uh oh!

greptile-apps Bot commented May 11, 2026 •

edited

Loading

Confidence Score: 5/5

Flowchart

Uh oh!

greptile-apps Bot May 11, 2026

Uh oh!

nabinchha May 18, 2026

Uh oh!

andreatgretel May 12, 2026

Uh oh!

nabinchha May 18, 2026

Uh oh!

andreatgretel May 12, 2026

Uh oh!

nabinchha May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		gate_accessors = sample_accessors[: len(sample_prefix) + 1]
		gate_chain = _format_access_chain(sample_name, gate_accessors)

Conversation

nabinchha commented May 11, 2026

Summary

What changed

Test plan

Uh oh!

github-actions Bot commented May 11, 2026

PR #633 Review — fix(engine): actionable error when a Jinja field is missing/None/empty

Summary

Findings

Correctness

Project conventions

Tests

Performance

Security

Documentation

Suggested follow-ups (non-blocking)

Verdict

Uh oh!

greptile-apps Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

nabinchha May 18, 2026

Choose a reason for hiding this comment

Uh oh!

andreatgretel May 12, 2026

Choose a reason for hiding this comment

Uh oh!

nabinchha May 18, 2026

Choose a reason for hiding this comment

Uh oh!

andreatgretel May 12, 2026

Choose a reason for hiding this comment

Uh oh!

nabinchha May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

PR #633 Review — `fix(engine): actionable error when a Jinja field is missing/None/empty`

greptile-apps Bot commented May 11, 2026 •

edited

Loading