Symptom
After processing an episode, two enrich_entity_wikidata workflows show as ERROR in the Episode admin's "View workflow steps" page (/admin/episodes/episode/<id>/dbos-steps/), with no step records. The main pipeline workflow (episode-1-run-1) succeeds — the episode reaches ready; only the downstream Wikidata enrichments fail. The Django/uvicorn log shows two stack traces, both from Entity.objects.get(pk=entity_id) in episodes/enrichment.py:_fetch_entity.
Smoking gun
The resolve_step admin row shows that the resolver returned literal field-name strings ("status", "episode_id") as entity IDs, and decoding dbos.workflow_status.inputs (base64 pickle) confirms the same values at the queue level. So the strings were already what the resolver enqueued — this is a bug inside resolver.resolve_entities(), not in the workflow plumbing or in DBOS replay/deserialization.
Steps to reproduce
Fresh dev DB:
uv run python manage.py dbreset --yes
uv run python manage.py migrate
uv run python manage.py load_entity_types
Start an ASGI worker:
uv run uvicorn ragtime.asgi:application --host 127.0.0.1 --port 8000
From a separate terminal, submit any episode whose extracted text plausibly produces words like "status" or "episode" (the ARD URL we used reproduces it):
uv run python manage.py submit_episode "https://www.ardsounds.de/episode/urn:ard:episode:fdcf93eef8395b35/"
Wait for the pipeline to complete. Check /admin/episodes/episode/<id>/dbos-steps/ — two enrich_entity_wikidata workflows should appear with status ERROR.
Reproducible on main (the resolver/enrichment code paths are unchanged by recent PRs #129/#131).
Diagnostic queries
Run against the dev DB (with $RAGTIME_DB_USER set to your configured database user):
1. Verify EntityType keys are clean — should be the 14 canonical jazz types, no status / episode_id:
docker exec ragtime-postgres-1 psql -U "$RAGTIME_DB_USER" -d ragtime -c "
SELECT key, name, is_active FROM episodes_entitytype ORDER BY key;"
2. Verify chunk.entities_json top-level keys are clean — same expectation:
docker exec ragtime-postgres-1 psql -U "$RAGTIME_DB_USER" -d ragtime -c "
SELECT DISTINCT jsonb_object_keys(entities_json::jsonb) AS type_key
FROM episodes_chunk
WHERE episode_id = 1 AND entities_json IS NOT NULL
ORDER BY type_key;"
3. List entities actually persisted for the episode — does any have a name that matches the bad strings? Are PKs all integers as expected?
docker exec ragtime-postgres-1 psql -U "$RAGTIME_DB_USER" -d ragtime -c "
SELECT e.id, e.name, et.key AS type_key, e.wikidata_status, e.wikidata_attempts
FROM episodes_entity e
JOIN episodes_entitytype et ON et.id = e.entity_type_id
WHERE e.id IN (
  SELECT DISTINCT entity_id FROM episodes_entitymention WHERE episode_id = 1
)
ORDER BY e.id;"
4. Inspect the failed enrichment workflows' pickled inputs — confirms what was on the wire:
docker exec ragtime-postgres-1 psql -U "$RAGTIME_DB_USER" -d ragtime -c "
SELECT workflow_uuid, status, substring(inputs from 1 for 200) AS inputs_preview
FROM dbos.workflow_status
WHERE name LIKE '%enrich_entity_wikidata%'
ORDER BY created_at DESC LIMIT 5;"
The inputs column is base64-encoded pickle; decode it in Python to inspect the enqueued arguments.
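A minimal decoding sketch. It assumes DBOS's default serialization (pickle, then base64) for workflow_status.inputs; the round-trip value below is illustrative, not the real row:

```python
import base64
import pickle

def decode_inputs(raw: str):
    """Decode one dbos.workflow_status.inputs value.

    Assumes the default DBOS serializer: pickle wrapped in base64. If the
    deployment overrides the serializer, adjust accordingly.
    """
    return pickle.loads(base64.b64decode(raw))

# Round-trip demo with a payload shaped like the failing enqueue args:
raw = base64.b64encode(pickle.dumps(("status",))).decode("ascii")
print(decode_inputs(raw))  # ('status',)
```

Run it against the inputs_preview values from query 4 (the full column, not the 200-char preview) to see exactly what the resolver handed to the queue.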
5. Inspect the chunk JSON content for entity names (not just type keys) — the names that flow into unique_names in the resolver:
docker exec ragtime-postgres-1 psql -U "$RAGTIME_DB_USER" -d ragtime -c "
SELECT index, jsonb_pretty(entities_json::jsonb)
FROM episodes_chunk
WHERE episode_id = 1 AND entities_json IS NOT NULL
ORDER BY index LIMIT 1;"
Code-level analysis
The only code path that populates entities_to_enrich is in episodes/resolver.py:
# resolver.py:274
entities_to_enrich: set[int] = set()

# resolver.py:284 — the only .add() call
def _maybe_enqueue(entity: Entity) -> None:
    if (...):
        entities_to_enrich.add(entity.pk)
_maybe_enqueue is called from three places, each of which assigns entity from one of:
Entity.objects.get_or_create(...) (via _get_or_create_entity)
existing_by_id[matched_id]
existing_by_mbid[mbid]
All three should yield Entity instances with integer pk. The bug must therefore be one of:
Some other call site we haven't found that mutates entities_to_enrich (e.g. .update() with an iterable of strings).
An unexpected object passed to _maybe_enqueue that has .wikidata_id / .wikidata_status / .wikidata_attempts attributes (so the guard succeeds) but whose .pk returns a string. No model in the codebase obviously fits that shape.
An LLM resolution response leaking — e.g. match["matched_entity_id"] returns the string "status", which then ends up assigned to entity somehow. The resolution schema constrains it to ["integer", "null"], but a non-strict provider might let strings through.
The two strings (status, episode_id) match Django Episode-model and EntityMention-FK field names respectively. That co-occurrence suggests something is iterating over a Django model's field-name introspection (_meta.fields, __dict__, etc.) rather than over actual entity IDs — but I haven't located that path. Worth scanning for any code that hands entity._meta or similar to the resolver.
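A hypothetical illustration of how that shape of bug produces exactly these strings (this is not the actual resolver code, just the mechanism the hypothesis describes): set.update() on a mapping iterates its keys, so any call site that passes a field-name-keyed dict would inject field names as "ids".

```python
# Suspected leak shape: entities_to_enrich.update(<mapping>) iterates the
# mapping's KEYS, so Django field names arrive in the set as strings.
entities_to_enrich: set = {101, 102}
field_dump = {"status": "pending", "episode_id": 1}  # e.g. model fields as a dict
entities_to_enrich.update(field_dump)  # iterates keys, not values

print(sorted(entities_to_enrich, key=str))
# [101, 102, 'episode_id', 'status']
```

Grepping for .update( calls on entities_to_enrich (or any dict handed to it) is a cheap way to test this hypothesis.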
Impact
Per occurrence: two Entity rows whose enrichment is permanently stuck — wikidata_status='pending', wikidata_attempts=1 after the failed run. manage.py enrich_entities will retry them, but it'll re-fetch with the same string IDs and fail the same way (the workflow re-uses the persisted bad args via DBOS workflow recovery semantics).
Per episode: the main pipeline succeeds and the episode reaches ready, so the user-facing impact is missing Wikidata IDs on a couple of entities. Search-time hydration in vector_store.search_chunks() simply returns None for those entities' Q-IDs.
Acceptance criteria
Identify the call site that produces string values in entities_to_enrich. (Most likely a small targeted change once found.)
Add a unit test against resolver.resolve_entities() with a fixture that triggers the path — assert the returned list contains only integers (all(isinstance(x, int) for x in ids)).
Re-run the reproduction steps above; both enrich_entity_wikidata workflows succeed (or short-circuit cleanly per the resolver-level idempotency rules).
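The invariant the new unit test should assert can be sketched like this (function name is a placeholder; the real test would call resolver.resolve_entities() with a fixture):

```python
def all_int_pks(ids) -> bool:
    # The invariant to assert on resolve_entities()' returned id list.
    return all(isinstance(x, int) for x in ids)

assert all_int_pks([3, 7, 12])
assert not all_int_pks([3, "status", "episode_id"])  # the buggy shape from this issue
print("guard distinguishes clean and buggy outputs")
```

This would have caught the regression before the enrichment workflows ever saw the bad ids.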
Out of scope
Driver-level hardening of _fetch_entity to log + skip on non-int input. Defense-in-depth, but the right fix is to stop the bad data at the source.
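For completeness, the out-of-scope guard would look roughly like this. Everything here is a placeholder sketch: the function name, the injected fetch callable (standing in for Entity.objects.get), and the log message are assumptions, not the real episodes/enrichment.py API:

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("enrichment")

def fetch_entity_guarded(entity_id, fetch=lambda pk: {"pk": pk}):
    # Defense-in-depth only: log and skip non-int ids instead of letting
    # the ORM lookup raise inside the workflow step.
    if not isinstance(entity_id, int):
        logger.warning("enrich_entity_wikidata: non-int entity id %r, skipping", entity_id)
        return None
    return fetch(entity_id)  # would be Entity.objects.get(pk=entity_id)

print(fetch_entity_guarded("status"))  # None (plus a logged warning)
print(fetch_entity_guarded(42))        # {'pk': 42}
```

Even with this guard in place, the resolver-level fix is still required, since the guard would silently drop entities that should have been enriched.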