feat: DE/EU locale support with German data source connectors #3
Open
10fra wants to merge 2 commits into ShinMegamiBoson:main from
- Add --locale flag (us|de) to CLI and AgentConfig
- Create agent/connectors/ package with shared HTTP helper
- Add Lobbyregister Bundestag API connector
- Add abgeordnetenwatch.de API v2 connector
- Add OffeneRegister bulk JSONL search connector
- Add EU Transparency Register bulk CSV/XML connector
- Create agent/normalizers/ with German entity normalization (umlauts, legal forms, titles, courts) and composite entity resolver
- Wire 4 DE-locale tools into tool_defs, engine dispatch, and WorkspaceTools
- Add DE prompt localization (entity resolution + political context sections)
Mostly vibe coded, so be careful. I've tested and run it myself on a couple of tasks and it works, but I've done minimal code review. It should have no impact on the US locale. Just close this if it's not wanted; I'll keep working on my fork anyway.
Summary
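Adds a `--locale de` flag that switches the agent to German/EU data sources, entity resolution rules, and political context. US remains the default. No new dependencies.

Four free-tier connectors:
- Lobbyregister Bundestag API
- abgeordnetenwatch.de API v2
- OffeneRegister bulk JSONL search
- EU Transparency Register bulk CSV/XML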
German entity normalization layer handles umlaut folding (ä↔ae, ß↔ss), legal form extraction (GmbH, KGaA, e.V., etc.), title stripping (Dr., Prof., von/zu), and court alias resolution. Composite entity matcher scores register-number matches at 1.0, normalized-name+form+city at 0.85, officer-overlap+city at 0.6.
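
To make the normalization and scoring rules above concrete, here is a minimal sketch. It is not the code in `agent/normalizers/`: the names `fold_umlauts`, `split_legal_form`, `normalize_name`, `Entity` and `match_score` are hypothetical, the legal-form list is a small illustrative subset, and the exact conditions for each tier (for example, what counts as officer overlap) are assumptions. Only the three score values come from the description.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Illustrative subset of legal forms (assumption). Longer forms come first so
# "GmbH & Co. KG" is preferred over plain "GmbH" or "KG".
LEGAL_FORMS = ["GmbH & Co. KG", "GmbH", "KGaA", "e.V.", "AG", "KG", "UG"]
_FORM_RE = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(f) for f in LEGAL_FORMS) + r")(?!\w)"
)
_UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def fold_umlauts(text: str) -> str:
    """Lowercase and fold umlauts to their ASCII digraphs."""
    return text.lower().translate(_UMLAUTS)


def split_legal_form(name: str) -> tuple[str, Optional[str]]:
    """Return (name without legal form, legal form or None)."""
    match = _FORM_RE.search(name)
    if match is None:
        return " ".join(name.split()), None
    rest = name[: match.start()] + " " + name[match.end():]
    return " ".join(rest.split()), match.group(1)


def normalize_name(name: str) -> str:
    """Strip the legal form, fold umlauts, drop punctuation."""
    base, _form = split_legal_form(name)
    folded = fold_umlauts(base)
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", folded).split())


@dataclass
class Entity:
    name: str
    city: Optional[str] = None
    register_number: Optional[str] = None  # e.g. an HRB number
    officers: frozenset = frozenset()      # normalized officer names


def match_score(a: Entity, b: Entity) -> float:
    """Tiered composite score: 1.0, 0.85, 0.6, else 0.0."""
    # Tier 1: identical register number (simplified: ignores the court).
    if a.register_number and a.register_number == b.register_number:
        return 1.0
    same_city = bool(a.city) and fold_umlauts(a.city) == fold_umlauts(b.city or "")
    _, form_a = split_legal_form(a.name)
    _, form_b = split_legal_form(b.name)
    same_name = normalize_name(a.name) == normalize_name(b.name)
    # Tier 2: normalized name + legal form + city all agree.
    if same_name and form_a == form_b and same_city:
        return 0.85
    # Tier 3: at least one shared officer in the same city.
    if same_city and a.officers & b.officers:
        return 0.6
    return 0.0


print(normalize_name("Müller GmbH & Co. KG"))  # → mueller
```

With this sketch, `Entity("Müller GmbH", city="München")` and `Entity("Mueller GmbH", city="Muenchen")` land in the 0.85 tier, because both names and cities fold to the same ASCII form.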
System prompt gains two locale-gated sections: German entity resolution guidance (HRB numbering, umlaut matching, title stripping) and political/legal context (Bundestag structure, key terminology like Karenzzeit/Drehtür/Nebeneinkünfte, available vs restricted data sources).
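
Locale gating of the prompt could look roughly like the sketch below. The constant names `DE_ENTITY_RESOLUTION_SECTION` and `DE_CONTEXT_SECTION` are taken from this PR; `BASE_PROMPT`, `build_system_prompt` and all section text are placeholders assumed for illustration, not the actual contents of `agent/prompts.py`.

```python
# Placeholder texts (assumptions); the real sections are much longer.
BASE_PROMPT = "You are an investigative research agent."
DE_ENTITY_RESOLUTION_SECTION = "German entity resolution: (placeholder)"
DE_CONTEXT_SECTION = "German political/legal context: (placeholder)"


def build_system_prompt(locale: str = "us") -> str:
    """Assemble the system prompt, appending DE sections only for locale 'de'."""
    sections = [BASE_PROMPT]
    if locale == "de":
        sections += [DE_ENTITY_RESOLUTION_SECTION, DE_CONTEXT_SECTION]
    return "\n\n".join(sections)
```

Because the DE sections are only appended inside the `locale == "de"` branch, the default prompt stays identical to the base prompt, which matches the claim that the US locale is unaffected.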
Changes
- `agent/config.py`: `locale`, `lobbyregister_api_key` fields
- `agent/__main__.py`: `--locale` CLI arg
- `agent/connectors/`: new package: shared HTTP helper + 4 connectors
- `agent/normalizers/`: new package: `german.py` (name/court/legal form normalization), `entity_resolver.py` (composite matching)
- `agent/tool_defs.py`: 4 DE tool schemas, `locale` param on `get_tool_definitions()`
- `agent/engine.py`: dispatch branches for DE tools, locale threading
- `agent/tools.py`: connector wrappers + bulk data file finder
- `agent/prompts.py`: `DE_ENTITY_RESOLUTION_SECTION`, `DE_CONTEXT_SECTION`
- `agent/builder.py`: passes locale/api_key to WorkspaceTools
Test plan
- `python3 -m agent --locale de --task "Search the Lobbyregister for Deutsche Bank"` exercises full chain
- `--locale us` (default) exposes no DE tools
- `Müller GmbH & Co. KG` → `mueller`
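
The second test-plan item can be checked with something like the following sketch of the CLI flag and tool gating. The `--locale` and `--task` flags and the `locale` parameter on `get_tool_definitions()` come from this PR; the tool names, the shape of the tool dicts, and `build_parser` are hypothetical stand-ins, not the real schemas in `agent/tool_defs.py`.

```python
import argparse

# Hypothetical tool schemas, reduced to names for illustration only.
US_TOOLS = [{"name": "web_search"}]
DE_TOOLS = [
    {"name": "search_lobbyregister"},
    {"name": "search_abgeordnetenwatch"},
    {"name": "search_offeneregister"},
    {"name": "search_eu_transparency_register"},
]


def get_tool_definitions(locale: str = "us") -> list:
    """Return the base tools, plus the DE tools only when locale is 'de'."""
    tools = list(US_TOOLS)
    if locale == "de":
        tools += DE_TOOLS
    return tools


def build_parser() -> argparse.ArgumentParser:
    """CLI parser with a --locale flag restricted to us|de, defaulting to us."""
    parser = argparse.ArgumentParser(prog="agent")
    parser.add_argument("--locale", choices=["us", "de"], default="us")
    parser.add_argument("--task", required=True)
    return parser


args = build_parser().parse_args(["--task", "Search the Lobbyregister for Deutsche Bank"])
tool_names = {t["name"] for t in get_tool_definitions(args.locale)}
assert not tool_names & {t["name"] for t in DE_TOOLS}  # default locale: no DE tools
```

Using `choices` lets argparse reject unknown locales at parse time, so downstream code only ever sees `us` or `de`.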