
fix: OSD Global-tenant import + dropped report files with glob metacharacters; validate dev stack on OpenSearch 3.x with PostgreSQL #781

Merged
seanthegeek merged 7 commits into master from fix-osd-bootstrap-global-tenant
May 21, 2026

@seanthegeek seanthegeek commented May 21, 2026

Started as the OSD Global-tenant import bug; auditing the rest of the dev stack surfaced real library bugs, a shipped-dashboard mapping conflict, and several dev-stack gaps. Everything was validated end-to-end against a freshly rebuilt stack (OpenSearch 3.6.0 + OSD 3.5).

Bug fixes (released behavior)

  1. OSD saved objects imported into the wrong tenant. dashboard-dev-bootstrap.sh sent securitytenant: global_tenant. The security plugin reads that as a tenant name, and global_tenant is a sample custom tenant in the demo config — not the Global tenant (token global). The import landed in a phantom tenant; Global looked empty. Fixed to global. (Shipped 9.10.3–9.11.2.)

  2. Report files with glob metacharacters in their names were silently dropped. The CLI ran every file argument through glob(), which treats [, ], *, ? as patterns. A literal [Netease DMARC Failure Report] Rent Reminder.eml matched nothing and never reached the parser. Files that exist on disk are now used literally; only non-existent paths are globbed. (This is why "4 failure samples → only 3 in the dashboard".)

  3. OSD mapping conflict on the aggregate org_email field. The shipped opensearch_dashboards.ndjson froze a cached field-list where org_email was a text/object conflict (plus stale org_email.#text* subfields) — from a cluster that indexed a langAttrString email dict before the parser unwrapped it. org_email is Text() and the parser now unwraps dict emails, so live data is clean; cleared the frozen conflict + artifacts. (Shipped in 9.11.2.)

Bug fixes 1–2 have regression tests in tests/test_cli.py.

Completing the (unreleased) PostgreSQL feature

  • Added postgresql to _KNOWN_SECTIONS so PARSEDMARC_POSTGRESQL_* env vars and their _FILE Docker-secret variants resolve like every other backend (was silently ignored). New in 10.0.0, so documented under the PostgreSQL enhancement, not Bug fixes. Has a test.

Dev-stack audit follow-ups (bootstrap + compose)

  • OpenSearch bumped from 2.x to 3.x (running OSD 3.x against OpenSearch 2.x is an unsupported cross-major pairing); validated on 3.6.0.
  • PostgreSQL added to the dev stack: service + bootstrap wiring (wait, seed via PARSEDMARC_POSTGRESQL_*, RESEED wipe, Grafana grafana-postgresql-datasource, PG dashboard import). Seeding is gated on psycopg being importable — parsedmarc exit(1)s during client init if a configured backend can't load, which would otherwise zero every backend.
  • Grafana password env fix: container got GRAFANA_PASSWORD, which Grafana ignores; it reads GF_SECURITY_ADMIN_PASSWORD.

Docs

  • PostgreSQL ships a premade Grafana dashboard, so it now sits on the "premade dashboards" features bullet (README + docs/source/index.md).

Validation (rebuilt stack)

| Check | Result |
| --- | --- |
| OSD import → global tenant (OS 3.6.0) | 26 objects / 3 dashboards |
| Aggregate index pattern org_email | text, no conflict |
| Failure docs indexed (dmarc_f*) | 4 (Netease/cardinal.com included) |
| PostgreSQL rows | aggregate 11, failure 3, smtp_tls 4 |
| Grafana → PostgreSQL datasource health | OK – Database Connection OK |
| Full test suite | 628 passed |

🤖 Generated with Claude Code

dashboard-dev-bootstrap.sh sent `securitytenant: global_tenant`. The
OpenSearch security plugin reads that header as a tenant *name*, and
`global_tenant` is a sample custom tenant from the security demo config
-- not the shared Global tenant, whose token is the literal `global`.
The import therefore landed in a separate `global_tenant` tenant (its
own `.kibana_<hash>_globaltenant_1` index) and the dashboards were
invisible to anyone viewing the Global tenant in OpenSearch Dashboards.

Verified against the live dev cluster: `_find` under `securitytenant:
global` returned 26 objects and `.kibana_1` (the Global tenant index the
UI reads) went from 2 to 67 docs after re-importing with the fix. An
empty/omitted header read 0 from Global -- it falls back to the user's
configured default tenant -- so `global` is the only reliable token.
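
For illustration only, a minimal Python sketch of the request shape described above. The bootstrap itself is a shell script, so this is not its code. The base URL, the `export.ndjson` file name, and the `overwrite=true` query flag are assumptions, and authentication is left out. The request is built but never sent.

```python
import urllib.request
import uuid

# Literal token for the shared Global tenant, as described above.
GLOBAL_TENANT = "global"


def build_import_request(base_url, ndjson, tenant=GLOBAL_TENANT):
    """Build (but do not send) a saved-objects import request for one tenant."""
    boundary = uuid.uuid4().hex
    # The import endpoint takes the ndjson as a multipart "file" field.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="file"; filename="export.ndjson"\r\n',
        b"Content-Type: application/x-ndjson\r\n\r\n",
        ndjson,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/saved_objects/_import?overwrite=true",
        data=body,
        method="POST",
        headers={
            "osd-xsrf": "true",
            # A tenant *name*: "global" selects Global, "global_tenant" does not.
            "securitytenant": tenant,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Hypothetical dev URL; urllib stores header names capitalized.
req = build_import_request("https://localhost:5601", b'{"type":"dashboard"}\n')
print(req.get_header("Securitytenant"))
```

Defaulting `tenant` to the literal `global` keeps the header explicit, which matters because an omitted header falls back to the user's default tenant.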

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.58%. Comparing base (411f5a8) to head (4b5fc72).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #781      +/-   ##
==========================================
+ Coverage   85.54%   85.58%   +0.03%     
==========================================
  Files          17       17              
  Lines        4628     4633       +5     
==========================================
+ Hits         3959     3965       +6     
+ Misses        669      668       -1     


seanthegeek and others added 3 commits May 21, 2026 14:50
The CLI expanded every file argument with glob(), which treats [, ], *,
and ? as pattern syntax. A literal path like
"[Netease DMARC Failure Report] Rent Reminder.eml" -- the bracketed shape
many providers use for emailed failure reports -- was read as a character
class, matched nothing, and was dropped before reaching the parser, with
no error. File arguments that exist on disk are now taken literally; only
non-existent paths are globbed, so shell-style wildcards still expand.
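
A minimal sketch of the "literal if it exists, otherwise glob" rule, assuming a helper named `expand_file_args` (a hypothetical name, not necessarily what the CLI uses):

```python
import glob
import os


def expand_file_args(args):
    """Return the file paths to parse, keeping existing paths literal."""
    paths = []
    for arg in args:
        if os.path.exists(arg):
            # An existing path is used as-is, even if it contains [, ], * or ?.
            paths.append(arg)
        else:
            # Only non-existent arguments are treated as glob patterns.
            paths.extend(sorted(glob.glob(arg)))
    return paths
```

With this shape, `"[Netease DMARC Failure Report] Rent Reminder.eml"` is returned unchanged when the file exists, while a pattern such as `samples/failure/*.eml` still expands. Escaping every argument with `glob.escape()` would be another option, but it would disable wildcard expansion for callers who quote their patterns.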

Also adds "postgresql" to _KNOWN_SECTIONS so PARSEDMARC_POSTGRESQL_* env
vars (and their _FILE Docker-secret variants) resolve like every other
backend -- the PostgreSQL backend is new in 10.0.0, so this completes the
unreleased feature rather than fixing a released regression, and is
documented under the PostgreSQL enhancement, not Bug fixes.
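
To show why an unlisted section is silently ignored, here is a rough sketch of section-gated env resolution. It is not parsedmarc's actual loader: `KNOWN_SECTIONS` is a stand-in for `_KNOWN_SECTIONS`, `resolve_env` is a hypothetical helper, and the key names in the usage note (`host`, `password`) are made up for the example.

```python
# Stand-in for _KNOWN_SECTIONS; only "postgresql" being added is confirmed above.
KNOWN_SECTIONS = ("elasticsearch", "opensearch", "postgresql")
PREFIX = "PARSEDMARC_"


def resolve_env(environ, known_sections):
    """Map PARSEDMARC_<SECTION>_<KEY>[_FILE] variables to {section: {key: value}}."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        rest = name[len(PREFIX):].lower()
        for section in known_sections:
            if not rest.startswith(section + "_"):
                continue
            key = rest[len(section) + 1:]
            if key.endswith("_file"):
                # _FILE variant: the value is a path whose contents hold the setting.
                key = key[: -len("_file")]
                with open(value, encoding="utf-8") as handle:
                    value = handle.read().strip()
            config.setdefault(section, {})[key] = value
            break
        # A variable whose section is not listed matches nothing and is dropped.
    return config
```

Under these assumptions, `PARSEDMARC_POSTGRESQL_HOST=db` resolves to `{"postgresql": {"host": "db"}}` only when `"postgresql"` is in the section list. Without it, the variable is dropped and no error is raised.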

Regression tests added for both. Verified end-to-end: all four
samples/failure/*.eml now index (the bracketed Netease report included).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dev stack

The dev stack ran OpenSearch Dashboards 3.x against OpenSearch 2.x, an
unsupported cross-major pairing. Bump opensearch to :3 (validated on
3.6.0: OSD import into the Global tenant and all dashboards work).

Add a postgresql service plus bootstrap wiring so the new PostgreSQL
backend is exercised alongside the others: wait for PG, seed it via
PARSEDMARC_POSTGRESQL_* env vars on the same parsedmarc run, wipe it on
RESEED, create a Grafana grafana-postgresql-datasource (uid dmarc-pg),
and import dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json.

PG seeding is gated on psycopg being importable: parsedmarc aborts the
whole run (exit 1, nothing written to any backend) when a configured
output backend can't initialize, so wiring in PG without the optional
extra would silently zero ES/OS/Splunk too. When psycopg is absent the
script warns and skips PG, leaving the other backends seeded.
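
The gate can be sketched in Python as follows. The real check lives in the shell bootstrap, so `usable_backends` and the backend-to-module mapping are hypothetical and only show the idea of probing importability before wiring a backend in.

```python
import importlib.util


def usable_backends(requested, required_modules):
    """Split requested backends into (usable, skipped) by driver importability."""
    usable, skipped = [], []
    for backend in requested:
        module = required_modules.get(backend)
        # find_spec returns None for a missing top-level module without importing it.
        if module is None or importlib.util.find_spec(module) is not None:
            usable.append(backend)
        else:
            skipped.append(backend)
    return usable, skipped


# Hypothetical mapping: only the psycopg requirement comes from the text above.
usable, skipped = usable_backends(
    ["opensearch", "postgresql"], {"postgresql": "psycopg"}
)
for backend in skipped:
    print(f"warning: skipping {backend}: required module is not importable")
```

Skipping the one backend up front avoids the abort-everything behavior described above, since the run is never configured with a backend that cannot initialize.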

Also fix the Grafana admin password env: the container was given
GRAFANA_PASSWORD, which Grafana ignores -- it reads
GF_SECURITY_ADMIN_PASSWORD. Defaults to admin to match the script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PostgreSQL ships a premade Grafana dashboard
(dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json), so it belongs
on the "for use with premade dashboards" bullet alongside Elasticsearch,
OpenSearch, and Splunk rather than on the plain-output-destinations line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@seanthegeek seanthegeek changed the title from "fix: import OpenSearch dashboards into the real Global tenant" to "fix: OSD Global-tenant import + dropped report files with glob metacharacters; validate dev stack on OpenSearch 3.x with PostgreSQL" May 21, 2026
seanthegeek and others added 2 commits May 21, 2026 15:00
The aggregate index pattern in dashboards/opensearch/opensearch_dashboards.ndjson
shipped a cached field-list snapshot where org_email was a text/object
conflict, plus leftover org_email.#text and org_email.#text.keyword
subfields. Those came from a cluster that had indexed a langAttrString
email dict ({"#text": ..., "@lang": ...}) before the parser unwrapped it.

org_email is mapped as Text() and parse_aggregate_report_xml now unwraps a
dict email to a plain string, so current data is consistently text -- a
clean cluster's _field_caps reports no conflict. Cleared the frozen
conflict and the two artifact subfields, leaving org_email (text) and
org_email.keyword, matching the live mapping.

Verified: re-importing the corrected ndjson yields an index pattern with
org_email as a plain text field and zero conflicts; only the aggregate
index-pattern line changed, all other saved objects byte-identical.
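
As a rough sketch of the unwrapping step (not the actual code in parse_aggregate_report_xml; `unwrap_text` is a hypothetical helper), a langAttrString dict can be reduced to its text like this:

```python
def unwrap_text(value):
    """Return a plain string for either a str or a {"#text": ..., "@lang": ...} dict."""
    if isinstance(value, dict):
        # Keep only the text payload; the language attribute is discarded.
        value = value.get("#text")
    return value if value is None else str(value)
```

Because every document then carries org_email as a string, the field is never indexed as an object, which is what produced the text/object conflict in the cached field list.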

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samples/aggregate/rfc9990-sample.xml and rfc9990-example.net!...xml were
not in the bootstrap's SAMPLE_FILES, so the dev stack only ever indexed
RFC 7489 reports and the new DMARCbis fields (np, testing,
discovery_method, generator, xml_namespace) never appeared in the
OpenSearch/Kibana indices and were never available to the dashboards.

Added both samples (one declares the urn:ietf:params:xml:ns:dmarc-2.0
namespace, the other is namespaceless RFC 9990-shaped, covering both
detection paths). Verified the seeded data now carries np/testing/
discovery_method/generator and xml_namespace=urn:ietf:params:xml:ns:dmarc-2.0;
OpenSearch Dashboards surfaces them on an index-pattern field-list refresh.
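
The two detection paths could look roughly like the sketch below. This is an assumption about the approach, not the parser's actual logic. The marker element names simply mirror the field names listed above, and the returned labels are hypothetical.

```python
import xml.etree.ElementTree as ET

DMARC_2_NS = "urn:ietf:params:xml:ns:dmarc-2.0"
# Assumption: element names mirror the field names listed above.
RFC9990_MARKERS = {"np", "testing", "discovery_method", "generator"}


def local_name(tag):
    """Strip an ElementTree "{namespace}" prefix from a tag, if present."""
    return tag.rsplit("}", 1)[-1]


def detect_report_shape(xml_text):
    """Return (shape, namespace) using the namespace first, then the field shape."""
    root = ET.fromstring(xml_text)
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else None
    if namespace == DMARC_2_NS:
        return "rfc9990", namespace  # path 1: declared namespace
    if any(local_name(el.tag) in RFC9990_MARKERS for el in root.iter()):
        return "rfc9990", None  # path 2: namespaceless but RFC 9990-shaped
    return "rfc7489", namespace
```

Seeding one sample of each kind exercises both branches, which is the reason both files were added to SAMPLE_FILES.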

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The seed previously required parsedmarc to be pre-installed and only
warned-and-skipped PostgreSQL when psycopg was missing. Resolve the seed
environment by precedence instead:

  1. explicit PARSEDMARC_BIN  -> used as-is, nothing installed
  2. active $VIRTUAL_ENV
  3. existing repo venv/ or .venv/
  4. otherwise create $REPO_ROOT/venv

For cases 2-4, run `pip install -e .[postgresql]` only when the CLI or
psycopg is missing, so the dev stack can populate Postgres out of the box
without a manual install step. The explicit-PARSEDMARC_BIN path is left
untouched (and the psycopg seed guard still warns/skips if that env lacks
the extra).
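
The precedence can be sketched in Python as below. The bootstrap implements it in shell, so `resolve_seed_env` and its return labels are hypothetical. The third tuple element says whether the script may run the editable install in that environment.

```python
import os


def resolve_seed_env(environ, repo_root):
    """Return (source, path, may_install) following the precedence listed above."""
    explicit = environ.get("PARSEDMARC_BIN")
    if explicit:
        return "explicit", explicit, False  # case 1: used as-is, nothing installed
    active = environ.get("VIRTUAL_ENV")
    if active:
        return "active-venv", active, True  # case 2
    for name in ("venv", ".venv"):
        candidate = os.path.join(repo_root, name)
        if os.path.isdir(candidate):
            return "repo-venv", candidate, True  # case 3
    # Case 4: nothing found, so a fresh venv would be created here.
    return "new-venv", os.path.join(repo_root, "venv"), True
```

Keeping the explicit-binary case install-free means a caller who points PARSEDMARC_BIN at their own build never has its environment modified.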

Verified: a RESEED run resolves the active venv, seeds ES/OS/Splunk/PG
including the RFC 9990 fields, with no output-client errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@seanthegeek seanthegeek merged commit 180fc58 into master May 21, 2026
12 checks passed
@seanthegeek seanthegeek deleted the fix-osd-bootstrap-global-tenant branch May 21, 2026 19:43