security/oidc: route discovery and JWKS through a forward proxy #30268

Open
pgellert wants to merge 5 commits into redpanda-data:dev from pgellert:feat/http-proxy-2

pgellert (Contributor) commented Apr 23, 2026

Adds forward-proxy support for OIDC discovery and JWKS fetches so Redpanda brokers in corporate-proxy environments can reach external IdPs (Azure AD / Entra ID, Okta, Keycloak behind a proxy, etc.).

The transport layer (net::base_transport) learns HTTP CONNECT tunneling via an optional proxy_config on its configuration struct. A new cluster config, oidc_http_proxy_url, is the only opt-in caller in v1: OIDC discovery and JWKS fetches route through the proxy when it is set. Both http:// and https:// proxy URL schemes are supported. Origins must be https://: the CONNECT tunnel is only used for TLS origins, and proxying a plaintext origin would need absolute-form HTTP request rewriting per RFC 9112 §3.2.2, which is not implemented. Setting oidc_http_proxy_url together with a non-https oidc_discovery_url is rejected at cluster-config commit time by a cross-field validator in config::validate_oidc_http_proxy, with a runtime backstop in oidc_service::make_request for the bootstrap-config path, which bypasses the admin API.
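
As a rough illustration of the cross-field rule described above, the following standalone C++ sketch rejects a proxy URL combined with a non-https discovery URL. The struct and function names are hypothetical and are not taken from the Redpanda sources; treating an empty proxy string as "unset" and matching the scheme case-insensitively are both assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for the two cluster properties involved.
struct oidc_settings {
    std::optional<std::string> http_proxy_url; // unset = connect directly
    std::string discovery_url;
};

// True if `url` starts with "https://" (URI schemes are case-insensitive).
bool has_https_scheme(std::string_view url) {
    constexpr std::string_view prefix = "https://";
    if (url.size() < prefix.size()) {
        return false;
    }
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(url[i])) != prefix[i]) {
            return false;
        }
    }
    return true;
}

// Returns an error message if the combination is invalid, std::nullopt if OK.
std::optional<std::string> validate_proxy_vs_discovery(const oidc_settings& s) {
    // Assumption: an empty proxy string is treated the same as unset.
    if (!s.http_proxy_url.has_value() || s.http_proxy_url->empty()) {
        return std::nullopt;
    }
    if (has_https_scheme(s.discovery_url)) {
        return std::nullopt;
    }
    return std::string(
      "oidc_http_proxy_url requires oidc_discovery_url to use https://");
}
```

The same check would need to run against the post-PATCH view of both properties, so that either adding the proxy or reverting the discovery URL to http:// is caught.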

Review guidance

This is a good intro/recap of how forward proxies work: https://docs.mitmproxy.org/stable/concepts/how-mitmproxy-works/

Fixes https://redpandadata.atlassian.net/browse/CORE-16095

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v26.1.x
  • v25.3.x
  • v25.2.x

Release Notes

Features

  • New cluster config oidc_http_proxy_url routes OIDC discovery and JWKS fetches through an HTTP forward proxy. Set it to a URL of the form http://host:port or https://host:port to enable; leave it unset (the default) to connect to the OIDC endpoint directly. When it is set, oidc_discovery_url must use https://, because plaintext OIDC endpoints are not supported through a forward proxy. The property is live-reloadable (no broker restart required).

@pgellert pgellert requested a review from a team April 23, 2026 15:55
@pgellert pgellert self-assigned this Apr 23, 2026
@pgellert pgellert requested review from Copilot and nguyen-andrew and removed request for a team April 23, 2026 15:55
@pgellert pgellert requested review from a team as code owners April 23, 2026 15:55
@pgellert pgellert requested review from cjayani and removed request for a team April 23, 2026 15:55

Copilot AI left a comment

Pull request overview

Adds forward-proxy (HTTP CONNECT) support for OIDC discovery + JWKS retrieval so brokers can reach external IdPs from corporate-proxy environments, and introduces ducktape coverage to validate proxy routing + misconfiguration rejection.

Changes:

  • Add optional forward-proxy CONNECT tunneling support to net::base_transport and thread it into OIDC HTTP fetches.
  • Introduce new cluster config oidc_http_proxy plus commit-time cross-field validation against plaintext oidc_discovery_url.
  • Add ducktape tests + a MitmproxyService (and docker dependency) to validate HTTP/HTTPS proxy variants and rejection paths.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/rptest/tests/redpanda_oauth_proxy_test.py | New ducktape tests for OIDC discovery/JWKS via proxy + rejection behavior |
| tests/rptest/services/mitmproxy.py | New ducktape service wrapper to run mitmproxy as a CONNECT-only forward proxy |
| tests/docker/ducktape-deps/mitmproxy | Installs mitmproxy in the ducktape image for proxy-based tests |
| tests/docker/Dockerfile | Wires mitmproxy dependency install into ducktape image build |
| src/v/security/oidc_service.h | Adds http_proxy binding to OIDC service constructor |
| src/v/security/oidc_service.cc | Routes OIDC discovery/JWKS HTTP client through forward proxy and adds runtime backstop rejection |
| src/v/redpanda/admin/server.cc | Adds cross-field validator call for oidc_http_proxy on config patch |
| src/v/net/transport.h | Adds optional proxy configuration and proxy_connect_error for CONNECT failures |
| src/v/net/transport.cc | Implements proxy connect flow (connect to proxy, optional TLS, CONNECT handshake, then origin TLS) |
| src/v/config/validators.h | Declares validate_oidc_http_proxy |
| src/v/config/validators.cc | Implements commit-time rejection for proxy + plaintext discovery URL |
| src/v/config/configuration.h | Adds oidc_http_proxy property to configuration |
| src/v/config/configuration.cc | Defines oidc_http_proxy property and per-property parsing validator |
| src/v/cluster/controller.cc | Plumbs oidc_http_proxy binding into OIDC service wiring |

Comment thread src/v/config/configuration.cc
Comment on lines +491 to +496
    vlog(
      seclog.debug,
      "OIDC: routing request to {} via HTTP proxy {}",
      url,
      proxy_url_str);
}
Copilot AI Apr 23, 2026

proxy_url_str (sourced from oidc_http_proxy) is logged verbatim and also included in error details. If an operator provides a proxy URL with userinfo (e.g. http://user:pass@proxy:port), this would leak credentials into logs and error messages. Consider either rejecting URLs containing userinfo in validation, or redacting userinfo before logging/formatting error strings.
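
One way the redaction option could look is sketched below in standalone C++. The helper name redact_userinfo is hypothetical and is not part of this PR; it only masks the userinfo component before a URL is formatted into a log line or error string, and it assumes the URL has a "scheme://authority" shape.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: replaces "user:pass@" in a URL's authority with
// "***@" so the result can be logged without exposing credentials.
std::string redact_userinfo(std::string_view url) {
    // The authority starts right after "://"; without it, return unchanged.
    const auto scheme_end = url.find("://");
    if (scheme_end == std::string_view::npos) {
        return std::string(url);
    }
    const auto authority_begin = scheme_end + 3;
    // The authority ends at the first '/', '?' or '#', or at end of string.
    auto authority_end = url.find_first_of("/?#", authority_begin);
    if (authority_end == std::string_view::npos) {
        authority_end = url.size();
    }
    const auto authority = url.substr(
      authority_begin, authority_end - authority_begin);
    // Userinfo, if present, is everything before the last '@' in the authority.
    const auto at = authority.rfind('@');
    if (at == std::string_view::npos) {
        return std::string(url);
    }
    std::string out(url.substr(0, authority_begin));
    out += "***@";
    out += authority.substr(at + 1);
    out += url.substr(authority_end);
    return out;
}
```

Rejecting such URLs during validation is the stricter alternative; redaction only protects the log and error paths that remember to call the helper.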

Comment thread src/v/net/transport.cc Outdated
Comment thread tests/rptest/services/mitmproxy.py Outdated
Comment thread tests/docker/ducktape-deps/mitmproxy Outdated
@pgellert pgellert force-pushed the feat/http-proxy-2 branch from 50ff440 to 6f25e75 Compare April 23, 2026 16:39
vbotbuildovich (Collaborator) commented Apr 23, 2026

Retry command for Build#83584

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/cluster_config_test.py::ClusterConfigTest.test_valid_settings

vbotbuildovich (Collaborator) commented Apr 23, 2026

CI test results

test results on build#83584

| test_status | test_class | test_method | test_arguments | test_kind | job_url | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FLAKY(PASS) | AlterConfigMixedNodeTest | test_alter_config_shadow_indexing_mixed_node | {"incremental_update": false} | integration | https://buildkite.com/redpanda/redpanda/builds/83584#019dbb4f-787e-4271-b839-7a214179a0ea | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=AlterConfigMixedNodeTest&test_method=test_alter_config_shadow_indexing_mixed_node |
| FAIL | ClusterConfigTest | test_valid_settings | null | integration | https://buildkite.com/redpanda/redpanda/builds/83584#019dbb4f-787d-4112-a3f7-270eec0f9222 | 0/11 | Test FAILS after retries. Significant increase in flaky rate(baseline=0.0000, p0=0.0000, reject_threshold=0.0100) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ClusterConfigTest&test_method=test_valid_settings |
| FAIL | ClusterConfigTest | test_valid_settings | null | integration | https://buildkite.com/redpanda/redpanda/builds/83584#019dbb53-900e-4c73-9162-252c4be5a75c | 0/11 | Test FAILS after retries. Significant increase in flaky rate(baseline=0.0000, p0=0.0000, reject_threshold=0.0100) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ClusterConfigTest&test_method=test_valid_settings |
| FLAKY(PASS) | WriteCachingFailureInjectionE2ETest | test_crash_all | {"use_transactions": false} | integration | https://buildkite.com/redpanda/redpanda/builds/83584#019dbb53-900f-48f0-b37b-954c86b671cf | 9/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0794, p0=0.5626, reject_threshold=0.0100. adj_baseline=0.2197, p1=0.3193, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all |

test results on build#83630

| test_status | test_class | test_method | test_arguments | test_kind | job_url | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FLAKY(PASS) | ShadowLinkingReplicationTests | test_with_restart | {"storage_mode": "local"} | integration | https://buildkite.com/redpanda/redpanda/builds/83630#019dbf80-e1e7-415e-a237-74b8e2df4392 | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0149, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingReplicationTests&test_method=test_with_restart |

test results on build#83746

| test_status | test_class | test_method | test_arguments | test_kind | job_url | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FLAKY(PASS) | ShadowLinkingReplicationTests | test_auto_prefix_trimming | {"source_cluster_spec": {"cluster_type": "redpanda"}, "storage_mode": "cloud", "with_failures": true} | integration | https://buildkite.com/redpanda/redpanda/builds/83746#019dd387-9596-4326-8791-9f852f986415 | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0008, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingReplicationTests&test_method=test_auto_prefix_trimming |
| FLAKY(PASS) | ShadowLinkingReplicationTests | test_replication_with_failures | {"storage_mode": "tiered_cloud"} | integration | https://buildkite.com/redpanda/redpanda/builds/83746#019dd38b-8f7c-49b6-b8b9-61b257ed3fc6 | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0014, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingReplicationTests&test_method=test_replication_with_failures |

pgellert (Contributor, Author) commented

force-push: address CI failure (add a _url suffix to the new config) and address copilot code review feedback

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.

Comment thread src/v/net/transport.cc
Comment thread src/v/security/config_rcl.cc Outdated
Comment thread tests/rptest/tests/redpanda_oauth_proxy_test.py
Comment thread tests/rptest/tests/redpanda_oauth_proxy_test.py
Comment thread src/v/config/configuration.cc

Extends base_transport::configuration with an optional proxy_config
sub-struct and a matching proxy_connect_error exception type. When
configuration::proxy is set, do_connect now:

  1. TCP-connects to the proxy (not the origin).
  2. Optionally TLS-wraps the proxy socket for https:// proxies, with
     SNI derived from the proxy address. Skipped for http:// proxies.
  3. Sends an HTTP CONNECT request for the origin; requires a 200
     response. Throws proxy_connect_error on failure, naming the
     proxy and origin for operator-visible diagnostics.
  4. TLS-wraps the tunnelled socket to the origin as today (gated on
     configuration::credentials), with SNI set to the origin hostname.

The ordering is non-commutative and matches RFC 9110 §9.3.6:
  - Plaintext proxy, HTTPS origin: TCP → CONNECT → TLS(origin)
  - TLS proxy, HTTPS origin: TCP → TLS(proxy) → CONNECT → TLS(origin)
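
The ordering and the CONNECT exchange above can be sketched in standalone C++ as follows. All names here are hypothetical and do not mirror the actual net::base_transport code; the sketch only shows the step order, the authority-form request target, and the "200 only" acceptance rule this commit describes. It assumes a hostname or IPv4 origin (an IPv6 literal would need brackets).

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical step labels for the connect sequence.
enum class step { tcp_to_proxy, tls_to_proxy, http_connect, tls_to_origin };

// Ordered handshake steps for an https:// origin reached through a proxy.
std::vector<step> handshake_steps(bool proxy_is_https) {
    std::vector<step> steps{step::tcp_to_proxy};
    if (proxy_is_https) {
        steps.push_back(step::tls_to_proxy); // skipped for http:// proxies
    }
    steps.push_back(step::http_connect);
    steps.push_back(step::tls_to_origin); // runs inside the tunnel
    return steps;
}

// Builds a CONNECT request; the target is authority-form ("host:port").
std::string make_connect_request(
  std::string_view origin_host, std::uint16_t origin_port) {
    std::string authority(origin_host);
    authority += ':';
    authority += std::to_string(origin_port);
    return "CONNECT " + authority + " HTTP/1.1\r\nHost: " + authority
           + "\r\n\r\n";
}

// True only if the status line is "HTTP/1.x 200" optionally followed by a
// reason phrase. Other codes, including other 2xx, are treated as failure.
bool is_tunnel_established(std::string_view status_line) {
    if (status_line.substr(0, 7) != "HTTP/1.") {
        return false;
    }
    const auto sp = status_line.find(' ');
    if (sp == std::string_view::npos) {
        return false;
    }
    if (status_line.substr(sp + 1, 3) != "200") {
        return false;
    }
    const auto after = sp + 4;
    return after >= status_line.size() || status_line[after] == ' ';
}
```

For example, make_connect_request("idp.example.com", 443) yields a request whose first line is "CONNECT idp.example.com:443 HTTP/1.1"; the hostname is illustrative only.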

The CONNECT response parser bounds both per-line length and total
response-header bytes so a misbehaving proxy cannot force unbounded
allocation on this control-plane path.
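
A minimal sketch of such a bounded scan, in standalone C++, is shown here. The limit values, type names, and the function scan_response_headers are assumptions made for illustration; the real parser in transport.cc may be structured differently. The idea is that the caller re-runs the scan as bytes arrive and aborts as soon as either bound is exceeded.

```cpp
#include <cstddef>
#include <string_view>

// Assumed limits; the actual values used by the PR are not stated here.
struct header_limits {
    std::size_t max_line_bytes = 8192;
    std::size_t max_total_bytes = 65536;
};

enum class scan_result { complete, need_more, line_too_long, headers_too_large };

// Scans the bytes received so far for the blank line that ends the response
// headers. On `complete`, `header_end` is the offset just past that blank line.
scan_result scan_response_headers(
  std::string_view buf, const header_limits& lim, std::size_t& header_end) {
    std::size_t pos = 0;
    while (pos < buf.size()) {
        const auto eol = buf.find("\r\n", pos);
        const bool terminated = eol != std::string_view::npos;
        const auto line_len = (terminated ? eol : buf.size()) - pos;
        if (line_len > lim.max_line_bytes) {
            return scan_result::line_too_long;
        }
        if (!terminated) {
            break; // partial last line, wait for more bytes
        }
        const auto next = eol + 2;
        if (next > lim.max_total_bytes) {
            return scan_result::headers_too_large;
        }
        if (line_len == 0) {
            header_end = next; // empty line terminates the header section
            return scan_result::complete;
        }
        pos = next;
    }
    // Still incomplete: refuse to buffer beyond the total bound.
    if (buf.size() >= lim.max_total_bytes) {
        return scan_result::headers_too_large;
    }
    return scan_result::need_more;
}
```

Because both checks run before any further read is requested, the buffer a caller needs is capped at roughly max_total_bytes regardless of what the proxy sends.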

The v1 caller is the OIDC service; other callers (RPC, cloud storage,
metrics reporter, etc.) remain unaware of the field.

parse_url silently drops userinfo (user:pass@) when extracting
host/port, so embedded credentials never reach the wire. Rejecting
such URLs at the parser covers every OIDC URL caller, including
oidc_discovery_url, which previously accepted (and silently stripped)
such URLs.

New cluster config consumed in the next commit by the OIDC service
to route discovery and JWKS fetches through an HTTP forward proxy.
The property is live-reloadable (needs_restart::no) and user-visible.

Commit-time validation:

 - per-property: rejects values that do not parse as valid absolute
   URLs.

 - cross-field: validate_oidc_http_proxy_url rejects a PATCH that
   would leave oidc_http_proxy_url set alongside a non-https
   oidc_discovery_url. Follows the existing
   config_multi_property_validation pattern used for cloud storage,
   iceberg REST catalog, partition balancer, and default storage mode.

When oidc_http_proxy_url is set, the OIDC service parses the URL and
attaches it to the base_transport configuration for each discovery /
JWKS fetch. Both http:// and https:// proxy URL schemes are
supported; an https:// proxy triggers a TLS handshake with the proxy
using the OIDC service's system-trust credentials before CONNECT is
sent, with the origin TLS handshake tunnelled inside.

The combination of oidc_http_proxy_url + plaintext oidc_discovery_url
is rejected at request time with a named operator-facing error
(CONNECT tunneling cannot be used for plaintext origins;
RFC 9112 §3.2.2). The commit-time cross-field validator in
config/validators.cc is the primary gate; the runtime rejection here
is defense-in-depth for the bootstrap-config path.

Live-reloadable: changing oidc_http_proxy_url triggers a metadata
refresh without a broker restart.

Five ducktape tests covering the behavioural matrix:

 - OIDCViaProxyTest: http:// proxy + https:// origin, OAUTHBEARER
   end-to-end via mitmproxy. iptables DROP on direct Keycloak egress
   enforces the proxy path.
 - OIDCViaHttpsProxyTest: https:// proxy + https:// origin (nested
   TLS). mitmproxy's regular-mode listener autodetects TLS via
   --certs; the cert is signed by the existing TLSCertManager CA.
 - OIDCProxyRejectsPlaintextOriginTest: runtime rejection via the
   bootstrap-config path.
 - OIDCProxyRejectedAtConfigCommitTest: cross-field validator via
   admin PATCH, both directions (adding proxy while discovery is
   http, and reverting discovery to http while proxy is set).

The two new test files are added to the pyright "standard" list in
type-check-strictness.json, matching the strictness of the sibling
redpanda_oauth_test.py / keycloak.py they inherit from.

@pgellert pgellert force-pushed the feat/http-proxy-2 branch from d818bab to 1a7dc20 Compare April 28, 2026 09:33
pgellert (Contributor, Author) commented

force-push: address copilot code review
