WGCZ extractors: negative test for non-http(s) HLS URL rejection (xnxx + xvideos) #262

@crippledgeek

Scope

Surfaced by the security-reviewer subagent during pre-push review of PR #261 (xvideos HLS fragments migration) and applies equally to PR #260 (xnxx). Severity: LOW (test-coverage gap, not a runtime defect).

Background

WgczNetworkBase::extract_format_urls applies require_http_scheme to every captured URL, so a script-injected javascript: or file:// URI in setVideoHLS(...) is filtered out before reaching Format construction. When the HLS URL is rejected, fmt_urls.hls is None and the HLS row is never emitted — build_info (xvideos) / extract (xnxx) returns RdlpError::Extraction { "no formats found" } before expand_hls_in_place is ever called.
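The ordering described above can be sketched as follows. This is a minimal illustration, not the crate's actual code: `SketchError`, `build_formats`, and the `expand_hls` callback are hypothetical names, the body of `require_http_scheme` is an assumed implementation (only its name comes from this issue), and the flow is reduced to the HLS URL alone.

```rust
/// Illustrative error type; the real `RdlpError::Extraction` may be shaped differently.
#[derive(Debug, PartialEq)]
enum SketchError {
    Extraction(String),
}

/// Assumed behaviour of `require_http_scheme`: keep a URL only if it uses
/// the http or https scheme (compared ASCII case-insensitively).
fn require_http_scheme(url: &str) -> Option<String> {
    let trimmed = url.trim();
    let lower = trimmed.to_ascii_lowercase();
    if lower.starts_with("http://") || lower.starts_with("https://") {
        Some(trimmed.to_owned())
    } else {
        None
    }
}

/// Simplified flow: filter the captured HLS URL, return an extraction error
/// if nothing survives, and only then hand the URL to the expansion step.
/// `expand_hls` is a hypothetical stand-in for `expand_hls_in_place`.
fn build_formats(
    captured_hls: Option<&str>,
    expand_hls: impl FnOnce(&str) -> Vec<String>,
) -> Result<Vec<String>, SketchError> {
    let hls = captured_hls.and_then(require_http_scheme);
    match hls {
        // Only a URL that passed the scheme check reaches expansion.
        Some(url) => Ok(expand_hls(&url)),
        // A rejected URL behaves exactly like a missing one.
        None => Err(SketchError::Extraction("no formats found".to_owned())),
    }
}
```

Passing the expansion step as a callback is purely a device for this sketch: it lets a test count invocations and confirm the step is skipped when the URL is rejected.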

The behaviour is safe by construction, but neither xnxx nor xvideos has a test that locks down the contract at the wiring layer. If someone refactored require_http_scheme away or changed WgczFormatUrls to surface non-http(s) URLs, the regression would surface only in a security audit, not in CI.

Acceptance criteria

  • Add a wiring-layer test in crates/rdlp-extractor/src/extractors/xnxx/mod.rs asserting that an HTML fixture whose setVideoHLS(...) contains a file:///etc/passwd (or javascript:alert(1), or any non-http(s)) URL produces Err(RdlpError::Extraction { ... }) from extract, and that expand_hls_in_place is never reached.
  • Same test in crates/rdlp-extractor/src/extractors/xvideos/mod.rs against build_info.
  • Cheaper alternative: a unit test of WgczNetworkBase::extract_format_urls itself confirming require_http_scheme rejects file:// / javascript: schemes for setVideoHLS / setVideoUrlLow / setVideoUrlHigh — covers both extractors with one fixture.
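A rough sketch of what the cheaper alternative might look like, using only the standard library. Everything here is a stand-in: `FormatUrls` approximates `WgczFormatUrls` (the field names `low` and `high` are guesses), `capture_call_arg` assumes the page passes each URL as a single-quoted argument, and `is_http_url` is an assumed equivalent of `require_http_scheme`. The real test would call `WgczNetworkBase::extract_format_urls` directly against a fixture.

```rust
/// Illustrative stand-in for `WgczFormatUrls`; real field names may differ.
#[derive(Debug, Default, PartialEq)]
struct FormatUrls {
    hls: Option<String>,
    low: Option<String>,
    high: Option<String>,
}

/// Assumed equivalent of the `require_http_scheme` check.
fn is_http_url(url: &str) -> bool {
    let lower = url.trim().to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}

/// Captures the single-quoted argument of `name('...')` from the page source.
fn capture_call_arg<'a>(html: &'a str, name: &str) -> Option<&'a str> {
    let needle = format!("{name}('");
    let start = html.find(&needle)? + needle.len();
    let end = html[start..].find('\'')? + start;
    Some(&html[start..end])
}

/// Stand-in for `extract_format_urls`: capture each URL, then drop any
/// that is not http(s).
fn extract_format_urls(html: &str) -> FormatUrls {
    let pick = |name: &str| {
        capture_call_arg(html, name)
            .filter(|url| is_http_url(url))
            .map(str::to_owned)
    };
    FormatUrls {
        hls: pick("setVideoHLS"),
        low: pick("setVideoUrlLow"),
        high: pick("setVideoUrlHigh"),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn non_http_schemes_are_rejected() {
        for bad in ["file:///etc/passwd", "javascript:alert(1)"] {
            // One fixture exercises all three setters with the same bad URL.
            let html = format!(
                "setVideoHLS('{bad}'); setVideoUrlLow('{bad}'); setVideoUrlHigh('{bad}');"
            );
            assert_eq!(extract_format_urls(&html), FormatUrls::default());
        }
    }
}
```

Because the assertion compares against `FormatUrls::default()`, the test fails if any of the three fields starts surfacing a rejected URL, which is the regression this issue is meant to catch.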
