This repository was archived by the owner on May 21, 2026. It is now read-only.

fix: handle malformed og:url protocols missing colon #487

Open
rmdes wants to merge 1 commit into milanmdev:main from rmdes:fix/url-malformed-link

@rmdes rmdes commented Jan 11, 2026

Some websites return og:url values like 'https//example.com' (missing
colon after protocol). These get treated as relative URLs and
concatenated with the base URL, creating broken links.

This fix:

  • Adds fixMalformedUrl() to correct 'https//' -> 'https://'
  • Improves URL validation regex to require proper protocol
  • Validates the corrected URL before using it

Problem

Some websites have malformed og:url meta tags like https//example.com (missing the colon after the protocol). When the Open Graph scraper encounters this, it treats it as a relative URL and resolves it against the base URL, creating broken links like:

https://www.lecho.behttps//www.lecho.be/path

Example of affected post: https://bsky.app/profile/polbegov.skyfleet.blue/post/3maw2bc3deb25

Root Cause

The malformed og:url (e.g., https//www.lecho.be/path) doesn't match the protocol pattern, so URL resolution treats it as a relative path and prepends the base domain.
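The failure mode can be reproduced with a minimal sketch. The `naiveResolve` helper and its `HAS_PROTOCOL` regex are hypothetical stand-ins for the resolution logic, not the project's actual code; they only assume that any value without a recognised `http(s)://` prefix is treated as relative and appended to the base URL.

```typescript
// Hypothetical resolver used for illustration only (not the project's code).
const HAS_PROTOCOL = /^https?:\/\//i;

function naiveResolve(ogUrl: string, base: string): string {
  // Anything that does not start with "http://" or "https://" is assumed
  // to be a relative path and is appended to the base URL.
  return HAS_PROTOCOL.test(ogUrl) ? ogUrl : base + ogUrl;
}

// "https//..." has no colon, so it fails the protocol check.
const broken = naiveResolve("https//www.lecho.be/path", "https://www.lecho.be");
console.log(broken); // https://www.lecho.behttps//www.lecho.be/path
```

The output matches the broken link shown in the Problem section.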

Solution

  1. Added fixMalformedUrl() function that corrects common protocol typos:

    • https// -> https://
    • http// -> http://
  2. Improved the URL validation regex:

    • Now requires a proper protocol (was previously optional)
    • Validates the corrected URL, falling back to the original RSS URL if still invalid
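The two steps above might look roughly like the following sketch. Only the name `fixMalformedUrl()` comes from this PR; its body, the `VALID_URL` regex, and the `pickLink` helper are assumptions made for illustration and may differ from the actual change.

```typescript
// Sketch only: function bodies and the regex are assumptions, not the PR's code.
function fixMalformedUrl(url: string): string {
  // Insert the missing colon when the string starts with "http//" or "https//".
  // A well-formed "https://" is left untouched because ":" follows the scheme.
  return url.trim().replace(/^(https?)\/\//i, "$1://");
}

// Requires an explicit http(s) protocol followed by a non-empty host.
const VALID_URL = /^https?:\/\/[^\s/?#]+\S*$/i;

// Hypothetical helper: choose the og:url if usable, else the RSS item URL.
function pickLink(ogUrl: string | undefined, rssUrl: string): string {
  if (!ogUrl) return rssUrl;
  const fixed = fixMalformedUrl(ogUrl);
  // Fall back to the original RSS URL if the corrected value is still invalid.
  return VALID_URL.test(fixed) ? fixed : rssUrl;
}

const rss = "https://feed.example/item";
console.log(pickLink("https//www.lecho.be/path", rss)); // https://www.lecho.be/path
console.log(pickLink("/relative/path", rss));           // https://feed.example/item
```

Correcting the value before validating it means a recoverable typo still yields the site's own canonical link, while anything unrecoverable falls back to the RSS URL.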
