fix: generation pipeline bugs — broken related-pages links, collapsed callouts, unrendered markdown by myakove · Pull Request #56 · myk-org/docsfy

myakove · 2026-04-17T22:52:52Z

Fixes #55

Changes

Bug 1: Related Pages links all `href="#"` (High)

The HTML sanitizer in renderer.py blocked relative URLs like page-slug.html because they matched none of the allowed prefixes (http://, https://, #, /, mailto:).

Fix: Added _is_safe_url() helper that allows scheme-less relative URLs while blocking dangerous ones (javascript:, data:, protocol-relative //evil.com, and HTML entity-encoded colons that decode to javascript:alert(1)).
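
A minimal sketch of the kind of check described here, not the code from renderer.py. The name `is_safe_url`, the `_ALLOWED_PREFIXES` tuple, and the choice to reject empty strings are assumptions made for illustration; the order of checks follows the review comments later in this thread (decode entities, strip, reject `//`, allow `/`, compare prefixes case-insensitively, then treat scheme-less values as relative).

```python
import html
from urllib.parse import urlsplit

# Hypothetical allow-list; the real sanitizer may differ.
_ALLOWED_PREFIXES = ("http://", "https://", "mailto:", "#")


def is_safe_url(url: str) -> bool:
    """Return True for allowed absolute URLs and scheme-less relative URLs."""
    # Decode entities first so "javascript&#58;..." is checked as "javascript:...".
    normalized = html.unescape(url).strip()
    if not normalized:
        return False  # assumption: an empty href is treated as unsafe
    # Protocol-relative URLs ("//host/path") point at another host: reject.
    if normalized.startswith("//"):
        return False
    # Root-relative paths stay on the same site.
    if normalized.startswith("/"):
        return True
    # Case-insensitive comparison against the allowed prefixes.
    if normalized.lower().startswith(_ALLOWED_PREFIXES):
        return True
    # Anything else is accepted only if it carries no scheme at all.
    try:
        return not urlsplit(normalized).scheme
    except ValueError:
        return False
```

With this sketch, `page-slug.html` passes because `urlsplit` finds no scheme, while `javascript:`, `data:`, and entity-encoded variants are rejected because a scheme is present and it is not in the allow-list.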

Bug 2: Adjacent blockquote callouts collapse (Medium)

Consecutive > callouts (Note/Warning/Tip) merged into a single blockquote, losing severity styling.

Fix: Added separate_adjacent_callouts() in postprocess.py that detects adjacent callouts with different prefixes and inserts blank line separators. Handles both backtick and tilde code fences.
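
A rough sketch of the idea, under stated assumptions: callouts are written as `> **Note:** ...`, `> **Warning:** ...`, or `> **Tip:** ...`, and the names `CALLOUT_RE`, `FENCE_PREFIXES`, and `separate_callouts` are hypothetical. Unlike the description above, this simplified version inserts a separator before any callout line that directly follows another blockquote line, without comparing prefixes.

```python
import re

# Assumed callout syntax; the project's actual pattern is not shown in this PR page.
CALLOUT_RE = re.compile(r"^>\s*\*\*(Note|Warning|Tip):?\*\*", re.IGNORECASE)
FENCE_PREFIXES = ("```", "~~~")


def separate_callouts(md_text: str) -> str:
    """Insert a blank line before a callout that directly follows a blockquote line."""
    out: list[str] = []
    fence = None  # the marker of the currently open code fence, if any
    for line in md_text.split("\n"):
        stripped = line.strip()
        if fence is not None:
            # Inside a fenced block: only look for the matching closing fence.
            if stripped.startswith(fence):
                fence = None
        elif stripped.startswith(FENCE_PREFIXES):
            fence = stripped[:3]
        elif CALLOUT_RE.match(stripped) and out and out[-1].strip().startswith(">"):
            # A new callout starts right after blockquote content: separate them.
            out.append("")
        out.append(line)
    return "\n".join(out)
```

Continuation lines such as `> still the note` do not match the callout pattern, so a multi-line callout is left intact and only the boundary before the next callout gets a blank line.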

Bug 3: Markdown inside `<details>` not rendered (Medium)

The Python markdown library can't parse Markdown inside raw HTML blocks, so **bold** appeared literally.

Fix:

- Updated all AI writing rules to forbid <details>/<summary> tags (shared _NO_HTML_DETAILS constant)
- Added convert_details_to_headings() post-processor to convert any remaining <details> blocks to ## headings
- Fence-aware: skips code blocks using _CODE_BLOCK_RE.split()
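
The fence-aware conversion can be sketched roughly as follows. The regexes and the name `details_to_headings` are illustrative assumptions, not the patterns from postprocess.py; the sketch relies on `re.split` with a single capturing group, which leaves prose at even indexes and fenced code at odd indexes.

```python
import re

# Hypothetical patterns: one capturing group so split() keeps the code blocks.
CODE_BLOCK_RE = re.compile(r"(```.*?```|~~~.*?~~~)", re.DOTALL)
DETAILS_OPEN_RE = re.compile(
    r"<details[^>]*>\s*<summary\b[^>]*>(.*?)</summary>",
    re.IGNORECASE | re.DOTALL,
)
DETAILS_CLOSE_RE = re.compile(r"</details>", re.IGNORECASE)


def details_to_headings(md_text: str) -> str:
    """Turn <details><summary>Title</summary> into '## Title' outside code fences."""
    parts = CODE_BLOCK_RE.split(md_text)
    # Even indexes are prose; odd indexes are fenced code and stay untouched.
    for i in range(0, len(parts), 2):
        chunk = DETAILS_OPEN_RE.sub(
            lambda m: f"## {m.group(1).strip()}\n\n", parts[i]
        )
        parts[i] = DETAILS_CLOSE_RE.sub("", chunk)
    return "".join(parts)
```

Once the wrapper tags are gone, the body is ordinary Markdown again, so emphasis such as `**bold**` is handled by the normal Markdown pass instead of being left as literal text inside a raw HTML block.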

Wiring

Both post-processors applied in api/projects.py before render_site().
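
As an illustration of this wiring, the sketch below applies a sequence of post-processors to every page before the result would be handed to the renderer. The helper `apply_postprocessors` and the slug-to-Markdown dict shape are assumptions based on the walkthrough in this thread; the actual code in api/projects.py is not shown here.

```python
from collections.abc import Callable, Iterable

PostProcessor = Callable[[str], str]


def apply_postprocessors(
    pages: dict[str, str], steps: Iterable[PostProcessor]
) -> dict[str, str]:
    """Return a new slug -> Markdown dict with each step applied in order."""
    ordered_steps = tuple(steps)
    processed: dict[str, str] = {}
    for slug, markdown in pages.items():
        for step in ordered_steps:
            markdown = step(markdown)
        processed[slug] = markdown
    return processed


# Stand-in steps for demonstration; in the PR the two steps would be
# separate_adjacent_callouts and convert_details_to_headings, in that order.
demo_pages = {"intro": "  hello  "}
demo_result = apply_postprocessors(demo_pages, [str.strip, str.upper])
```

Building a new dict rather than mutating `pages` keeps the raw generated content available, for example for caching or debugging, while the renderer receives only the transformed copy.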

Files Changed

src/docsfy/renderer.py — URL sanitizer fix
src/docsfy/postprocess.py — Two new post-processing functions
src/docsfy/prompts.py — Updated AI writing rules
src/docsfy/api/projects.py — Pipeline wiring

Testing

All 376 tests pass
Reviewed by 3 internal reviewers + Cursor peer review

Summary by CodeRabbit

Bug Fixes
- Prevented adjacent callout blockquotes from collapsing by inserting blank lines between them; this separation is applied during final site rendering.
- Improved URL sanitization to reject unsafe or protocol‑relative URLs for safer links.
New Features
- Support for both backtick (```) and tilde (~~~) fenced code blocks when processing content.
- Automatic conversion of HTML details/summary into Markdown headings.
Documentation
- Writing prompts updated to forbid HTML details so content uses regular headings.

coderabbitai · 2026-04-17T22:53:00Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5777990a-ed31-443e-a89b-8566783fdae0

📥 Commits

Reviewing files that changed from the base of the PR and between c7ef2ae and 4d1c643.

📒 Files selected for processing (4)

src/docsfy/api/projects.py
src/docsfy/postprocess.py
src/docsfy/prompts.py
src/docsfy/renderer.py

Walkthrough

Adds two Markdown postprocessing steps (separate adjacent callouts; convert HTML <details> into headings) into the page-generation pipeline, expands code-fence recognition to include ~~~ fences, updates prompts to forbid HTML details, and tightens URL sanitization used during rendering.

Changes

Cohort / File(s)	Summary
Pipeline Integration `src/docsfy/api/projects.py`	Apply `separate_adjacent_callouts()` then `convert_details_to_headings()` to each page's Markdown immediately before `render_site(...)`, replacing the previous `pages` input to the renderer.
Post-processing Utilities `src/docsfy/postprocess.py`	Added `separate_adjacent_callouts(md_text: str)` to insert blank lines between adjacent callout blockquotes and `convert_details_to_headings(md_text: str)` to convert `<details><summary>` into `##` headings; extended `_CODE_BLOCK_RE` to recognize `~~~` fences and added regexes for callouts/details.
Prompt Rules `src/docsfy/prompts.py`	Added `_NO_HTML_DETAILS` prompt fragment and appended it to guide/recipe/reference/concept writing rules and incremental-update templates to forbid HTML `<details>/<summary>`.
Renderer URL Sanitization `src/docsfy/renderer.py`	Introduced `_is_safe_url()` helper and unified quoted/unquoted `href

Sequence Diagram(s)

sequenceDiagram
    participant Generator as Page Generator
    participant PostProc as Post-processors
    participant Renderer as Site Renderer

    rect rgba(100, 150, 200, 0.5)
    Note over Generator,Renderer: Generation → Postprocess → Render pipeline
    end

    Generator->>Generator: generate & validate pages (slug→content)
    Generator->>PostProc: pass pages dict
    PostProc->>PostProc: for each page: separate_adjacent_callouts(content)
    PostProc->>PostProc: then convert_details_to_headings(content)
    PostProc->>Renderer: return transformed pages dict
    Renderer->>Renderer: sanitize URLs via _is_safe_url() and render HTML

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

fix: three generation pipeline bugs — broken related-pages links, collapsed callouts, unrendered markdown in HTML blocks #55 — Implements post-processing to separate collapsed callouts and converts HTML <details> to headings, addressing the behaviors reported in the issue.

Suggested labels

size/XL

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically summarizes the main changes: fixing three critical generation pipeline bugs (broken related-pages links, collapsed callouts, unrendered markdown) across the codebase. |


myakove-bot · 2026-04-17T22:53:21Z

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
Issue Creation: Disabled for this repository
Branch Labeling: Branch-specific labels are applied to track the target branch
Auto-verification: Auto-verified users have their PRs automatically marked as verified
Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

/wip - Mark PR as work in progress (adds WIP: prefix to title)
/wip cancel - Remove work in progress status
/hold - Block PR merging (approvers only)
/hold cancel - Unblock PR merging
/verified - Mark PR as verified
/verified cancel - Remove verification status
/reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
/regenerate-welcome - Regenerate this welcome message

Review & Approval

/lgtm - Approve changes (looks good to me)
/approve - Approve PR (approvers only)
/automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
/assign-reviewers - Assign reviewers based on OWNERS file
/assign-reviewer @username - Assign specific reviewer
/check-can-merge - Check if PR meets merge requirements

Testing & Validation

/retest tox - Run Python test suite with tox
/retest build-container - Rebuild and test container image
/retest python-module-install - Test Python package installation
/retest all - Run all available tests

Container Operations

/build-and-push-container - Build and push container image (tagged with PR number)
- Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

/cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
- Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

/<label-name> - Add a label to the PR
/<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

Approval: /approve from at least one approver
Status Checks: All required status checks must pass
No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

myakove

Reviewers:

myakove

Available Labels

hold
verified
wip
lgtm
approve
automerge

AI Features

Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])

💡 Tips

WIP Status: Use /wip when your PR is not ready for review
Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
Cherry-picking: Cherry-pick labels are processed when the PR is merged
Container Builds: Container images are automatically tagged with the PR number
Permission Levels: Some commands require approver permissions
Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/docsfy/renderer.py`:
- Around line 63-86: Normalize and validate the URL by decoding HTML entities
first, then stripping whitespace, and run all checks against this normalized
value in _is_safe_url; specifically call _html_mod.unescape on the raw url, then
.strip() the result, then (1) reject protocol-relative URLs by checking
normalized.startswith("//"), (2) allow absolute paths with
normalized.startswith("/") but not "//", (3) check allowed schemes using
case-insensitive comparison (lowercase the prefix or parsed.scheme) for
("http://","https://","mailto:","#"), and (4) use urllib.parse.urlsplit on the
normalized value to determine if a scheme exists and treat no-scheme (relative)
as safe; ensure all earlier startswith checks use the normalized string so
whitespace/entity-encoded bypasses are prevented.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f89eef42-1d1d-4318-be1f-e70d557292e4

📥 Commits

Reviewing files that changed from the base of the PR and between 7c695c4 and 5810d3d.

📒 Files selected for processing (4)

src/docsfy/api/projects.py
src/docsfy/postprocess.py
src/docsfy/prompts.py
src/docsfy/renderer.py

myakove · 2026-04-18T00:03:59Z

@coderabbitai review

coderabbitai · 2026-04-18T00:04:05Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/docsfy/postprocess.py`:
- Around line 56-91: The adjacent-callout logic incorrectly treats any prior '>'
line as the start of a callout, causing inserts between a multi-line callout
body and its continuation; update the block in the loop that currently checks
lines[prev_idx].strip().startswith(">") so it only treats the previous non-empty
blockquote as an adjacent callout if that prior line actually matches
_CALLOUT_RE (i.e., use _CALLOUT_RE.match on the trimmed previous line instead of
a plain startswith(">")); keep the existing behaviour of removing blank lines
via result.pop() and appending separators when a true callout adjacency is
detected, and also replace the two startswith checks for backticks/tildes with
a single startswith(("```","~~~")) call.
- Around line 28-38: The current regexes are too strict: change _DETAILS_OPEN_RE
to allow any content (including HTML) inside <summary> by matching non-greedily
up to the literal </summary> (e.g. use a pattern like
r"<details[^>]*>\s*<summary\b[^>]*?>([\s\S]*?)</summary>" with re.IGNORECASE) so
summaries containing tags like <strong> match, and relax _DETAILS_CLOSE_RE to
match a closing </details> anywhere (e.g. r"</details>" with re.IGNORECASE)
instead of requiring it on its own line; keep the existing substitution order
(run the open-summary replacement first and then the close replacement, as the
current .sub("\n", ...) call site expects) so you don’t accidentally consume
unrelated whitespace.

In `@src/docsfy/renderer.py`:
- Around line 63-81: The _is_safe_url function is already handling HTML-decoding
before stripping and performing a case-insensitive scheme check so no functional
change is required; leave _is_safe_url as implemented (including the checks for
"http://", "https://", "#", "mailto:", leading "/" vs "//", and the final
urllib.parse.urlsplit-based scheme detection). If you prefer a tiny style tweak
(optional), you can collapse the last conditional by returning not parsed.scheme
instead of the explicit if/return, but this is purely stylistic and not
required.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 10ffc358-d0a3-4952-96bb-7ccce1dfa217

📥 Commits

Reviewing files that changed from the base of the PR and between 5810d3d and dce6a4e.

📒 Files selected for processing (4)

src/docsfy/api/projects.py
src/docsfy/postprocess.py
src/docsfy/prompts.py
src/docsfy/renderer.py

myakove · 2026-04-18T01:17:39Z

@coderabbitai review

coderabbitai · 2026-04-18T01:17:45Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/docsfy/postprocess.py`:
- Around line 32-35: The regex _DETAILS_CLOSE_RE currently consumes surrounding
whitespace around the closing </details> tag and replacement code replaces the
whole match with a newline, collapsing blank lines; change the regex to only
match the tag (e.g. re.compile(r"</details>", re.IGNORECASE)) so surrounding
newlines/spaces are preserved, and ensure the code that removes the tag (the
re.sub call that currently uses _DETAILS_CLOSE_RE) replaces the match with an
empty string (not a newline); apply the same change to the other similar
match/replacement pair referenced in the comment.
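
The effect described in this comment can be reproduced with a small, self-contained comparison. The whitespace-consuming pattern below is a hypothetical stand-in for the earlier regex, which is not shown on this page; the tag-only pattern is the one suggested in the comment.

```python
import re

# Hypothetical stand-in for a close pattern that also swallows surrounding whitespace.
greedy_close = re.compile(r"\s*</details>\s*", re.IGNORECASE)
# Tag-only pattern, as suggested in the review comment.
tag_only_close = re.compile(r"</details>", re.IGNORECASE)

text = "last line\n\n</details>\n\nNext paragraph"

# Replacing the whole match with one newline removes the blank lines,
# so the two paragraphs end up on adjacent lines.
collapsed = greedy_close.sub("\n", text)

# Removing only the tag leaves the surrounding blank lines in place.
preserved = tag_only_close.sub("", text)
```

In the first case the paragraph break is lost, which in Markdown merges the two blocks into one paragraph; in the second the extra blank lines are redundant but harmless.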


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3fba3344-0347-4387-b06c-886d7691a32a

📥 Commits

Reviewing files that changed from the base of the PR and between dce6a4e and c7ef2ae.

📒 Files selected for processing (4)

src/docsfy/api/projects.py
src/docsfy/postprocess.py
src/docsfy/prompts.py
src/docsfy/renderer.py

… callouts, unrendered markdown in HTML blocks

Fixes #55

- Allow relative URLs in HTML sanitizer while blocking dangerous schemes (javascript:, data:, protocol-relative //evil.com, entity-encoded colons)
- Add separate_adjacent_callouts() to split merged Note/Warning/Tip callouts
- Add convert_details_to_headings() to convert <details> blocks to ## headings
- Update AI prompts to forbid <details>/<summary> tags in all page types
- Apply post-processing before render_site in generation pipeline

myakove · 2026-04-18T02:18:17Z

@coderabbitai review

coderabbitai · 2026-04-18T02:18:23Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

myakove-bot · 2026-04-18T09:59:51Z

New container for ghcr.io/myk-org/docsfy:latest published

myakove-bot assigned myakove Apr 17, 2026

myakove-bot added branch-main size/L labels Apr 17, 2026

myakove force-pushed the fix/issue-55-generation-pipeline-bugs branch from fa47cde to 5810d3d Compare April 17, 2026 23:03

coderabbitai bot requested changes Apr 17, 2026

Comment thread src/docsfy/renderer.py Outdated

myakove-bot added the changes-requested-coderabbitai[bot] label Apr 17, 2026

myakove force-pushed the fix/issue-55-generation-pipeline-bugs branch from 5810d3d to dce6a4e Compare April 17, 2026 23:57

myakove commented Apr 17, 2026

Comment thread src/docsfy/renderer.py Outdated

coderabbitai bot approved these changes Apr 17, 2026

myakove-bot added commented-myakove lgtm-coderabbitai[bot] commented-coderabbitai[bot] and removed changes-requested-coderabbitai[bot] labels Apr 17, 2026

coderabbitai bot requested changes Apr 18, 2026

Comment thread src/docsfy/postprocess.py

Comment thread src/docsfy/postprocess.py

Comment thread src/docsfy/renderer.py Outdated

myakove-bot added the changes-requested-coderabbitai[bot] label Apr 18, 2026

myakove force-pushed the fix/issue-55-generation-pipeline-bugs branch from dce6a4e to c7ef2ae Compare April 18, 2026 00:13

coderabbitai bot approved these changes Apr 18, 2026

myakove-bot added commented-coderabbitai[bot] commented-myakove lgtm-coderabbitai[bot] and removed lgtm-coderabbitai[bot] commented-coderabbitai[bot] commented-myakove changes-requested-coderabbitai[bot] labels Apr 18, 2026

coderabbitai bot requested changes Apr 18, 2026

Comment thread src/docsfy/postprocess.py

myakove-bot added the changes-requested-coderabbitai[bot] label Apr 18, 2026

myakove force-pushed the fix/issue-55-generation-pipeline-bugs branch from c7ef2ae to 4d1c643 Compare April 18, 2026 01:28

coderabbitai bot approved these changes Apr 18, 2026

myakove-bot added commented-myakove lgtm-coderabbitai[bot] commented-coderabbitai[bot] and removed changes-requested-coderabbitai[bot] commented-coderabbitai[bot] commented-myakove lgtm-coderabbitai[bot] labels Apr 18, 2026

myakove merged commit 824837e into main Apr 18, 2026
5 of 7 checks passed

myakove deleted the fix/issue-55-generation-pipeline-bugs branch April 18, 2026 09:56
