fix for #727; split up the list handling between base and LaTeX pu…#775

Open
opt12 wants to merge 1 commit into doorstop-dev:develop from opt12:issue_727_fix

opt12 commented Jun 9, 2026

Fixes #727

Issue #727 Fix - Description of Changes

Summary

Fixed nested list handling in both LaTeX and HTML publishers. Lists with 2-space indentation and variable indentation depths now render correctly. Missing blank lines around lists are now quietly added so that it "just works".

Root Cause

  1. LaTeX Publisher: Strict indentation validation prevented flexible (but valid) Markdown list indentation
  2. HTML Publisher: Python-Markdown requires 4-space indentation for nested lists, but Doorstop documents commonly use 2-space indentation
  3. Both: Lists without surrounding blank lines were incorrectly merged into single blocks

Changes Made

1. doorstop/core/publishers/base.py

Added two new methods to BasePublisher:

  • _normalize_list_indentation(text):

    • Analyzes list structure using a stack-based algorithm to determine hierarchy
    • Groups consecutive list items into blocks (separated by non-list content)
    • Normalizes indentation to 4-space standard required by Python-Markdown
    • Handles inconsistent indentation (1, 2, 4, 6 spaces) gracefully
  • _fix_list_spacing(text):

    • Adds blank lines before and after list blocks
    • Required by Markdown specification for proper list recognition
    • Prevents double blank lines
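
The spacing step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `fix_list_spacing`, the `LIST_ITEM` pattern, and the decision to treat every non-item line as the end of a list (so indented continuation lines are not handled) are all assumptions.

```python
import re

# Hypothetical pattern: a bullet (-, *, +) or numbered (1.) item, any leading whitespace.
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+")


def fix_list_spacing(text):
    """Insert a blank line before and after each run of list items."""
    out = []
    in_list = False
    for line in text.splitlines():
        is_item = bool(LIST_ITEM.match(line))
        is_blank = not line.strip()
        if is_item and not in_list:
            # Entering a list: add a blank line unless one is already there.
            if out and out[-1].strip():
                out.append("")
            in_list = True
        elif in_list and not is_item:
            # Leaving a list: add a blank line unless this line is blank itself.
            if not is_blank:
                out.append("")
            in_list = False
        out.append(line)
    return "\n".join(out)
```

Because a blank line is only added when the neighbouring line is not already blank, running the function twice gives the same result as running it once, which is one way to read the "prevents double blank lines" requirement.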

process_lists() and _check_for_list_end() were moved to doorstop/core/publishers/latex.py because they are not needed for Markdown and HTML publishing.

Rationale: Central implementation in base class allows code reuse across all publishers.

2. doorstop/core/publishers/html.py

Modified lines() method:

  • Calls _normalize_list_indentation() before markdown processing
  • Calls _fix_list_spacing() to ensure proper spacing
  • Removes manual list processing (markdown.markdown() handles it after normalization)

Rationale: HTML publisher relies on Python-Markdown which needs 4-space indentation. Normalization ensures compatibility.

3. doorstop/core/publishers/latex.py

Modified process_lists() and _check_for_list_end():

  • Stack-based depth tracking: Added self.list["stack"] to track actual nesting levels
  • Removed strict indentation validation: Allows flexible indentation (variable spaces per level)
  • Fixed end-tag generation: Properly closes all nesting levels at list end

Key changes:

  • process_lists(): Uses stack to track indentation hierarchy instead of arithmetic
  • _check_for_list_end(): Uses stack length to determine number of closing tags needed

Rationale: LaTeX publisher processes raw Markdown and must handle variable indentation common in user documents.
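
As a rough illustration of how a stack can drive both the opening and the closing tags, the sketch below converts bullet lines into nested `itemize` environments. It is not the code in latex.py: the function `bullets_to_latex`, the `ITEM` pattern, and the restriction to bullet items are assumptions made for brevity.

```python
import re

# Hypothetical pattern: leading spaces, a bullet marker, then the item text.
ITEM = re.compile(r"^( *)[-*+]\s+(.*)$")


def bullets_to_latex(lines):
    """Convert Markdown bullet lines into nested itemize environments."""
    out = []
    stack = []  # indents of the currently open itemize environments
    for line in lines:
        match = ITEM.match(line)
        if match is None:
            # List end: emit one closing tag per open nesting level.
            out.extend(r"\end{itemize}" for _ in stack)
            stack = []
            out.append(line)
            continue
        indent = len(match.group(1))
        # Close every level that is indented deeper than this item.
        while stack and stack[-1] > indent:
            stack.pop()
            out.append(r"\end{itemize}")
        # Open a new level if this item is deeper than the current one.
        if not stack or stack[-1] < indent:
            stack.append(indent)
            out.append(r"\begin{itemize}")
        out.append(r"\item " + match.group(2))
    # Close whatever is still open at the end of the input.
    out.extend(r"\end{itemize}" for _ in stack)
    return out
```

The number of closing tags always equals the stack length, so a list that ends three levels deep is closed three times regardless of how many spaces each level used.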

4. Tests

Added:

  • test_base_list_normalization.py: Tests for normalization and spacing functions (42 tests)
  • test_html_list_handling.py: HTML-specific list rendering tests (9 tests)

Modified:

  • test_publisher_latex_environments.py: Changed test_missing_changing_list_indentation from expecting an error to expecting success (flexible indentation is now allowed)
  • Added test_flexible_indentation_complex for complex nesting scenarios

Technical Details

Stack-Based Hierarchy Detection

# Before: Arithmetic comparison (brittle)
if depth + indent_step != new_indent: raise Error

# After: Stack-based (flexible)
while stack and stack[-1] >= indent:
    stack.pop()
level = len(stack)
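
A runnable version of the stack-based fragment might look like the sketch below. The helper name `list_levels` is hypothetical, and pushing the current indent after the level is computed is an assumption about how the fragment continues.

```python
def list_levels(indents):
    """Return the nesting level of each list item from its leading-space count."""
    stack = []   # indentation widths of the currently open levels
    levels = []
    for indent in indents:
        # Drop every open level indented at least as far as this item.
        while stack and stack[-1] >= indent:
            stack.pop()
        levels.append(len(stack))
        # Assumed step: this item's indent becomes the innermost open level.
        stack.append(indent)
    return levels


print(list_levels([0, 2, 4, 2, 0]))  # [0, 1, 2, 1, 0]
print(list_levels([0, 1, 6, 4]))     # [0, 1, 2, 2]
```

Only the ordering of indents matters, not their size, which is why 1, 2, 4, or 6 spaces per level all resolve to a consistent hierarchy.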

Normalization Algorithm

  1. Parse all list items with their original indentation
  2. Group into blocks (separated by non-list lines)
  3. For each block, build hierarchy using stack
  4. Map hierarchy levels to 4-space increments

Block Separation

Lists separated by non-list content (headings, paragraphs) are normalized independently, preventing incorrect merging.

Compatibility

  • ✅ Existing documents with 4-space indentation: Unchanged
  • ✅ Documents with 2-space indentation: Now work correctly
  • ✅ Mixed indentation (1, 2, 4, 6 spaces): Normalized consistently
  • ✅ All existing tests pass (except one test that expected an error for flexible indentation)

Benefits

  1. User-friendly: Accepts common Markdown indentation patterns (2 or 4 spaces)
  2. Robust: Handles edge cases like missing blank lines, variable indentation
  3. Maintainable: Central implementation in base class
  4. Well-tested: 51 new tests covering edge cases

opt12 changed the title from "- fixe for #727; split up the list handling between base and LaTeX pu…" to "fix for #727; split up the list handling between base and LaTeX pu…" on Jun 10, 2026
Development

Successfully merging this pull request may close these issues.

HTML publishing of lists gets weird under some circumstances
