BACK-475 - Add-Word-(docx)-upload-to-enable-image-extraction-for-pasted-Word-content#648

Open
kuwork wants to merge 3 commits into MrLesk:main from kuwork:docx-upload

kuwork commented May 11, 2026

What

Allow users to upload Word documents (.docx) directly into the Web UI editor. The backend extracts text and images, converting them to Markdown with proper image references.

Why

The existing paste-as-markdown feature (BACK-208) cannot extract images from pasted Word content because browser clipboard APIs don't expose embedded images as extractable blobs. With direct .docx upload, the backend can use mammoth to read the docx archive and extract embedded images to the temp assets directory.

Changes

  • Backend (src/core/docx-converter.ts): New module using mammoth to convert .docx → HTML. Embedded images are extracted via mammoth's convertImage callback and saved to backlog/assets/.temp/ via AssetManager.
  • Backend (src/server/index.ts): New POST /api/docx/convert endpoint. Accepts multipart/form-data, validates .docx extension, returns { html, images, messages }.
  • Frontend (src/web/components/PasteAwareMDEditor.tsx): Added Word upload button to editor toolbar (extraCommands), drag-and-drop support, and a hidden file picker. Uploads file to backend, then runs cleanHtml + Turndown in the browser to produce Markdown.
  • Frontend (src/web/utils/paste-as-markdown.ts): Extracted cleanHtml as an exported async function with a new keepMedia option. This allows the docx upload path to preserve server-side extracted images while the paste path continues to filter invalid local images.
  • Frontend (src/web/lib/api.ts): Added convertDocx() API client method.
  • Tests (src/test/server-docx-convert.test.ts): Integration tests for the conversion endpoint (validation, conversion, image extraction to temp directory).
  • Dependencies: Added mammoth for docx parsing.
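
As a rough illustration of the backend pieces listed above, here is a minimal sketch of the extension check and an image handler that stores each embedded image under a UUID filename. It is not the code in this PR: `EmbeddedImage`, `SavedImage`, `SaveTemp`, `isDocxFilename`, and `createImageHandler` are hypothetical names, the `EmbeddedImage` shape is only loosely modeled on the image object mammoth passes to its image converter, and `saveTemp` stands in for whatever AssetManager actually exposes.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shapes for illustration; the PR's real types may differ.
interface EmbeddedImage {
  contentType: string;
  read(): Promise<Uint8Array>;
}

interface SavedImage {
  filename: string;
  url: string;
}

// Stand-in for the AssetManager call that writes into the temp assets directory.
type SaveTemp = (filename: string, bytes: Uint8Array) => Promise<string>;

// Map common image MIME types to file extensions; anything else falls back to "bin".
const EXTENSIONS: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/gif": "gif",
};

// Case-insensitive check that the uploaded filename ends in .docx.
function isDocxFilename(name: string): boolean {
  return name.toLowerCase().endsWith(".docx");
}

// Returns a callback that stores one embedded image under a UUID filename,
// records it in `saved`, and returns the attributes for the resulting <img>.
function createImageHandler(saveTemp: SaveTemp, saved: SavedImage[]) {
  return async (image: EmbeddedImage): Promise<{ src: string }> => {
    const ext = EXTENSIONS[image.contentType] ?? "bin";
    const filename = `${randomUUID()}.${ext}`;
    const url = await saveTemp(filename, await image.read());
    saved.push({ filename, url });
    return { src: url };
  };
}
```

Injecting `saveTemp` keeps the handler testable without touching the file system; the `saved` array is one way to build the `images` field of the response.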

How it works

  1. User clicks the Word icon in the editor toolbar or drags a .docx file onto the editor.
  2. Frontend uploads the file to POST /api/docx/convert.
  3. Backend uses mammoth to convert docx → HTML. During conversion, embedded images are extracted and saved to .temp/ with UUID filenames.
  4. Backend returns { html, images, messages }.
  5. Frontend calls cleanHtml(html, { keepMedia: true }) to normalize Word HTML (flatten table cells, convert mso-lists, strip classes, etc.) while preserving <img> tags.
  6. Frontend runs Turndown + post-processing to get clean Markdown.
  7. Markdown is inserted at the current cursor position in the editor.
  8. When the user saves, the existing POST /api/assets/promote flow promotes temp images to the permanent paste directory.
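
Steps 2, 4, and 7 above can be sketched on the client side roughly as follows. This is an illustration under assumptions, not the PR's implementation: the multipart field name `file`, the `FetchLike` type, the element types of `images` and `messages`, and the helper `insertAtCursor` are all hypothetical, and replacing an active selection with the inserted Markdown is an assumed behavior.

```typescript
// Response shape taken from the PR description; element types are not specified there.
interface DocxConvertResponse {
  html: string;
  images: unknown[];
  messages: unknown[];
}

// Minimal subset of fetch, injected so the function can be exercised without a server.
type FetchLike = (
  url: string,
  init: { method: string; body: FormData },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Uploads a .docx file as multipart/form-data and returns the parsed response.
async function convertDocx(
  file: Blob,
  filename: string,
  fetchImpl: FetchLike,
): Promise<DocxConvertResponse> {
  const body = new FormData();
  body.append("file", file, filename); // field name "file" is an assumption
  const res = await fetchImpl("/api/docx/convert", { method: "POST", body });
  if (!res.ok) {
    throw new Error(`docx conversion failed with status ${res.status}`);
  }
  return (await res.json()) as DocxConvertResponse;
}

// Inserts `insertion` at the cursor (replacing any selection) and returns
// the new text together with the cursor position after the inserted text.
function insertAtCursor(
  text: string,
  selectionStart: number,
  selectionEnd: number,
  insertion: string,
): { text: string; cursor: number } {
  const start = Math.max(0, Math.min(selectionStart, text.length));
  const end = Math.max(start, Math.min(selectionEnd, text.length));
  return {
    text: text.slice(0, start) + insertion + text.slice(end),
    cursor: start + insertion.length,
  };
}
```

In a browser, the global `fetch` could be passed as `fetchImpl`; the HTML in the response would still go through `cleanHtml(html, { keepMedia: true })` and Turndown before `insertAtCursor` is applied.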

Testing

  • bun test src/test/server-docx-convert.test.ts — backend endpoint tests (4 pass)
  • bun test src/test/build.test.ts — CLI compile still works (no jsdom in bundle)
  • bunx tsc --noEmit — type check passes

closes BACK-475
