
feat: qetranslate review CLI for code comments, figure labels, and LaTeX #55

Description

@mmcky

Problem

AI translation often misses or mishandles three specific element types in lectures:

  1. Code comments — inline comments in code cells that should be translated but are frequently left in English or translated incorrectly
  2. Figure labels — axis labels, titles, legends, and annotations in matplotlib/plotly figures that need localisation
  3. LaTeX — display math and inline math containing text strings (e.g. \text{marginal cost}) that should be translated while preserving the math structure

These are easy to miss during bulk translation because they're embedded inside code blocks or math environments that translators (human or AI) tend to skip over.

Proposal

Build a qetranslate review CLI command (or similar) that:

  1. Extracts all code comments, figure labels, and LaTeX text elements from a translated lecture
  2. Presents them one-by-one for review (similar to qebench translate's interactive loop)
  3. Shows source + current translation side by side for each element
  4. Allows quick accept/edit/flag actions
  5. Writes corrections back to the translated file
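
As a rough illustration of steps 1 and 5, the sketch below shows one possible record for an extracted element and a line-based write-back. The `ReviewElement` schema and `apply_corrections` helper are hypothetical names introduced here for discussion, not part of the existing qetranslate code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReviewElement:
    """One extracted element awaiting review (hypothetical schema)."""
    kind: str                          # "code_comment", "figure_label" or "latex_text"
    line: int                          # 1-based line number in the translated file
    source: str                        # text from the source-language lecture
    translation: str                   # current text in the translated lecture
    correction: Optional[str] = None   # reviewer's edit, if any
    flagged: bool = False


def apply_corrections(text: str, elements: List[ReviewElement]) -> str:
    """Return `text` with each reviewer correction applied on its recorded line."""
    lines = text.split("\n")
    for el in elements:
        if el.correction is None:
            continue  # accepted, skipped or flagged: nothing to write
        idx = el.line - 1
        # Replace only the first occurrence on that line, so surrounding
        # code or math on the same line is left untouched.
        if 0 <= idx < len(lines) and el.translation in lines[idx]:
            lines[idx] = lines[idx].replace(el.translation, el.correction, 1)
    return "\n".join(lines)
```

Keying the write-back on line number plus the current translation string keeps the edit local. A real implementation would probably also need to cope with elements that span several lines.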

Element Detection

| Element | Detection Strategy |
|---|---|
| Code comments | Parse # comments in Python code cells |
| Figure labels | Detect plt.xlabel(), plt.title(), ax.set_ylabel(), legend strings, annotation strings, etc. |
| LaTeX text | Extract \text{...}, \mathrm{...}, \textbf{...} and similar commands from math blocks |
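
The three detection strategies could be prototyped with the standard library alone. The following is a minimal sketch under some assumptions: the function names and the `LABEL_CALLS` set are made up for illustration, the code cell is assumed to be valid Python, and the LaTeX pattern does not handle nested braces.

```python
import ast
import io
import re
import tokenize

# Hypothetical list of method names whose string arguments are figure text.
LABEL_CALLS = {
    "xlabel", "ylabel", "title", "suptitle", "annotate", "text",
    "set_xlabel", "set_ylabel", "set_title",
}

# Matches \text{...}, \mathrm{...} and \textbf{...} without nested braces.
LATEX_TEXT = re.compile(r"\\(?:text|mathrm|textbf)\{([^{}]*)\}")


def extract_comments(code):
    """Return (line, comment) pairs; the tokenizer ignores '#' inside strings."""
    tokens = tokenize.generate_tokens(io.StringIO(code).readline)
    return [(t.start[0], t.string) for t in tokens if t.type == tokenize.COMMENT]


def extract_figure_labels(code):
    """Return (line, string) pairs for label-like calls and label= keywords."""
    found = []
    for node in ast.walk(ast.parse(code)):
        if not isinstance(node, ast.Call):
            continue
        candidates = [kw.value for kw in node.keywords if kw.arg == "label"]
        if isinstance(node.func, ast.Attribute) and node.func.attr in LABEL_CALLS:
            candidates += node.args
        for arg in candidates:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                found.append((arg.lineno, arg.value))
    return sorted(found)  # ast.walk is not in source order


def extract_latex_text(math):
    """Return the contents of text-like commands inside a math block."""
    return LATEX_TEXT.findall(math)
```

Note that `\mathrm{...}` is also used for non-text symbols such as an upright d in differentials, so a reviewer would still need to skip some matches. Legend entries passed as a list (for example `ax.legend(["a", "b"])`) are not covered by this sketch.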

Workflow

qetranslate review lectures/intro.md --lang zh-cn

This would cycle through each extracted element, showing:

[Code Comment] lectures/intro.md:42
  EN: # Calculate the steady state
  ZH: # 计算稳态
  [a]ccept / [e]dit / [s]kip / [f]lag?
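
One way the prompt above might be rendered and answered is sketched below. The `format_prompt` and `review_one` helpers are hypothetical, and the `ask` parameter is only there so the loop can be driven without a terminal.

```python
ACTIONS = {"a": "accept", "e": "edit", "s": "skip", "f": "flag"}


def format_prompt(kind, path, line, source, translation, lang_tag="ZH"):
    """Build the side-by-side prompt for a single element."""
    return (
        f"[{kind}] {path}:{line}\n"
        f"  EN: {source}\n"
        f"  {lang_tag}: {translation}\n"
        "  [a]ccept / [e]dit / [s]kip / [f]lag? "
    )


def review_one(prompt, translation, ask=input):
    """Ask until a valid action key is given; return (action, final_text)."""
    while True:
        key = ask(prompt).strip().lower()[:1]
        if key not in ACTIONS:
            continue  # unrecognised input: show the prompt again
        if key == "e":
            # An edit asks a second question for the replacement text.
            return "edit", ask("  new translation: ").strip()
        return ACTIONS[key], translation
```

Returning the action together with the final text lets the caller decide later whether to write the change back, collect it for a PR, or record a flag.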

Integration with action-translation

  • Flagged items could feed into the translation glossary or prompt instructions
  • Corrections could be committed directly or collected into a PR
  • Could run as a post-translation QA step in the sync workflow

Open Questions

  • Should this live as a subcommand of the existing qetranslate CLI or as a standalone tool?
  • Should it also handle MyST directive options (e.g. :label:, :name:) that contain translatable text?
  • Priority: should we start with code comments (the most common miss), then expand to figures and LaTeX?
