Skip to content

Add orphaned course-module cleanup (re-queue stalled deletions)#29

Open
jamieburgess wants to merge 1 commit into
catalyst:MOODLE_35_STABLEfrom
jamieburgess:feature/orphan-cm-cleanup
Open

Add orphaned course-module cleanup (re-queue stalled deletions)#29
jamieburgess wants to merge 1 commit into
catalyst:MOODLE_35_STABLEfrom
jamieburgess:feature/orphan-cm-cleanup

Conversation

@jamieburgess

Copy link
Copy Markdown

PR: Add orphan course module detection and re-queue

Title

Add detection and re-queue of orphaned course modules (deletioninprogress=1 with no queued task)

Summary

The Fix Delete Modules plugin currently fixes the issue where task rows in task_adhoc with classname = '\core_course\task\course_delete_modules' and faildelay >= minimumfaildelay and are stuck.

There is a another class of stuck deletion the plugin can't currently see:

A course module is flagged deletioninprogress = 1 in course_modules, but no matching adhoc task is in the queue.

This happens when an adhoc worker is killed mid-execute() (worker timeout, OOM, container restart, AWS-style idle DB-connection drop) before the framework can mark the task as failed. The cm flag persists, the task row is gone, and the activity is permanently stuck showing "Deletion in progress" on the course page with no queued task to ever finish the job.

This PR adds:

  1. A new "Orphaned course modules" section on the existing report page, listing every cm in this state with course/module/activity-name context.
  2. Per-row and bulk Re-queue stalled deletion buttons that call Moodle's standard course_delete_module($cmid, true) helper — same code path as the activity Delete button, all hooks/observers/events fire as normal.
  3. A --requeue-orphans CLI flag for headless / cron-driven recovery.
  4. Phpunit tests for the new data source and surgery method.
  5. Hardening of fix_module.php and separate_module.php to require moodle/site:config (closes a pre-existing gap where any logged-in user with their own sesskey could trigger the existing fix endpoint — not introduced by this PR but worth fixing alongside the new work).

No existing functionality is changed. Catalyst's existing diagnose/separate/force-delete flow is untouched and renders unchanged.

Why

We hit this exact state on AWS production, preprod, and test environments. The plugin's existing report showed "No course_delete_module tasks in queue", This PR closes the gap so future occurrences are one click in the existing UI.

Files changed

New:

  • classes/orphan_module_list.php — data source. Cross-DB-portable: scans course_modules for deletioninprogress=1, walks queued tasks via \core\task\manager::get_adhoc_tasks(...), filters out cmids referenced by any task. No raw JSON_EXTRACT, no string-built SQL.
  • templates/orphan_modules_table.mustache — UI for the new section.
  • tests/orphan_module_list_test.php — 4 tests, 10 assertions.
  • tests/surgeon_orphan_requeue_test.php — 3 tests, 8 assertions.

Extended:

  • classes/reporter.phpget_orphan_modules_report(), requeue_orphan_modules(), build_summary_message().
  • classes/surgeon.phprequeue_orphan_module(int $cmid) static method.
  • cli/fix_course_delete_modules.php--requeue-orphans and --requeue-orphans --fix.
  • fix_module.php — handles requeue_orphan_module and requeue_orphan_modules_all actions; restricted to admins via admin_externalpage_setup() (security hardening).
  • separate_module.php — same admin_externalpage_setup() hardening (drive-by fix).
  • form.php — two new moodleform classes for the buttons.
  • index.php — wires the orphan section into the page render.
  • lang/en/tool_fix_delete_modules.php — new strings, alphabetised.
  • styles.css — section styled to match catalyst's existing .fix-diagnosis / .fix-results look. Outlined #A32D2D buttons, refresh icon, alternate row shading.
  • templates/main_elements.mustache{{{orphans}}} placeholder.
  • version.php — bumped to 2026050800 / 0.2.0. $plugin->supported left at [35, 405] (matches catalyst's current declaration; M5.x compat audited as clean — see notes).

Test plan

  • phpunit — 31 tests / 444 assertions, all green (24 existing + 7 new).
  • phplint — clean.
  • phpcs — clean for new code; 2 deliberate inherited-pattern errors (multi-classes-in-file, matching catalyst's existing convention) and 1 inherited lang-ordering warning, all rationalised.
  • phpcpd — clean.
  • phpmd — clean for new code.
  • phpdoc — clean for new code.
  • inline-css — clean.
  • validate — clean.
  • Manual UI smoke test on local Moodle 4.5.8: simulate orphan cm → table renders → click Re-queue → cron drains → verify cm gone.
  • CLI smoke test: --requeue-orphans lists orphans, --fix re-queues via API, cron drains.
  • Idempotency: re-clicking Re-queue on already-deleted cm returns the friendly "had already been removed" message instead of an error.
  • Security: fix_module.php / separate_module.php now require admin via admin_externalpage_setup(). Non-admin POST returns Moodle's standard "you do not have permission" page.

Screenshots

(see screenshots/ folder in this PR's gdrive folder; commit will inline the relevant ones)

Migration notes

  • Existing installs: drop in, run php admin/cli/upgrade.php. Version bump triggers Moodle to register the new strings and DB schema (none changed).
  • Behaviour for existing-user data: if any course_modules.deletioninprogress=1 rows existed before install, they'll now appear in the new orphan section on first page load. Admin can choose to re-queue them or ignore.
  • Backwards compatibility: existing fix_module and separate_module actions are functionally unchanged; only the auth gate has been tightened from require_login()+require_sesskey() to admin_externalpage_setup().

Notes for reviewer

  1. Multi-classes-per-file in form.php — I added two new form classes (requeue_orphan_module_form, requeue_orphan_modules_all_form) following the existing pattern in this file (which already contains two classes). Happy to split into separate files if Catalyst prefers strict PSR-1.
  2. Lang string ordering — new strings are alphabetised among themselves but appended at the end of the file, so they trip a single ordering warning relative to the previous existing string. Distributing them through the existing alphabetical sequence would mean editing layout around catalyst's current strings, which I avoided. Easy to interleave on request.
  3. Auth hardening is a drive-by fix. Pre-existing gap — happy to extract into a separate PR if Catalyst would rather keep this PR scope-pure. Low risk: index.php was always admin-only, so the gap was theoretically exploitable but practically obscure. Closing it costs ~3 lines of code.
  4. Moodle 5.x compat: all new APIs verified against M5.0 STABLE — no hard incompatibilities. $plugin->supported deliberately left at [35, 405] to match catalyst's current support declaration; no changes needed in our diff if Brad later bumps support.

@jamieburgess jamieburgess force-pushed the feature/orphan-cm-cleanup branch from 24165cb to e0175f4 Compare May 11, 2026 02:27
Handles a failure mode the existing flow doesn't cover: course_modules
rows flagged deletioninprogress=1 that have no matching
course_delete_modules adhoc task in the queue ("orphaned" course
modules). This is typically left behind when a worker is killed
mid-execute (worker time-out, OOM, container restart) before the task
framework can mark the task failed -- the cm stays flagged for deletion
on its course page forever, with nothing queued to finish the job.

Detection and re-queue
----------------------
- New \tool_fix_delete_modules\orphan_module_list: enumerates cms with
  deletioninprogress=1 that aren't referenced by any queued
  course_delete_modules task. Avoids JSON_EXTRACT for MySQL/PostgreSQL
  portability.
- surgeon::requeue_orphan_module() / reporter::requeue_orphan_modules():
  re-queue a fresh deletion task via Moodle's own
  course_delete_module($cmid, true) -- the same code path the activity
  "Delete" button uses, so all the normal hooks, observers and events
  fire. The next cron run completes the deletion as usual.
- GUI: index.php gains an "Orphaned course modules" section listing the
  stuck cms, with a per-row "Re-queue stalled deletion" button and a
  bulk "Re-queue all stalled deletions" button
  (templates/orphan_modules_table.mustache, plus the form classes and
  fix_module.php action handlers).
- CLI: fix_course_delete_modules.php gains --requeue-orphans (list the
  orphans) and --requeue-orphans --fix (re-queue them).

Passive monitoring via the Check API
------------------------------------
- New \tool_fix_delete_modules\check\orphan_modules: extends
  \core\check\check. Reuses orphan_module_list for detection, so the
  check is always consistent with what the plugin's report page shows.
  Severity: OK when zero, WARNING when one or more.
- Discovered automatically by \core\check\manager::get_status_checks
  (M3.9+) via the tool_fix_delete_modules_status_checks() callback in
  lib.php, so the check appears on /report/status/index.php and is
  picked up by tool_heartbeat's croncheck.php with no per-site opt-in.
- Action link deep-links to /admin/tool/fix_delete_modules/index.php
  #orphan-modules (id="orphan-modules" anchor added to the section
  wrapper in templates/orphan_modules_table.mustache).

Unit tests
----------
- orphan_module_list_test, surgeon_orphan_requeue_test,
  check_orphan_modules_test.

On Moodle <3.9 the Check API doesn't exist and the callback is simply
never invoked. Version bumped to 0.2.0.
@jamieburgess jamieburgess force-pushed the feature/orphan-cm-cleanup branch from 478332e to 9eb3442 Compare May 11, 2026 02:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant