UX feedback: TaskCreate reminder cadence + permission-block opacity #59479

@eilidhmae

Filing as a user (with Claude's help drafting) after a long multi-repo
Opus 4.7 session that went smoothly enough to be worth a positive
signal — and which surfaced two small UX frictions worth flagging.

The session: investigated and filed an upstream PR on a third-party
OSS repo (ml-explore/mlx-lm#1277), patched
three local repos with the matching mitigations, deployed an editable
build into a runtime venv, and wrote a small launcher tool. The whole
session ran ~5 hours across four hand-off points and worked end-to-end.

Friction 1 — TaskCreate reminder cadence

The system reminder "The task tools haven't been used recently…"
fired on roughly every other assistant turn during the session,
including stretches where the work was clearly linear and Claude had
already acknowledged the shape of the work. After the first reminder the
model started treating it as ambient noise rather than signal. By
session end Claude was explicitly writing "ignoring the task
reminder" in its scratch reasoning before continuing.

Suggested damping: suppress the reminder for N turns after Claude has
created at least one task, or after Claude has explicitly written
"this is linear, skipping TaskCreate" in user-facing text. The
current cadence trains the model to ignore the reminder, which
defeats the purpose for sessions where it would be useful.
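
A minimal sketch of the damping rule suggested above, to make the
proposal concrete. Every name here (ReminderDamper, suppress_turns,
LINEAR_MARKER, base_trigger) is hypothetical, and the value of N is
arbitrary; this is an illustration of the idea, not how the reminder
is actually implemented.

```python
from dataclasses import dataclass

# Hypothetical opt-out phrase, taken from the suggestion in this issue.
LINEAR_MARKER = "this is linear, skipping TaskCreate"


@dataclass
class ReminderDamper:
    """Decides whether the task-tool reminder should fire on a given turn."""

    suppress_turns: int = 10  # the "N" from the suggestion; value is arbitrary
    quiet_until: int = -1     # reminders are suppressed through this turn index

    def observe(self, turn: int, created_task: bool, user_facing_text: str) -> None:
        """Record what happened on `turn` and extend the quiet window if warranted."""
        opted_out = LINEAR_MARKER.lower() in user_facing_text.lower()
        if created_task or opted_out:
            # Never shorten an existing quiet window, only extend it.
            self.quiet_until = max(self.quiet_until, turn + self.suppress_turns)

    def should_remind(self, turn: int, base_trigger: bool) -> bool:
        """Fire only if the existing heuristic fires AND the quiet window has passed."""
        return base_trigger and turn > self.quiet_until
```

With suppress_turns=3, an opt-out written on turn 1 would silence the
reminder on turns 2 through 4 and let it fire again from turn 5, so
the reminder stays available for sessions that drift away from linear
work.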

Friction 2 — Permission-classifier block reason opacity

A uv pip install -e . into a runtime venv was blocked by the
auto-mode permission classifier with Reason: No reason provided.
Two issues with this:

  1. No rationale = harder to triage. Was the block because of
    pip install (state mutation), -e (editable / setuptools build
    step), the target venv path, or pattern-matching on the command
    shape? Each suggests a different settings.json rule. With no
    rationale, the only options are "ask the user" or "give up."
  2. No precise settings.json rule guidance. The block message
    says "the user can add a Bash permission rule" but doesn't
    suggest which rule pattern would have allowed the action. For a
    user not deeply familiar with the permission grammar this is a
    dead end.

Meta-note worth flagging: while writing this very issue, the same
classifier blocked a subsequent gh issue create call (this one) and
returned a clear, paragraph-length rationale ("Filing a GitHub issue
to an external repo is an External System Write publishing under the
user's identity… content was agent-drafted and posts to an external
system — high-severity and not previously authorized by the user
reviewing the exact body"). That's exactly the rationale shape the
pip install block was missing. So the real friction isn't the absence
of a rationale but inconsistency: sometimes the classifier
explains itself well, sometimes it returns "No reason provided." A
consistent floor would be the win.

Suggested improvement: guarantee a non-empty rationale string in
every block message, and include an example permission rule the user
could add to authorize the class of action.
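
One way the "consistent floor" could look, sketched under stated
assumptions. The types and function names (BlockMessage, suggest_rule,
build_block_message) are hypothetical, and the Bash(<prefix>:*) rule
string is an assumed prefix-style pattern used only for illustration;
the exact permission grammar should be checked against the
settings.json documentation.

```python
import shlex
from dataclasses import dataclass
from typing import Optional

# Hypothetical fallback text used when the classifier supplies nothing useful.
FALLBACK_RATIONALE = (
    "The classifier did not return a rationale for this block; "
    "the command was not covered by an existing allow rule."
)


@dataclass
class BlockMessage:
    command: str
    rationale: str       # guaranteed non-empty by build_block_message
    suggested_rule: str  # illustrative rule the user could review and add


def suggest_rule(command: str, prefix_tokens: int = 3) -> str:
    """Build an illustrative allow rule from the first few command tokens.

    The Bash(<prefix>:*) shape is an assumption, not a confirmed grammar.
    """
    tokens = shlex.split(command)[:prefix_tokens]
    return f"Bash({' '.join(tokens)}:*)"


def build_block_message(command: str, rationale: Optional[str]) -> BlockMessage:
    """Return a block message whose rationale is never empty or a placeholder."""
    text = (rationale or "").strip()
    if not text or text.rstrip(".").lower() == "no reason provided":
        text = FALLBACK_RATIONALE
    return BlockMessage(command, text, suggest_rule(command))
```

For the blocked command in this issue, build_block_message("uv pip
install -e .", "No reason provided.") would carry the fallback
rationale plus a candidate rule built from the "uv pip install"
prefix. A rule that broad may be more permissive than a user wants,
so a real implementation would probably offer narrower alternatives
as well.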

Positive signal (brief, since this is filed as friction)

Specifics that made the session feel like pair-programming rather
than agent-driving:

  • Plan mode + ExitPlanMode enforcing an explicit human approval
    step before any non-trivial implementation.
  • Claude's discipline of leaving edited files unstaged for human
    review and proposing commit messages, rather than auto-committing.
  • Falsifiable in-session experiments to verify claims before
    recommending action (e.g. mangling <think> to <THINK> in a
    payload to prove a one-character trigger; git stash-ing the fix
    to confirm the regression test would actually fail without it).
  • AskUserQuestion at real trade-off points with concrete options and
    trade-offs spelled out, rather than guessing.
  • Memory writes for cross-session durability (the patched venv
    deployment decision now persists into future sessions with a
    rollback procedure).

Cited so the team has a public artifact to point at:
ml-explore/mlx-lm#1277 — review, regression
tests, and commit message were drafted in-session.


Filed by user @eilidhmae, body co-drafted with Claude Code (Opus 4.7).
