LLM and Agent usage #134

@jpmckinney


Tips

Project selection

Choose projects:

  • With high test coverage. If coverage is low, first use the agent to add more tests.
  • With soundness checks, via the Rust compiler or a Python type checker (like mypy, but see issue #114, "Look into ty and pyright, compare to mypy, and document our type checking approach"). If type annotations are missing in Python, first use the agent to add them.
  • With strict linters and formatters. Our default is to enable all rules in ruff and cargo.
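
As a rough illustration of why annotations help, here is a minimal sketch of an annotated function. The name `parse_amount` and its behavior are hypothetical, not taken from any of our projects. With the annotations in place, a type checker such as mypy can flag a call like `parse_amount(12)` before any test runs, which gives the agent fast feedback on its own changes.

```python
from typing import Optional


def parse_amount(value: Optional[str]) -> float:
    """Convert a string like "1,234.50" to a float; None or blank becomes 0.0."""
    # Hypothetical helper, for illustration only.
    if value is None or not value.strip():
        return 0.0
    # Drop thousands separators before converting.
    return float(value.replace(",", ""))
```

A few tests covering the `None`, blank and comma cases would give the agent a second safety net alongside the type checker.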

Task selection

Choose tasks:

  • Where you are an expert in the project and task. Do not ask the LLM to solve or build something you couldn't do on your own with your current knowledge.

Agents are also good at:

  • Complex refactoring, e.g. prompting to:
    • rewrite contributed code in the style of the rest of the file or project
    • reduce duplication, single-use functions, single-use variables and single-letter variables
    • extract methods
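
To make the refactoring bullets concrete, here is a small before-and-after sketch of the kind of change such a prompt might produce. The functions and the shape of the `order` dictionary are invented for illustration.

```python
def total_before(o):
    # Before: duplicated loops, single-letter names, single-use variables.
    t = 0
    for i in o["items"]:
        p = i["price"] * i["quantity"]
        t += p
    s = 0
    for i in o["shipping"]:
        p = i["price"] * i["quantity"]
        s += p
    return t + s


def subtotal(lines):
    """Sum price times quantity over a list of order lines."""
    return sum(line["price"] * line["quantity"] for line in lines)


def total_after(order):
    # After: the duplicated loop is extracted into subtotal(), which is used twice.
    return subtotal(order["items"]) + subtotal(order["shipping"])
```

Because both versions must return the same result, existing tests (see "Project selection") are what let you accept a refactoring like this with confidence.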

Planning

  • For complex requests, write a specification, and ask the LLM to update it as it makes progress. This planning step can be assisted by a different model. This blog post has examples.

Steering

  • Agents sometimes "anchor" to a particular direction. When steering the agent off that direction, make sure to delete any generated code in that direction.

Edit prediction

  • Edit prediction is especially useful when refactoring: after making a change (e.g. to a function signature), you can press tab to advance to the next part of the code that needs to be changed.
  • Otherwise, it is often preferable to hide predictions by default (like in Zed's subtle mode), to have fewer distracting suggestions. (Cursor has an open feature request.)

git

  • Use git worktree so that you and the agent can work on the repository at the same time, independently.
  • Use git add to stage accepted changes between prompts, and git diff to see incremental changes.
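
The git workflow above can be sketched as a small helper that only builds the command lines, so it can be read or adapted without touching a repository. The branch name, path and phase names are hypothetical. The git subcommands themselves (`git worktree add -b`, `git add --all`, `git diff`, `git worktree remove`) are standard.

```python
import shlex


def agent_worktree_commands(branch: str, path: str) -> dict[str, list[list[str]]]:
    """Return git commands for each phase of working alongside an agent."""
    return {
        # Give the agent its own checkout on a new branch.
        "setup": [["git", "worktree", "add", "-b", branch, path]],
        # Stage the changes you accept after a prompt.
        "checkpoint": [["git", "add", "--all"]],
        # Show only what changed since the last checkpoint.
        "review": [["git", "diff"]],
        # Remove the extra checkout when finished.
        "cleanup": [["git", "worktree", "remove", path]],
    }


def as_shell(commands: list[list[str]]) -> list[str]:
    """Render argument lists as shell-quoted strings for display."""
    return [shlex.join(command) for command in commands]
```

For example, `as_shell(agent_worktree_commands("agent-refactor", "../repo-agent")["setup"])` yields the command to type in a terminal. Each argument list could also be passed to `subprocess.run`.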

Cursor

  • Create a Cursor Rule whenever you correct the agent, so that you don't need to make the same correction in future. The blog post describes an example workflow.
  • When working with Python notebooks, explicitly set the model to Sonnet (from Claude). Otherwise, the app knows that it needs Sonnet (and displays a message), but the LLM keeps retrying changes that it can't make.
  • Accept or reject changes before changing the code and re-prompting the agent. Otherwise, the agent tends to ignore your changes.
  • Be careful about using undo/redo if the agent has made changes. If you did not accept a changeset, then "redo" might not work to re-suggest those changes.

Recommended reading

Things I read but that were not particularly useful:

Untested

GitHub Copilot

MCPs

In general, try without MCPs. It's easy to load up on MCPs, but: (1) the MCPs might fill up the context, adding "noise" to the "signal" (your prompt); (2) the agent's behavior might become even more unpredictable and harder to steer, under the influence of additional MCPs; (3) the agent needs to know which MCP to use when, which you need to configure; and (4) the authors of the agent have already tried to make it as good as possible, and the MCPs might duplicate or conflict with functionality/tuning of the agent. In other words, only add an MCP that solves a real problem you have experienced.

That said, these are likely useful:

Otherwise, if the agent has access to a terminal, you can instead write a script or Makefile and ask the agent to use that (e.g. to run tests and lints, read logs, etc.).
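
As one possible shape for such a script, here is a minimal sketch in Python. The check names and default commands are assumptions to adjust per project. The idea is that the agent runs one entry point and gets a short summary, with full output only on failure, which keeps its context small.

```python
import subprocess
import sys

# Hypothetical defaults; replace with the tools the project actually uses.
DEFAULT_CHECKS = {
    "tests": ["pytest", "-q"],
    "lint": ["ruff", "check", "."],
    "types": ["mypy", "."],
}


def run_checks(checks: dict[str, list[str]]) -> dict[str, int]:
    """Run each command and return its exit code, keyed by check name."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(
            command, capture_output=True, text=True, check=False
        )
        results[name] = completed.returncode
        status = "ok" if completed.returncode == 0 else "FAILED"
        print(f"[{name}] {status}: {' '.join(command)}")
        if completed.returncode != 0:
            # Show captured output only for failing checks.
            print(completed.stdout + completed.stderr, file=sys.stderr)
    return results
```

You could then tell the agent to call `run_checks(DEFAULT_CHECKS)` after each change, instead of configuring an MCP for running tests and lints.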

Reference

Other learning
