LLM and Agent usage

## Tips

### Project selection

Choose projects:

* With high test coverage. If coverage is lacking, have the agent add tests first.
* With soundness checks, such as the Rust compiler or a Python type checker (like mypy, but see https://github.com/open-contracting/software-development-handbook/issues/114). If type annotations are missing in Python, have the agent add them first.
* With strict linters and formatters. Our default is to enable all rules in ruff and cargo.
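
As a minimal sketch of why annotations help, consider the function below. The name `total_amount` and its data are hypothetical, chosen only for illustration. Once the signature is annotated, a type checker such as mypy is expected to flag a call that passes the wrong type, before the agent's change is ever run.

```python
from collections.abc import Iterable


def total_amount(amounts: Iterable[float]) -> float:
    """Sum monetary amounts (hypothetical example function)."""
    return sum(amounts, 0.0)


# A type checker is expected to reject the call below, because a list of
# strings is not an Iterable[float]. It is commented out because it would
# also raise a TypeError at runtime.
# total_amount(["1.5", "2.5"])

print(total_amount([1.5, 2.5]))
```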

### Task selection

Choose tasks:

* Where you are an expert in the project and task. Do not ask the LLM to solve or build something you couldn't do on your own with your current knowledge.

Agents are also good at:

* Complex refactoring, e.g. prompting to:
  * rewrite contributed code in the style of the rest of the file or project
  * reduce duplication, single-use functions, single-use variables and single-letter variables
  * extract methods
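
To make the refactoring prompts above concrete, here is a hypothetical before-and-after pair. Both function names are invented for this sketch. The first version has the single-letter and single-use variables that such a prompt targets; the second is one plausible result, not the only acceptable one.

```python
def mean_before(l):
    # Single-letter names and single-use variables obscure the intent.
    s = 0
    for x in l:
        s = s + x
    n = len(l)
    r = s / n
    return r


def mean_after(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)
```

The existing tests (see "Project selection") are what confirm that the two versions behave the same.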

### Planning

* For complex requests, write a specification, and ask the LLM to update it as it makes progress. A different model can assist with this planning step. This [blog post](https://ghuntley.com/specs/) has examples.

### Steering

* Agents sometimes "anchor" to a particular direction. When steering the agent away from that direction, make sure to delete any code it generated in that direction.

### Edit prediction

* When refactoring, this can be especially useful, e.g. after making a change, you can press `tab` to advance to the next part of the code that needs to be changed (e.g. when changing function signatures).
* Otherwise, it is often preferable to hide predictions by default (like in Zed's [subtle mode](https://zed.dev/docs/ai/edit-prediction)), to have fewer distracting suggestions. (Cursor has an [open feature request](https://forum.cursor.com/t/zed-style-subtle-mode/95832).)

### git

* Use [`git worktree`](https://git-scm.com/docs/git-worktree) so that you and the agent can work on the repository at the same time, independently.
* Use `git add` to stage accepted changes between prompts, and `git diff` to see incremental changes.
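
The git commands behind these two tips can be sketched as follows. The helper names, the example path and the branch name are hypothetical. The git subcommands and flags themselves (`git worktree add -b`, `git add --all`, `git diff`, `git diff --staged`) are standard.

```python
import subprocess
from pathlib import Path


def worktree_add_command(path: Path, branch: str) -> list[str]:
    """Build the command that checks out a new branch in its own directory."""
    return ["git", "worktree", "add", "-b", branch, str(path)]


def review_commands() -> dict[str, list[str]]:
    """Commands for reviewing the agent's work between prompts."""
    return {
        # Stage everything you have accepted so far.
        "stage": ["git", "add", "--all"],
        # Show only what changed since the last staging, i.e. the latest prompt.
        "incremental": ["git", "diff"],
        # Show everything accepted (staged) but not yet committed.
        "accepted": ["git", "diff", "--staged"],
    }


def run(command: list[str], cwd: Path) -> str:
    """Run a command inside a repository and return its standard output."""
    result = subprocess.run(command, cwd=cwd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run(worktree_add_command(Path("agent-tree"), "agent-task"), cwd=repository)` would give the agent its own checkout, where `repository` is the path to your existing clone.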

### [Cursor](https://www.cursor.com)

* Create a [Cursor Rule](https://docs.cursor.com/context/rules) whenever you correct the agent, so that you don't need to make the same correction in future. This [blog post](https://ghuntley.com/stdlib/) describes an example workflow.
* Explicitly set the model to Claude Sonnet when working with Python notebooks. Otherwise, the app knows that it needs Sonnet (and displays a message), but the LLM keeps retrying changes that it can't make.
* Accept or reject changes before changing the code and re-prompting the agent. Otherwise, the agent tends to ignore your changes.
* Be careful about using undo/redo if the agent has made changes. If you did not accept a changeset, then "redo" might not work to re-suggest those changes.

## Recommended reading

* Overall: https://simonwillison.net/2025/Jun/18/coding-agents/
* https://simonwillison.net/about/#subscribe

Things I read that were not particularly useful:

* https://devin.ai/agents101

## Untested

### GitHub Copilot

* https://docs.github.com/en/copilot/using-github-copilot/using-github-copilot-to-create-issues 
* https://docs.github.com/en/copilot/getting-started-with-github-copilot/getting-started-with-github-copilot-in-visual-studio-code

### MCPs

In general, try without MCPs. It's easy to load up on MCPs, but:

* The MCPs might fill up the context, adding "noise" to the "signal" (your prompt).
* The agent's behavior might become even more unpredictable and harder to steer, under the influence of additional MCPs.
* The agent needs to know which MCP to use when, which you need to configure.
* The authors of the agent have already tried to make it as good as possible, and the MCPs might duplicate or conflict with functionality/tuning of the agent.

In other words, only add an MCP that solves a real problem you have experienced.

That said, these are likely useful:

* https://github.com/microsoft/playwright-mcp (used)
* https://github.com/github/github-mcp-server
* https://docs.anthropic.com/en/docs/claude-code/mcp#example%3A-monitor-errors-with-sentry
* https://github.com/googleapis/genai-toolbox or similar for PostgreSQL

Otherwise, if the agent has access to a terminal, you can instead write a script or Makefile and ask the agent to use that (e.g. to run tests and lints, read logs, etc.).
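
A minimal sketch of such a script is below. The function name `run_checks` and the default command list are assumptions; substitute the project's actual test and lint commands. It runs each command in order and stops at the first failure, so that the agent sees one clear exit code.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in commands:
        print("$", " ".join(command), flush=True)
        completed = subprocess.run(command, check=False)
        if completed.returncode != 0:
            return completed.returncode
    return 0


# Hypothetical defaults: adjust to the tools the project actually uses.
DEFAULT_COMMANDS = [
    ["ruff", "check", "."],
    [sys.executable, "-m", "pytest"],
]
```

Saved as a file with `raise SystemExit(run_checks(DEFAULT_COMMANDS))` as its last line, this gives the agent a single command to run after each change.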

* https://github.com/oraios/serena: Language server. Works with [Claude Code](https://github.com/oraios/serena#claude-code), [Cursor](https://github.com/oraios/serena#other-mcp-clients-cline-roo-code-cursor-windsurf-etc), presumably [Zed](https://zed.dev/docs/ai/mcp#add-your-own-mcp-server).
* https://github.com/upstash/context7: Access up-to-date dependency documentation. [Works with](https://github.com/upstash/context7#%EF%B8%8F-installation) Claude Code, Cursor, Zed.
* https://github.com/BeehiveInnovations/zen-mcp-server: Access other LLMs from CLI agents like Claude Code or Gemini CLI. Has orchestration tools.

### Reference

* https://www.youtube.com/watch?v=nfOVgz_omlU (not yet watched)

## Other learning

* Fine-tuning: https://ai.google.dev/gemma/docs/core/huggingface_text_full_finetune
* Building MCPs: https://gofastmcp.com/getting-started/welcome
