Skip to content

feat: AI-policy#1149

Draft
jstoobysmith wants to merge 3 commits into
leanprover-community:masterfrom
jstoobysmith:AI-Policy
Draft

feat: AI-policy#1149
jstoobysmith wants to merge 3 commits into
leanprover-community:masterfrom
jstoobysmith:AI-Policy

Conversation

@jstoobysmith

Copy link
Copy Markdown
Member

Added an AI-policy.

This was human generated, with an LLM used to improve grammar and wording.

@jstoobysmith jstoobysmith marked this pull request as draft June 5, 2026 12:51
@jstoobysmith jstoobysmith added the awaiting-author A reviewer has asked the author a question or requested changes label Jun 11, 2026
@jstoobysmith jstoobysmith removed the awaiting-author A reviewer has asked the author a question or requested changes label Jun 14, 2026

@jfindlay jfindlay left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for all the comments.

Comment thread AI-POLICY.md
@@ -0,0 +1,65 @@
# AI policy

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still strongly suggest splitting the guidance into human/agent files. The human guidance can go into an existing file somewhere or stay here, but the agent instructions should go into ./AGENTS.md. For example, GPT 5 and Opus 4.8 have this to say:

how likely for a code agent to automatically find AI-POLICY.md in the root of a project without hints, guidance, or prompting?

[GPT 5] Short answer: unlikely in general — roughly 5–30% chance without hints, depending on the agent design.

[Opus 4.8] Short answer: low and highly conditional — without any hint, guidance, or prompting, a code agent is unlikely to read AI-POLICY.md as a matter of course. The probability is dominated by a few structural factors, not by the file's mere existence.

Comment thread AI-POLICY.md
3.2. PRs should be split into atomic commits where it makes sense.

3.3. Commit titles must describe the lemmas or definitions added or changed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
3.4. Commit messages should include [a conventional](https://git-scm.com/docs/SubmittingPatches.html#sign-off) `Co-authored-by: [agent-name] [agent-version] <no-reply+[agent-name]@[agent-provider].[tld]>` trailer.

Comment thread AI-POLICY.md

3.3. Commit titles must describe the lemmas or definitions added or changed.

3.4. PRs must be focused and have well-defined scope.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
3.4. PRs must be focused and have well-defined scope.
3.4. PRs must be focused and have well-defined scope. For new work, try to include only a single main result and only the intermediate lemmas strictly needed for it. For revisions and refactors, keep the PR focused on a single main result.

Comment thread AI-POLICY.md

1.6. These guidelines apply to any contribution where an AI tool produced more than trivial assistance.

1.7. The human author is fully responsible for every line of the PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Human language]

Suggested change
1.7. The human author is fully responsible for every line of the PR.
1.7. The human author(s) are fully responsible for every line of the PR, however it was produced. A human must vouch that each definition, theorem statement, and proof step means what it claims. A clean Lean build proved what was written, but only the human can certify what was written is what they meant.

Comment thread AI-POLICY.md

2.2. Added results must be placed in sections with appropriate titles. Sections must be numbered using the scheme `# A. ...`, `## A.1. ...`, etc. See [Physlib/ClassicalMechanics/HarmonicOscillator/Basic.lean](Physlib/ClassicalMechanics/HarmonicOscillator/Basic.lean) for an example.

2.3. Results must be placed in the appropriate file, with the existing library structure in mind. Do not introduce new files without good reason.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"appropriate" is doing a lot of work here that I feel like could be further unpacked. Physics is attractive for it's self-evident organization, but collaboration invariably finds competing taxonomies. Perhaps there is some established reference we could point to for precedent, like PDG or similar?

Comment thread AI-POLICY.md

2.3. Results must be placed in the appropriate file, with the existing library structure in mind. Do not introduce new files without good reason.

2.4. Long proofs (over 50 LoC) should, where sensible, be split into smaller lemmas of general applicability.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Agent language]

Suggested change
2.4. Long proofs (over 50 LoC) should, where sensible, be split into smaller lemmas of general applicability.
2.4. Proof structuring
These are reviewer-judgment questions, not mechanical rules. An agent **should surface a recommendation** when one of the conditions below is met and let a human make the structural call; an agent should **not** split or relocate lemmas unilaterally.
A proof can mix concerns in three distinct ways — **depth**, **breadth**, and **computation/conceptual**. These can co-occur in a single proof; flag whichever discriminator fires and let a human resolve any overlap.
2.4.1. Topic and layer separation
**Type 1 — Depth mixing** (general plumbing inlined under a physics statement). A self-contained, lower-abstraction sub-argument doing work inside a domain proof.
**Type 2 — Breadth mixing** (two physics topics at the same level). Two distinct physics topics reasoned about in one proof, where each half stands alone.
**Type 3 — Computation vs. conceptual mixing.** A long mechanical calculation welded onto a structural (existence / uniqueness / limit) argument.
**Discriminator table.**
| Type | Discriminator | What the extracted lemma's *statement* looks like |
|---|---|---|
| **Depth** | Contiguous block cites only general (`Finset.*`, `ring`, Mathlib) lemmas, untouched by physics constants | Contains **zero** physics names |
| **Breadth** | Proof cites lemmas from **two disjoint physics topic namespaces** | Each piece sits in **one** topic namespace |
| **Computation/conceptual** | Large contiguous `calc`/`ring_nf`/`field_simp`/`positivity` slab under a structural top-level tactic | A **calculation** lemma (may keep physics constants) |
| **None — leave inline** | Candidate isn't a contiguous subgoal-closing block; or it's one indivisible argument | n/a |
> General rule: if the candidate block cannot be cleanly cut as a contiguous run that closes its own subgoal, do not recommend splitting it.
2.4.2. Independent sub-results
Distinct from 2.4.1 (which is about *mixing kinds* of reasoning), this asks whether the proof contains a self-contained sub-result that is *independently meaningful* — one another lemma might cite. If so, recommend promoting it to its own lemma. Weigh this against the cost: each new lemma adds a node to the dependency graph and a name to maintain. Promote when the sub-result earns its name through reuse or conceptual significance; leave it inline when it is merely an intermediate step of this one proof.
2.4.3. Length as a smell
Length is a prompt to ask 2.4.1 and 2.4.2, not an independent reason to split. If a proof is long *and* cannot be decomposed by the principles above flag it for a reviewer but do not manufacture artificial sub-lemmas to shorten it, especially for foundational or core-dependency results where added inter-lemma structure can introduce more fragility than the length costs in readability.

Comment thread AI-POLICY.md

2.6. All content must pass the linters described in [scripts/README.md](scripts/README.md).

2.7. Contributions must not contain `axiom` declarations. ch.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Suggested change
2.7. Contributions must not contain `axiom` declarations. ch.
2.7. Contributions must not contain `axiom` declarations.

Comment thread AI-POLICY.md

## 3. Rules for the structure of the PR

3.1. A pull request that adds new content (as opposed to refactoring existing content) should contain less than 250 lines of diff. Split larger contributions into multiple PRs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
3.1. A pull request that adds new content (as opposed to refactoring existing content) should contain less than 250 lines of diff. Split larger contributions into multiple PRs.
3.1. Pull request scope
A pull request should add a **single coherent concept**. Cohesion is the governing criterion, **not** line count.
Prefer small PRs whenever possible. A large PR that develops one concept to completion is preferable to a small PR that mixes two unrelated concerns. The relevant questions are:
- **Is this one concept?** Every definition and lemma should either *be* that concept or supply the minimal API needed to state and prove it.
- **Does it match the surrounding house pattern?** Where a directory has an established granularity (e.g. one concept per file, developed with its full API), conform to it.
- **Is the code internally navigable?** A long file that has a module docstring with an overview, key results, and a table of contents provides the reviewability that splitting would otherwise provide without fragmenting one concept across multiple review cycles.
**Line count is a smell, not a gate.** Physlib's per-PR length rules live in the [Review Guidelines](docs/ReviewGuidelines.md). Treat those thresholds as a prompt to *ask the cohesion questions above*, not as a hard limit:
- When a PR passes the "very large" band, ask whether it is really one concept or several, and split **only along conceptual seams**.
- The thresholds are calibrated for **new content**. Pure refactors, file reorganizations, and documentation PRs may safely run larger, since reviewer load per line is far lower.
- An agent **should surface** an oversized-PR concern for a human to weigh; it should not split a PR unilaterally to satisfy a line count.

@jfindlay

Copy link
Copy Markdown
Contributor

or, all together, this is what I would recommend: #1187.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants