feat: AI-policy#1149
Conversation
jfindlay
left a comment
There was a problem hiding this comment.
Sorry for all the comments.
| @@ -0,0 +1,65 @@ | |||
| # AI policy | |||
There was a problem hiding this comment.
I still strongly suggest splitting the guidance into human/agent files. The human guidance can go into an existing file somewhere or stay here, but the agent instructions should go into ./AGENTS.md. For example, GPT 5 and Opus 4.8 have this to say:
how likely for a code agent to automatically find
AI-POLICY.mdin the root of a project without hints, guidance, or prompting?
[GPT 5] Short answer: unlikely in general — roughly 5–30% chance without hints, depending on the agent design.
[Opus 4.8] Short answer: low and highly conditional — without any hint, guidance, or prompting, a code agent is unlikely to read AI-POLICY.md as a matter of course. The probability is dominated by a few structural factors, not by the file's mere existence.
| 3.2. PRs should be split into atomic commits where it makes sense. | ||
|
|
||
| 3.3. Commit titles must describe the lemmas or definitions added or changed. | ||
|
|
There was a problem hiding this comment.
| 3.4. Commit messages should include [a conventional](https://git-scm.com/docs/SubmittingPatches.html#sign-off) `Co-authored-by: [agent-name] [agent-version] <no-reply+[agent-name]@[agent-provider].[tld]>` trailer. |
|
|
||
| 3.3. Commit titles must describe the lemmas or definitions added or changed. | ||
|
|
||
| 3.4. PRs must be focused and have well-defined scope. |
There was a problem hiding this comment.
| 3.4. PRs must be focused and have well-defined scope. | |
| 3.4. PRs must be focused and have well-defined scope. For new work, try to include only a single main result and only the intermediate lemmas strictly needed for it. For revisions and refactors, keep the PR focused on a single main result. |
|
|
||
| 1.6. These guidelines apply to any contribution where an AI tool produced more than trivial assistance. | ||
|
|
||
| 1.7. The human author is fully responsible for every line of the PR. |
There was a problem hiding this comment.
[Human language]
| 1.7. The human author is fully responsible for every line of the PR. | |
| 1.7. The human author(s) are fully responsible for every line of the PR, however it was produced. A human must vouch that each definition, theorem statement, and proof step means what it claims. A clean Lean build proved what was written, but only the human can certify what was written is what they meant. |
|
|
||
| 2.2. Added results must be placed in sections with appropriate titles. Sections must be numbered using the scheme `# A. ...`, `## A.1. ...`, etc. See [Physlib/ClassicalMechanics/HarmonicOscillator/Basic.lean](Physlib/ClassicalMechanics/HarmonicOscillator/Basic.lean) for an example. | ||
|
|
||
| 2.3. Results must be placed in the appropriate file, with the existing library structure in mind. Do not introduce new files without good reason. |
There was a problem hiding this comment.
"appropriate" is doing a lot of work here that I feel like could be further unpacked. Physics is attractive for it's self-evident organization, but collaboration invariably finds competing taxonomies. Perhaps there is some established reference we could point to for precedent, like PDG or similar?
|
|
||
| 2.3. Results must be placed in the appropriate file, with the existing library structure in mind. Do not introduce new files without good reason. | ||
|
|
||
| 2.4. Long proofs (over 50 LoC) should, where sensible, be split into smaller lemmas of general applicability. |
There was a problem hiding this comment.
[Agent language]
| 2.4. Long proofs (over 50 LoC) should, where sensible, be split into smaller lemmas of general applicability. | |
| 2.4. Proof structuring | |
| These are reviewer-judgment questions, not mechanical rules. An agent **should surface a recommendation** when one of the conditions below is met and let a human make the structural call; an agent should **not** split or relocate lemmas unilaterally. | |
| A proof can mix concerns in three distinct ways — **depth**, **breadth**, and **computation/conceptual**. These can co-occur in a single proof; flag whichever discriminator fires and let a human resolve any overlap. | |
| 2.4.1. Topic and layer separation | |
| **Type 1 — Depth mixing** (general plumbing inlined under a physics statement). A self-contained, lower-abstraction sub-argument doing work inside a domain proof. | |
| **Type 2 — Breadth mixing** (two physics topics at the same level). Two distinct physics topics reasoned about in one proof, where each half stands alone. | |
| **Type 3 — Computation vs. conceptual mixing.** A long mechanical calculation welded onto a structural (existence / uniqueness / limit) argument. | |
| **Discriminator table.** | |
| | Type | Discriminator | What the extracted lemma's *statement* looks like | | |
| |---|---|---| | |
| | **Depth** | Contiguous block cites only general (`Finset.*`, `ring`, Mathlib) lemmas, untouched by physics constants | Contains **zero** physics names | | |
| | **Breadth** | Proof cites lemmas from **two disjoint physics topic namespaces** | Each piece sits in **one** topic namespace | | |
| | **Computation/conceptual** | Large contiguous `calc`/`ring_nf`/`field_simp`/`positivity` slab under a structural top-level tactic | A **calculation** lemma (may keep physics constants) | | |
| | **None — leave inline** | Candidate isn't a contiguous subgoal-closing block; or it's one indivisible argument | n/a | | |
| > General rule: if the candidate block cannot be cleanly cut as a contiguous run that closes its own subgoal, do not recommend splitting it. | |
| 2.4.2. Independent sub-results | |
| Distinct from 2.4.1 (which is about *mixing kinds* of reasoning), this asks whether the proof contains a self-contained sub-result that is *independently meaningful* — one another lemma might cite. If so, recommend promoting it to its own lemma. Weigh this against the cost: each new lemma adds a node to the dependency graph and a name to maintain. Promote when the sub-result earns its name through reuse or conceptual significance; leave it inline when it is merely an intermediate step of this one proof. | |
| 2.4.3. Length as a smell | |
| Length is a prompt to ask 2.4.1 and 2.4.2, not an independent reason to split. If a proof is long *and* cannot be decomposed by the principles above flag it for a reviewer but do not manufacture artificial sub-lemmas to shorten it, especially for foundational or core-dependency results where added inter-lemma structure can introduce more fragility than the length costs in readability. | |
|
|
||
| 2.6. All content must pass the linters described in [scripts/README.md](scripts/README.md). | ||
|
|
||
| 2.7. Contributions must not contain `axiom` declarations. ch. |
There was a problem hiding this comment.
typo?
| 2.7. Contributions must not contain `axiom` declarations. ch. | |
| 2.7. Contributions must not contain `axiom` declarations. |
|
|
||
| ## 3. Rules for the structure of the PR | ||
|
|
||
| 3.1. A pull request that adds new content (as opposed to refactoring existing content) should contain less than 250 lines of diff. Split larger contributions into multiple PRs. |
There was a problem hiding this comment.
| 3.1. A pull request that adds new content (as opposed to refactoring existing content) should contain less than 250 lines of diff. Split larger contributions into multiple PRs. | |
| 3.1. Pull request scope | |
| A pull request should add a **single coherent concept**. Cohesion is the governing criterion, **not** line count. | |
| Prefer small PRs whenever possible. A large PR that develops one concept to completion is preferable to a small PR that mixes two unrelated concerns. The relevant questions are: | |
| - **Is this one concept?** Every definition and lemma should either *be* that concept or supply the minimal API needed to state and prove it. | |
| - **Does it match the surrounding house pattern?** Where a directory has an established granularity (e.g. one concept per file, developed with its full API), conform to it. | |
| - **Is the code internally navigable?** A long file that has a module docstring with an overview, key results, and a table of contents provides the reviewability that splitting would otherwise provide without fragmenting one concept across multiple review cycles. | |
| **Line count is a smell, not a gate.** Physlib's per-PR length rules live in the [Review Guidelines](docs/ReviewGuidelines.md). Treat those thresholds as a prompt to *ask the cohesion questions above*, not as a hard limit: | |
| - When a PR passes the "very large" band, ask whether it is really one concept or several, and split **only along conceptual seams**. | |
| - The thresholds are calibrated for **new content**. Pure refactors, file reorganizations, and documentation PRs may safely run larger, since reviewer load per line is far lower. | |
| - An agent **should surface** an oversized-PR concern for a human to weigh; it should not split a PR unilaterally to satisfy a line count. |
|
or, all together, this is what I would recommend: #1187. |
Added an AI-policy.
This was human generated, with an LLM used to improve grammar and wording.