Add an LLM policy for rust-lang/rust#1040
Conversation
r? @jieyouxu

rustbot has assigned @jieyouxu.

@rustbot label T-libs T-compiler T-rustdoc T-bootstrap
## Summary
[summary]: #summary

This document establishes a policy for how LLMs can be used when contributing to `rust-lang/rust`. Subtrees, submodules, and dependencies from crates.io are not in scope. Other repositories in the `rust-lang` organization are not in scope.

This policy is intended to live in [Forge](https://forge.rust-lang.org/) as a living document, not as a dead RFC. It will be linked from `CONTRIBUTING.md` in rust-lang/rust as well as from the rustc- and std-dev-guides.

## Moderation guidelines

This PR is preceded by [an enormous amount of discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/588130-project-llm-policy). Almost every conceivable angle has been discussed to death; there have been upwards of 3000 messages, not even counting discussion on GitHub. We initially doubted whether we could reach consensus at all.

Therefore, we ask to bound the scope of this PR specifically to the policy itself. In particular, we mark several topics as out of scope below. We still consider these topics to be important; we simply do not believe this is the right place to discuss them.

No comment on this PR may mention the following topics:

- Long-term social or economic impact of LLMs
- The environmental impact of LLMs
- Anything to do with the copyright status of LLM output
- Moral judgements about people who use LLMs

We have asked the moderation team to help us enforce these rules.

## Feedback guidelines

We are aware that parts of this policy will make some people very unhappy. As you are reading, we ask you to consider the following.

- Can you think of a *concrete* improvement to the policy that addresses your concern? Consider:
  - Whether your change will make the policy harder to moderate
  - Whether your change will make it harder to come to a consensus
- Does your concern need to be addressed before merging or can it be addressed in a follow-up?
- Keep in mind the cost of *not* creating a policy.

### If your concern is for yourself or for your team

- What are the *specific* parts of your workflow that will be disrupted?
  - In particular we are *only* interested in workflows involving `rust-lang/rust`. Other repositories are not affected by this policy and are therefore not in scope.
- Can you live with the disruption? Is it worth blocking the policy over?

---

Previous versions of this document were discussed on Zulip, and we have made edits in response to suggestions there.

## Motivation
[motivation]: #motivation

- Many people find LLM-generated code and writing deeply unpleasant to read or review.
- Many people find LLMs to be a significant aid to learning and discovery.
- `rust-lang/rust` is currently dealing with a deluge of low-effort "slop" PRs primarily authored by LLMs.
- Having *a* policy makes these easier to moderate, without having to take every single instance on a case-by-case basis.

This policy is *not* intended as a debate over whether LLMs are a good or bad idea, nor over the long-term impact of LLMs. It is only intended to set out the future policy of `rust-lang/rust` itself.

## Drawbacks
[drawbacks]: #drawbacks

- This bans some valid usages of LLMs. We intentionally err on the side of banning too much rather than too little in order to make the policy easy to understand and moderate.
- This intentionally does not address the moral, social, and environmental impacts of LLMs. These topics have been extensively discussed on Zulip without reaching consensus, but this policy is relevant regardless of the outcome of these discussions.
- This intentionally does not attempt to set a project-wide policy. We have attempted to come to a consensus for upwards of a month without significant progress. We are cutting our losses so we can have *something* rather than ad-hoc moderation decisions.
- This intentionally does not apply to subtrees of rust-lang/rust. We don't have the same moderation issues there, so we don't have time pressure to set a policy in the same way.

## Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- We could create a project-wide policy, rather than scoping it to `rust-lang/rust`. This has the advantage that everyone knows what the policy is everywhere, and that it's easy to make things part of the mono-repo at a later date. It has the disadvantage that we think it is nigh-impossible to get everyone to agree. There are also reasons for teams to have different policies; for example, the standard for correctness is much higher within the compiler than within Clippy.
- We could have a more strict policy that removes the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html) condition. This has the advantage that our policy becomes easier to moderate and understand. It has the disadvantage that it becomes easy for people to intend to follow the policy, but be put in a position where their only choices are to either discard the PR altogether, rewrite it from scratch, or tell "white lies" about whether an LLM was involved.
- We could have a more strict policy that bans LLMs altogether. It seems unlikely we will be able to agree on this, and we believe attempting it will cause many people to leave the project.

## Prior art
[prior-art]: #prior-art

This prior art section is taken almost entirely from [Jane Lusby's summary of her research](rust-lang/leadership-council#273 (comment)), although we have taken the liberty of moving the Rust project's prior art to the top. We thank her for her help.

### Rust

- [Moderation team's spam policy](https://github.com/rust-lang/moderation-team/blob/main/policies/spam.md/#fully-or-partially-automated-contribs)
- [Compiler team's "burdensome PRs" policy](rust-lang/compiler-team#893)

### Other organizations

These are organized along a spectrum of AI friendliness, where the top is least friendly and the bottom is most friendly.

- Full ban
  - [postmarketOS](https://docs.postmarketos.org/policies-and-processes/development/ai-policy.html)
    - also explicitly bans encouraging others to use AI for solving problems related to postmarketOS
    - multi-point ethics-based rationale with citations included
  - [zig](https://ziglang.org/code-of-conduct/)
    - philosophical, cites [Profession (novella)](https://en.wikipedia.org/wiki/Profession_(novella))
    - rooted in concerns around the construction and origins of original thought
  - [servo](https://book.servo.org/contributing/getting-started.html#ai-contributions)
    - more pragmatic, directly lists concerns around AI, fairly concise
  - [qemu](https://www.qemu.org/docs/master/devel/code-provenance.html#use-of-ai-content-generators)
    - pragmatic, focuses on copyright and licensing concerns
    - explicitly allows AI for exploring APIs, debugging, and other non-generative assistance; other policies do not explicitly ban this or mention it in any way
- Allowed with supervision, human is ultimately responsible
  - [scipy](https://github.com/scipy/scipy/pull/24583/changes)
    - strict attribution policy including name of model
  - [llvm](https://llvm.org/docs/AIToolPolicy.html)
  - [blender](https://devtalk.blender.org/t/ai-contributions-policy/44202)
  - [linux kernel](https://kernel.org/doc/html/next/process/coding-assistants.html)
    - quite concise but otherwise seems the same as many in this category
  - [mesa](https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/docs/submittingpatches.rst)
    - framed as a contribution policy not an AI policy; AI is listed as a tool that can be used, but it emphasizes the same requirement that the author must understand the code they contribute; seems to leave room for partial understanding from new contributors.
      > Understand the code you write at least well enough to be able to explain why your changes are beneficial to the project.
  - [forgejo](https://codeberg.org/forgejo/governance/src/branch/main/AIAgreement.md)
    - bans AI for review, does not explicitly require contributors to understand code generated by AI. One could interpret the "accountability for contribution lies with contributor even if AI is used" line as implying this requirement, though their version seems poorly worded imo.
  - [firefox](https://firefox-source-docs.mozilla.org/contributing/ai-coding.html)
  - [ghostty](https://github.com/ghostty-org/ghostty/blob/main/AI_POLICY.md)
    - pro-AI but views "bad users" as the source of issues with it and the only reason for what ghostty considers a "strict AI policy"
  - [fedora](https://communityblog.fedoraproject.org/council-policy-proposal-policy-on-ai-assisted-contributions/)
    - clearly inspired and is cited by many of the above, but is definitely framed more pro-AI than the derived policies tend to be
  - [curl](https://curl.se/dev/contribute.html#on-ai-use-in-curl)
    - does not explicitly require humans to understand contributions, otherwise the policy is similar to the above policies
  - [linux foundation](https://www.linuxfoundation.org/legal/generative-ai)
    - encourages usage, focuses on legal liability, mentions that tooling exists to help automate managing legal liability, does not mention specific tools
- In progress
  - NixOS
    - NixOS/nixpkgs#410741

## Unresolved questions
[unresolved-questions]: #unresolved-questions

See the "Moderation guidelines" and "Drawbacks" sections for the list of topics that are out of scope.
I really like this version, and thanks a ton for working on it. Specifically:
- It doesn't try to dump entire walls of text, which is unfortunately a good way to be sure nobody reads it. Instead, it gives you concrete examples and a guiding rule-of-thumb for uncovered scenarios, and acknowledges upfront that it surely cannot be exhaustive.
- I also like where it points out the nuance and recognizes the uncertainties.
- I like that it covers both "producers" and "consumers" (with nuance that reviewers can also technically use LLMs in ways that are frustrating to the PR authors!)
I left a few suggestions / nits, but even without them this is still a very good start IMO.
(Will not leave an explicit approval until we establish wider consensus, which will likely take the form of a 4-team joint FCP.)
The links to Zulip are project-private, FWIW.

I'm aware. This PR is targeted towards Rust project members more so than the broader community.
> ### "originally authored"
>
> This document uses the phrase "originally authored" to mean "text that was generated by an LLM (and then possibly edited by a human)".
> No amount of editing can change authorship; authorship sets the initial style and it is very hard to change once it's set.
Suggested change:

> In the manner the phrase is used in this policy, no amount of editing changes how something was "originally authored"; authorship sets the initial style and it is very hard to change once it's set.
Taking a different approach here, of narrowing the focus to the phrasing in this policy, rather than trying to get people to agree with the fully general statement.
> @@ -0,0 +1,116 @@
> ## Policy
Suggested change:

> ## Interim LLM Usage Policy
Adding a title that mentions LLM usage, and flagging this as interim to foreshadow the section at the end noting that policies may evolve.
I am hopeful that this is capturing a sentiment shared both by people who want the policy to be stricter and by people who want the policy to be less strict.
I stand by this policy. I would be happy for this to be a semi-permanent policy. We can of course edit it, but I consider "interim" to be a forward-looking statement and I don't want to make those in this policy.
> ### The meaning of "originally authored"
>
> This document uses the phrase "originally authored" to mean "text that was generated by an LLM (and then possibly edited by a human)".
I’m not comfortable with the definition of "originally authored" as written here. Authorship is something that applies to a person, not to tools; an LLM can generate text, but it isn’t an author.
I've been thinking for a while @jyn514 about what I think about these policy proposals. I think the bottom line for me is that I feel like we are working on pretty incomplete data and I feel ungreat about it.
I would prefer if we established an explicitly temporary policy now and, at the same time, a process and a timeline for establishing a better one. In those circumstances, I would be ok with a restrictive policy like this one. As it is, I feel uncomfortable with it, because it feels like we are letting a handful of loud voices dictate the direction we are going (I count myself amongst the handful of people who've been speaking loudly, though I don't see this policy reflecting my opinions in particular).
> - Usages that use LLMs for creation or show LLM output to another human are likely banned ❌
>
> This policy is not set in stone.
> We can evolve it as we gain more experience working with LLMs.
I would feel better if we made this policy explicitly time-limited or tied to a process of gathering more information.
Niko, you're one of the loudest voices trying to dictate the direction we're going. I would argue that a majority of the pushback against sensible policies like this one has come from you; since you're effectively the project manager for the project, your voice carries further than a dozen people's, and it feels like you're genuinely oblivious to this. Plus, a lot of the arguments you've offered have been from the position that whatever you think is reasonable is canonically reasonable, which is a perspective that resists all forms of negotiation.
We all agree that this policy is not going to be permanent, but a large portion of the project seems to be in agreement that this should be the policy we adopt until a project-wide policy is adopted.
It's also worth noting, since it's been brought up multiple times, that we don't do policy by majority vote. This is even true for a policy like this one: if we did majority vote, we'd just ban all LLM usage, but we're not doing that because we're willing to compromise.
Right now, it seems pretty unsubstantiated that a handful of voices have dictated this position. While it's true that a small number of people have been active in the policy channel, a majority of the project have pointed out their desire for a total ban on LLM usage. This policy, being noticeably more lenient than that, is a compromise from us. You should consider whether you're willing to compromise at all on your stance, and what compromise would mean for you.
As I mentioned in one of the discussions, I do think it's a false equivalence that both sides need to concede something, but if you don't even know what it means to compromise, then negotiation is utterly impossible. I really am not convinced that you understand what a compromise of the pro-LLM position would be, based upon the utter confusion you've expressed when mentioning that some of the contributions you've done would not be acceptable under some of the proposed policies.
I do not plan to actually engage in this conversation any further (I acknowledge my biases and know when to step out), but I think it's worth pointing out to the at-least-5 people who gave a thumbs-down reaction to my comment that I personally have a rule when it comes to this.
If I ever decide to mark my dissent on a comment with the thumbs-down emoji, I always reply explaining why unless everything I wish to say has already been said. Many times, the result is far more critical to the poster than a simple emoji, but I do this because I genuinely want people to understand why I feel a particular way, rather than just saying "I don't like this and will not explain why." We don't improve if we don't know what's wrong.
My above comment, in my mind, is required if I'm to give Niko's comment a thumbs-down reaction, because otherwise I'm being insincere to him and everyone else reading. I do not say that I disagree with something without saying why; in that case, it's better not to say anything at all.
Again, I acknowledge that my explanation can be deeply hurtful. Disagreement is a painful but necessary process. I also know that there are plenty of times where I have been excessively hurtful without providing the relevant constructive feedback, and think it's worth calling me out for that.
I don't apply my standards to anyone else. Lots of people just don't have time to write up a full response. But I personally, in these cases, simply don't respond at all.
So, consider whether your simple thumbs-down emoji constitutes genuinely useful feedback, or whether you're just being excessively hurtful instead. And, if you would like to express your dissent in private, I'm open to DMs on Zulip too; this is an open invitation to just say what you feel without a filter. It would be hypocritical of me to be so blunt with my opinions and not accept the same in kind.
I'll reply here since this is 'the thread', but I want to say first that I don't agree with much of what you wrote @clarfonthey. I believe Niko is raising a concern in good faith, though I'd like to understand it better.
> I would feel better if we made this policy explicitly time-limited or tied to a process of gathering more information.
@nikomatsakis, can you elaborate on specifically what information you think we should be seeking, and what process you're imagining for iterating towards a better policy? Are you seeking commitment from folks who have engaged so far to keep engaging in discussion on Zulip? Something else? I'd be happy to chat offline (Zulip or more sync meeting if you'd prefer) if that makes more sense.
I see @jyn514 left a comment below with some more data on project opinions, but it's not clear to me if that's the kind of data you're seeking, or something else. Could you elaborate on what you're looking for and what kinds of process/timeline you would find better than the copious discussion and iteration that has landed us on this (and some other) proposals?
I personally think a policy like this one that is relatively restrictive, but scope-limited and leaving room for usage in other areas of the Project gives us a good balance for continued input on where the world is and leaving the door open for private usage for those comfortable with doing so. That combination seems guaranteed to ensure we're not going to stop discussing since everyone seems to want something different from this policy, even if we manage to get to consensus on landing this in the meantime.
I appreciate the vote of confidence, Mark. And @clarfonthey I appreciate that I have reputational clout in the project -- though I'd also note that it doesn't usually translate into me getting my way without fighting for it tooth and nail. =) In any case, I wouldn't be speaking up this much if I didn't feel it was important.
To answer your questions Mark:
> Are you seeking commitment from folks who have engaged so far to keep engaging in discussion on Zulip?
No, I think the Zulip discussions are not useful. I want to see a more structured process. I think it would look like this:
- First, there'd be a group of people who are working to form a policy. This would be a representative, high-trust group that contains some folks from various positions here. And for the record, I don't particularly want to be in it. =)
- Second, I think it'd be useful to take the next step to making the qualitative data we gained on Rust Project Perspectives into quantitative data. I talked about this on Zulip a few times but one idea is to do targeted polling to try and figure out "how widely are each of the major families of concerns shared" and "what is the texture".
For example, @jyn514 has expressed openness to having a separate review queue for "LLM-authored content". How many others on the compiler team share that opinion? I have no idea. And of course @clarfonthey has expressed ethical concerns, and I don't really know how many people share that bright red line. And that's just existing maintainers; what about people who've opened PRs in the last year? How many of them work with LLMs at work or on a daily basis? What are their experiences like?
Another thing I'm very curious to understand, something I think could be useful, is -- what are people afraid of or hopeful for as a result of this policy. That might inform the conversation.
For example, for me, one of my big fears is that we will be distancing ourselves from future contributors, many of whom will be coding with LLMs. When Rust started, we made a deliberate choice to use Github and not Bugzilla because, frankly, Github is where the people are. I would be interested to see if the perspectives around LLM usage vary between existing maintainers and future contributors or along other lines.
According to https://rethinkpriorities.org/research-area/adoption-llms-tech-workers/, 91% of respondents have used LLMs for work, with 29% using them daily.
This data is a year old now, and I have many reasons to believe usage of tools like Claude Code has only increased since then, and dramatically at that. Of the four tech companies I have direct knowledge of, all four have gone from AI being used for coding by a minority of developers, to being used by almost every developer for coding in that time. In two of them using AI coding tools is practically mandatory. It's also quite clear to everyone that companies like Anthropic are struggling to keep up with the growth in usage.
I personally have approached AI with extreme skepticism from the beginning, and I still consider its functionality to be dramatically oversold by the companies selling it, but it's extremely widely used already, is already a very effective tool when used correctly, and I think @nikomatsakis is absolutely correct in thinking this will distance contributors who would use it, which is now essentially all new programmers.
At the company I work for, it's also the case that most people use AI quite a bit; however, multiple new employees have expressed that they don't want to use it much or at all because it could interfere with learning. They want to go from junior engineers to senior engineers, and the best way we know to do that is hands-on experience.
So I agree that it's become an industry standard (and policies which do not reflect this may be unsustainable); however, it's not necessarily true that all new programmers will be AI users initially.
@nikomatsakis You've talked a few times about having more quantitative data. I have some new quantitative data for you: people have been talking to me in private since this policy was posted, including several people who I haven't heard from publicly. Here are statistics on their opinions:

The 3 people who are unhappy all want a less restrictive policy, and they've all spoken publicly: it's you, TC, and Tobias.

Now, this isn't representative — I haven't heard from some people who I know weren't in favor of this policy, so this is probably leaning towards people who support it. But I still find it quite convincing as an indication that there isn't some kind of "silent majority" who oppose the policy.

I want to quote this bit from the PR description:

Do you think this policy is so restrictive that it's worse than not having a policy at all? If not, what do you suggest if your "timeline for establishing a better policy" elapses and we haven't come to a consensus?
> - Using machine-translation (e.g. Google Translate) from your native language without posting your original message.
>   Doing so can introduce new miscommunications that weren't there originally, and prevents someone who speaks the language from providing a better translation.
>   - ℹ️ Posting both your original message and the translated version is always ok, but you must still disclose that machine-translation was used.
> - Using an LLM as a "review bot" for PRs.
Maybe I'm OOTL but I find this section situationally strange — where did the "review bot" come from?
IME AI-powered review bots that directly participate in PR discussions (esp the "app" ones) are configured by the repository owner, but AFAIK r-l/r (which this policy applies solely to) does not have any such bots. I highly doubt a contributor will bring in their own review bot in public. So practically this has to be one of the following:
- someone requested a review from Copilot, which maybe we can opt out of?
- the reviewer outsourced the review work to a coding agent, which is already covered in the sections
- at least one team actually considered enabling such review bots in the future? as this is linked previously in that "Teams can have a policy that code can be merged without review" part, but I don't think this will ever happen given the stance of this policy
> I highly doubt a contributor will bring in their own review bot in public.
I wish it worked like that :( People can just trigger GitHub Copilot, or I suppose any other review bot, and let it comment on an r-l/r PR. Some people don't even do it willingly, but GH does it automatically for them, as GH Copilot has a tendency to re-enable itself even if you sometimes disable it.
It is also not possible to opt-out of the PR author requesting a Copilot review, if I remember correctly.
I’ve seen this behavior elsewhere on GitHub, where contributors effectively use a personal account as a kind of "review bot" to comment on PRs without approval from maintainers.
> It is also not possible to opt-out of the PR author requesting a Copilot review, if I remember correctly.
Yeah, currently disabling review is a personal/license-owner setting; it is not possible to configure from the repository PoV 😞 but I think this is something that we may bring up with GitHub.
It may be possible to use content exclusion to blind Copilot, but I'm not sure if this hack is going to produce any overreaching effects (e.g. affecting private IDE usage too).
> someone requested a review from Copilot, which maybe we can opt out of?
I think this is exactly the point of pointing that out in our policy. Some people trigger a "[at]copilot review" in our repos without asking us for consent. This is rude behaviour and we don't want that.
And, yes, as you point out, opting out of this "trigger" is currently only a project-wide setting, not a repository-level one, so we are checking with GitHub whether they could make this setting more fine-grained (there is a discussion with the Infra team here on Zulip).
Well I wouldn't say it's "rude" because GitHub's UI makes it so easy to request a review from Copilot, even manually 😅
IMO this is caused by a decision from upstream (GitHub), so it should be solved at the GitHub level. This policy about "review bots" is only a stop-gap workaround, and places the burden on the wrong parties. I think we should explore more options for disabling Copilot review "statically" (e.g. GitHub implementing such an option; banning Copilot (user ID 175728472) from making comments; using the instruction files; etc.), rather than relying on a policy.

Not saying that this section should be removed, but maybe it should informatively explain that "requesting review from Copilot on a PR is typically not welcomed".
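To make the "banning Copilot from making comments" option above concrete, here is a minimal, hypothetical sketch using the GitHub REST API's review-dismissal endpoint. Nothing in the policy prescribes this; the Copilot user ID is the one quoted above, and only reviews in a dismissable state (approvals or change requests) can be dismissed this way.

```python
# Hypothetical sketch only: dismiss unsolicited Copilot reviews on a single PR
# via the GitHub REST API. Whether this is preferable to a GitHub-side setting
# is exactly the open question in this thread.
import os

import requests

OWNER, REPO = "rust-lang", "rust"
COPILOT_USER_ID = 175728472  # Copilot reviewer ID mentioned above
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}


def dismiss_copilot_reviews(pull_number: int) -> None:
    """Dismiss any review on the given PR that was left by the Copilot reviewer."""
    reviews_url = f"https://api.github.com/repos/{OWNER}/{REPO}/pulls/{pull_number}/reviews"
    for review in requests.get(reviews_url, headers=HEADERS, timeout=30).json():
        if review["user"]["id"] != COPILOT_USER_ID:
            continue
        # Only APPROVED / CHANGES_REQUESTED reviews can be dismissed;
        # plain review comments would need a different mitigation.
        resp = requests.put(
            f"{reviews_url}/{review['id']}/dismissals",
            headers=HEADERS,
            json={"message": "Unsolicited LLM review; see the repository LLM policy."},
            timeout=30,
        )
        resp.raise_for_status()
```

A repository-level opt-out on GitHub's side would of course be the cleaner fix; this is only what a stop-gap could look like.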
Yes, I agree, it's a "UI decision" that ends up enabling rude or, if you prefer, unsettling behaviours. I'd like this policy to make that clear.
> ℹ️ Review bots that post without being approved by a maintainer will be banned.
I think this covers enough the "do not prompt LLM reviews on rust-lang/ without consent" part.
@clarfonthey I understand you are frustrated but it doesn't help to take it out on the people we're working with. Can I ask you to take a break from commenting on this RFC for a bit? Feel free to DM me with any concerns you have about the policy itself.
yeah, you're right; I deleted the comment
Sorry, I've said a lot here. I want to push back a bit on the data @jyn514: I wasn't included in that, and I am certainly on the side of "want it to be less restrictive". I think it's definitely not correct to think that there are only three people in the project that would like to see a "less restrictive" version of this - I could probably point out at least 3 people other than myself.
I've suggested some things here. In total, I think it's slightly less restrictive, but imo maintains a lot of the effect of this. Ultimately, I think it's pretty hard because we don't have some written shared notion of what problem(s) we're trying to solve and how the policy here does solve those.
I said it in one of my review comments, but regardless of what gets merged here, I don't want us to point to this and think "it's the end" of the discussion. I think it's more than just this policy "not just being set in stone", but rather I want us to think of it as "just a step" to more discussion and data.
> - Code changes that are originally authored by an LLM.
>   - This does not include "trivial" changes that do not meet the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html), which fall under ⚠️ below.
>     We understand that while asking an LLM research questions it may, unprompted, suggest small changes where there really isn't another way to write it.
>     However, you must still type out the changes yourself; you cannot give the LLM write access to your source code.
This is very weird to me. Either the change is small enough to be trivial, or it is not. I'm not sure what typing it out does?
Beyond this, it's not clear what this is aimed at. Is this aimed at when someone is conversing back and forth with an agent and they say "I suggest you do XYZ", or is this aimed at autocomplete-like code generation?
I've removed the requirement to type out the code yourself.
> - This does not include "trivial" changes that do not meet the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html), which fall under ⚠️ below.
>   We understand that while asking an LLM research questions it may, unprompted, suggest small changes where there really isn't another way to write it.
>   However, you must still type out the changes yourself; you cannot give the LLM write access to your source code.
> - We do not accept PRs made up solely of trivial changes.
This is really just not correct. We accept trivial changes all the time (e.g. renaming a struct because it's confusing).

It's sort of like what Josh is saying: what does "trivial" mean?
I've reworded this significantly, let me know what you think.
> LLM reviews, if enabled by a team, **must** be advisory-only.
> Teams can have a policy that code can be merged without review, and they can have a policy that code must be reviewed by at least one person,
Given that this is limited to rust-lang/rust, probably better to just restrict to no LLM reviews.
I actually really want to keep allowing LLM reviews. I think they're low-risk and give people a chance to see whether the bot catches real issues.
> - Documentation that is originally authored by an LLM.
>   - ℹ️ This includes non-trivial source comments, such as doc-comments or multiple paragraphs of non-doc-comments.
>   - ℹ️ This includes compiler diagnostics.
> - Code changes that are originally authored by an LLM.
This feels overly restrictive in the current wording, to the point that I'm not sure I'm comfortable *not* raising a concern as a compiler team member.
There is some nuance here that this doesn't capture that I think should be. Certainly, I think in general, I'm happy to ban "unsolicited" code that is LLM-generated, but I think that an outright ban on all "non-trivial" LLM-generated code is too strong. I'd like to see LLM-generated code allowed under the following strong caveats:
- The reviewer is pre-decided, and has agreed to review LLM-generated code
- Importantly, this does not mean a PR can be opened and then picked up by an "LLM-friendly" reviewer
- The code is well-reviewed (meaning, that the reviewer is committing to ensuring they fully understand the code, well enough that they could easily have written it themselves; and the author has also reviewed the code)
- Changes are "non-critical" (such as a non-compiler tool, code under a feature gate, diagnostics, etc.)
I personally think this is a pretty reasonable space to carve out for "experimentation": it doesn't subject reviewers who don't want to review LLM-generated code to unwanted reviews, it helps to ensure that code stays high-quality, and it limits the fallout of any "mistakes" in the process.
"The code is well-tested" is another valuable caveat to add here. Requiring this is much less onerous in the context of LLM-assisted code.
I like it. I think it's a standard we want to hold for all contributions, but doesn't always get met. It's a nice position to have here.
> Please avoid them where possible.
> In general, existing contributors will be treated more leniently here than new contributors,
> since they've already established trust with their reviewers.
> We may ask you for the original prompts or design documents that went into the LLM's output;
> please have them on-hand, and be available to personally answer questions about your process.
> We may also ask for the exact LLM model used to generate the output.
> Please avoid them where possible.
I think this wording should be removed.
> will be treated more leniently
I'm not sure what this is supposed to mean: "existing contributors are allowed to do this more than new contributors", "existing contributors aren't scrutinized but new contributors will be"?
> We may ask you for the original prompts or design documents that went into the LLM's output;
> please have them on-hand, and be available to personally answer questions about your process.
> We may also ask for the exact LLM model used to generate the output.
What purpose does this serve? It doesn't even really apply except to the first point anyway.
I agree that these are no longer relevant now that we don't allow LLMs for non-trivial code generation. I've reworked this intro significantly, let me know what you think.
> please have them on-hand, and be available to personally answer questions about your process.
> We may also ask for the exact LLM model used to generate the output.
>
> - Using an LLM to generate a solution to an issue, learning from its solution, and then rewriting it from scratch in your own style.
Of course, see my comment on the "Code changes that are originally authored by an LLM." ban, but I do like laying out this "less-restrictive" point explicitly. I would move the "asking for details about how you generated the solution" to under this point, but modify it heavily.
Rather than stating like "we need to know exactly what you said to the LLM and what model you used", I think a better approach is saying something like "You should be prepared to share the details of the direction you gave to the LLM. These may include general prompts or design documents/constraints."
I'm not sure that sharing the exact prompts or output, or the exact model does anything. What's the reasoning? I'm much more interested in what direction the author intended to take.
If the idea is to be able to "recreate" or "oversee" what the author did, that's just never going to work. This isn't something we can reasonably expect reviewers at large to do. Rather, if anything, this is something that I could see from a more mentor/mentee relationship. If it ever is at the point that a "random" reviewer wanted or needed to see this, then the PR likely just needs to be closed and further discussion should happen elsewhere before continuing.
> Conversely, lying about whether you've used an LLM is an instant [code of conduct](https://rust-lang.org/policies/code-of-conduct/) violation.
> If you are not sure where something you would like to do falls in this policy, please talk to us.
> Don't try to hide it.
I would be slightly stronger here: lying about whether you've used an LLM, or how, is an instant CoC violation.
I would also say that under good intent, failure to follow these policies is generally not immediately bannable, but repeated use after a warning is.
Who is "us"? Where does someone go?
What does something being an "instant CoC violation" actually mean? The moderation team's policies suggest a warning:
- Moderators will first respond to such remarks with a warning.
- If the warning is unheeded, the user will be “kicked,” i.e., kicked out of the communication channel to cool off.
- If the user comes back and continues to make trouble, they will be banned, i.e., indefinitely excluded.
Good comments all; I've revised this portion significantly.
> This policy is not set in stone.
> We can evolve it as we gain more experience working with LLMs.
I want to second @nikomatsakis's point, mostly: I'm not sure that I necessarily care that this is "time limited", but with something as restrictive as this, I don't want us to merge this and think it is "enough". I also don't know what "correct" here looks like. Let me try to spell out exactly what I would and would not like:
- I would like for us to gather more/better quantitative data on AI usage within the Project, and among contributors to the Project.
- I would like for us to continue discussing what a Project-wide policy looks like and find consensus.
- I would like for us to evaluate/monitor actual effectiveness of this policy once we merge.
- Does this "reduce the spam"? Is it easier to moderate? Do maintainers feel less burdened?
- Are there things that we missed in this policy? Are there parts of the policy that just don't seem to be a problem?
- I would like for us to re-evaluate as models change, as the ecosystem changes, as best-practices change, etc.
- I would like for us to identify clearly the problems that we are trying to solve, evaluate alternative solutions than a "restrictive AI policy", and then evaluate how this policy fits with those solutions.
- I would not like for us to assume that this policy is "as permissive as it gets" nor "as restrictive as it gets".
- I would not like for us to broadcast this as an "anti-AI" stance. (Rather, I want to set this as a "we're figuring things out, and we need to focus on maintainers and quality until we do.")
- I would not like for this policy to let us stop treating contributors with respect (regardless of AI use)
- I would not like for us to disregard how AI is used outside the Project and how the policies we set affect our relationships with individuals, companies, and organizations
In all, I think best said:
I don't want us to think of this policy as "done". I want it to be as another stepping stone in figuring out what works. I don't think "only talking" gets us very far (which is why some policy, even if more restrictive or less restrictive than some would like, is still a good step), but I don't think that this is a "solution", only another means to help us figure out what works for the Project. I don't want us to merge this and then any time we are discussing, someone can just point and say: "look, we merged a policy, why are we still discussing this?"
Unfortunately, we're bad at ensuring we don't set something down and forget to pick it up again. A time-limited or event-limited policy can help with this. If we said "this policy is only in effect for a year", then in a year, we must reevaluate whether this policy "worked" and what changes (if any) should be made. I'm not sure what an "event-limited" process would look like: but I could imagine it's some combination of doing a survey, identifying key "events" like e.g. a capable/free "open model" being available, additional tooling being built that could obviate the need for some of this policy, the Project gaining consensus on a Project-wide policy, some team raising a concern, etc.
I imagine what we actually want is some combination. Just taking a stab:
This policy is not set in stone, and can be amended with a simple majority of members of teams using rust-lang/rust (without concerns).
This policy can be dissolved in a few ways:
- Consensus (n-2) of all members of teams using rust-lang/rust (without concerns)
- A formal Project-wide policy in place AND 1 year passing since this policy is first merged
- An objective concern raised about active harm the policy is having on the reputation of Rust, with evidence; as decided by a leadership council FCP (consensus without concerns)
> An objective concern raised about active harm the policy is having on the reputation of Rust, with evidence; as decided by a leadership council FCP
👍 I like the idea of having an escape hatch if there's a crisis.
> Consensus (n-2) of all members of teams using rust-lang/rust (without concerns)
I think this is implied, but 👍 to spelling it out explicitly.
> A formal Project-wide policy in place AND 1 year passing since this policy is first merged
I don't like that this leaves no room for a project-wide policy that allows teams to set more specific policies.
If there's a sunset clause, what's the fallback policy? Ideally it's a policy that everyone dislikes, so there's incentive to properly fix it.
The current status quo seems to be... fully permissive but also people will get mad at you if you submit LLM-generated work? That seems less than ideal.
This is why my second point includes not just time passing, but also the Project-wide policy (which is, I guess, the "fallback"). I don't necessarily think everyone has to dislike that, but rather that it needs to be something more fundamentally shared across the entire project than a rust-lang/rust-specific policy.

The other two points are an active dissolution that fundamentally requires either consensus (same as forming the policy), or evidence of active harm.
I very much sympathise with the maintainers who have to deal with slop PRs, but I think this is coming at it the wrong way? If someone opens a concise, easily reviewable change that's a clear positive improvement, but marks it as AI-generated, does the policy compel maintainers to reject it? Seems like cutting your nose off to spite your face...

Shouldn't the policy focus on giving more flexibility to the maintainers while setting expectations for contributors? (e.g. "If you use AI in these ways, we may not have the bandwidth to review it and it may be closed without further comment.") The way this policy is written at the moment seems like it will open a whole can of worms about fairness ("PR A used AI and that was merged against the policy, why was PR B closed?!") which could be avoided by sticking to a policy that focuses on maintainer effort and doesn't try to establish hard and fast rules. I would go as far as to say the policy is entirely backwards:

It's clearly untenable to treat "opening a PR and truthfully reporting it as AI generated" as an instant CoC violation, since people may be making an honest mistake, so some interaction with the contributor is going to be required anyway. Wouldn't it make more sense to base the follow-up action on the quality of the PR? (Whether that's "Please don't open any more PRs using AI" or "Actually this PR is fine, you must be using AI sensibly for now".)

A separate issue I see is the attempt to draw the line at "creation" vs "review" WRT AI usage. In practice what I see professionally is developers using AI on a scale from:

Roughly speaking I would say 1-4 is mostly "solved" by modern AI models, 5-6 is a performance boost (i.e. it's quicker with AI but you need to keep on top of what it's doing), and 7-8 is where you are at serious risk of the AI being more of a time sink than doing it yourself. Interestingly, I think actual review with AI is really bad. It's decent at assisting review but the models are far too agreeable to trust its feedback.

All of these are "creating" with AI and are thus banned, but there's clearly quite a range... I would say the important thing is not whether the AI was "creating", but the level of autonomy it has. This is also the only thing that's detectable in a PR.
This is specifically the kind of data I did not want, @jyn514 -- as you say, it's not representative, so I don't know how to interpret it. In some way it is worse than no data at all, since it "suggests" something that may well be false.
Where @jackh726 said...
...my specific proposal would be this:
I'm mulling this over. I guess the bottom line is... yeah, I do think this is worse than no policy at all, because it sets a precedent that I think will harm potential contributors and set the project on what I consider to be an overall negative trajectory. In future conversations, there would be no real incentive for those who favor restrictive policies to come to some sort of compromise. I haven't seen much reason to trust that people will be open to discussion in the future.

As I wrote before, I feel this policy is the result of a small set of people with strong opinions (myself included) and I think this issue is too important to treat that way.
I'd like to start by thanking @jackh726 for being one of the only people so far who's
commented on the substance of the policy rather than the exact wording or whether it's
"anti-" or "pro-" LLM. You brought up some great points and I have changed or clarified the
document in several places as a result.
Niko and Jack, you've asked several times for more quantitative data, both about how people
are using LLMs and how widely concerns are shared. I think that's a good ongoing project
to make sure we're staying "in-touch" with existing and new contributors. However, I don't
think we need that to pass a policy in the first place -- we have an existing mechanism for
establishing consensus, it's an FCP. We'll see there if anyone has concerns.
I think it's important to value new contributors, but we shouldn't do so at the cost of
alienating existing contributors. It doesn't help to welcome new people if there's no one
left here to welcome them.
> I haven't seen much reason to trust that people will be open to discussion in the future.
I'm sorry to hear you feel this way. For what it's worth, I think the discussion has gotten
significantly more productive in the last ~3 weeks, and that gives me hope that we can find
a project-wide policy that looks at this with fresh eyes and doesn't just take the
rust-lang/rust policy as gospel. As you know, I proposed a version of that before opening
this more restrictive and narrowly-scoped policy, and it gave significant leeway to teams
outside of rl/r.
I also want to note that a major concern while drafting this policy was the community
reaction to hearing that rust-lang uses LLMs. I have hope that this policy, which is
restrictive but not a full ban, will help establish trust in the community that we're
considering this carefully, so that we see less negative pushback if we do experiment
further.
I disagree. I think not having a policy will send Rust into an overall negative trajectory. There are lists being created that document projects and organizations that use AI. An example of this is The AI Dirty List, which has the following on their FAQ:

The issue here is that if Rust ends up on lists like this one, people will see it, and may say things like[1]:

...and these people will never use or contribute back to Rust. On the other hand, there are networks and organizations who are creating lists of software that is not being made with AI. An example of this is the Starlight Network No-AI List:

They have an entire section dedicated to Programming Languages and Compilers. If Rust ends up on lists like these, users may end up seeing that and use/contribute back to Rust.

In conclusion: having a policy that bans AI will bring potential contributors into Rust because they will support a programming language that does not use AI. Having a policy that does not ban AI (or having no policy) will harm Rust because it will drive away potential contributors if they see AI is being used.

[1] As far as I am aware, this is not in violation of "do not mention the topic of
You could argue the opposite way just the same. There are quite a few people that see the benefits of moderate LLM usage. If we ban LLM usage then that would "drive away potential contributors" too.

I feel like making decisions on the basis of uninvolved people who create lists of people/projects to harass, shame and attack is bad actually, regardless of the policy stance that Rust as a project takes. You could make exactly the same argument to avoid inclusive policies and initiatives based on the very large contingent of people who hate Rust for being "woke".
> #### Penalties
> The policies below follow the same guidelines as the code of conduct:
> Violations will first result in a warning, and repeated violations may result in a ban.
> - 🔨 Comments from a personal account originally authored by an LLM
> - 🔨 Violations of the "Be honest" section
>
> Other violations are left up to the discretion of reviewers and moderators.
> For most cases we recommend closing and locking the PR or issue, but not escalating further.
I think it's wrong to treat all violations of AI policy as CoC violations.
When someone submits a PR they wrote with AI assistance and you don't want to merge it, the correct response is just to close the PR and explain to them why it was closed. We should not threaten such contributors with moderation warnings. There really is no reason to tack on a "this is a warning and future violations may result in a ban" on to that explanation. It's an unnecessarily hostile experience to receive that.
Of course, at some point it does become a CoC issue and/or ban-worthy. For example, if you repeatedly don't follow instructions from the maintainers, that's a problem and I think it's fine to ban them for that. Or if it's obviously spam from OpenClaw or whatever, then ban them as spam. That's fine. But we should not treat human contributors like that on first violation.
As an aside on disclosure. I think it's probably right to treat lying as a CoC issue when intentional. This is why I feel somewhat uncomfortable with a disclosure requirement to begin with, though I'm okay with having one, assuming no witch hunts occur.
Hm, I think I just phrased this poorly. The "code of conduct bit" is only meant to apply to the first two bullets, not to the paragraph on lines 103-104.
Why does the code of conduct bit extend to this one specifically?
> Comments from a personal account originally authored by an LLM
If a new contributor is replying to reviews by copy/pasting from an LLM, I certainly think we should tell them to stop, but I do not think it warrants a moderation warning, for the reasons I outlined above. I do not think this is at all comparable to outright lying.

