PEP 833: Freezing the HTML simple repository API #4930
Merged
+234
−0
ad93554 Draft for freezing the HTML simple repository API (woodruffw)
1ab5a02 Assign PEP 833 (woodruffw)
6030e4a Revert "Assign PEP 833" (woodruffw)
ea3a7a2 Reapply "Assign PEP 833" (woodruffw)
f24f1b3 Apply suggestions from code review (woodruffw)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,233 @@ | ||
| PEP: 833 | ||
| Title: Freezing the HTML simple repository API | ||
| Author: William Woodruff <william@yossarian.net> | ||
| Sponsor: Donald Stufft <donald@stufft.io> | ||
| PEP-Delegate: Donald Stufft <donald@stufft.io> | ||
| Discussions-To: Pending | ||
| Status: Draft | ||
| Type: Standards Track | ||
| Topic: Packaging | ||
| Created: 21-Apr-2026 | ||
| Post-History: `13-Apr-2026 <https://discuss.python.org/t/106959>`__ | ||
|
|
||
|
|
||
| Abstract | ||
| ======== | ||
|
|
||
| This PEP proposes freezing the | ||
| :ref:`standard HTML representation <packaging:simple-repository-html-serialization>` | ||
| of the simple repository API, as originally specified in :pep:`503` | ||
| and updated over subsequent PEPs. | ||
|
|
||
| In the context of this PEP, "freezing" means that the HTML representation | ||
| is considered complete from the perspective of the standards process, | ||
| and **SHOULD NOT** be updated by future PEPs. Future PEPs **SHOULD** instead | ||
| target the | ||
| :ref:`standard JSON representation <packaging:simple-repository-api-json>`, | ||
| as originally specified in :pep:`691`. | ||
|
|
||
| However, this PEP's freezing of the HTML representation does **not** stipulate | ||
| that installers should remove support for the HTML representation, or that | ||
| indices (like PyPI) will or should stop providing an HTML representation. | ||
|
|
||
| Rationale and Motivation | ||
| ======================== | ||
|
|
||
| The use of an HTML representation for Python package indices predates | ||
| efforts to standardize Python packaging. Consequently, the HTML representation | ||
| standardized with :pep:`503` represents a *formalization* of | ||
| existing practices (particularly those of PyPI), rather than a *design*. | ||
|
|
||
| The HTML representation of a Python package index has served the Python | ||
| packaging ecosystem admirably: it has acted as the baseline representation | ||
| that all indices and installers support, and has allowed PyPI to incrementally | ||
| modernize its index presentation while maintaining backwards compatibility | ||
| with installers and mirrors. :pep:`629`, :pep:`714`, :pep:`740`, | ||
| :pep:`792`, and many others demonstrate the viability of this approach. | ||
|
|
||
| At the same time, the HTML representation has several limitations that | ||
| have become increasingly apparent and salient as Python packaging as a whole | ||
| has modernized: | ||
|
|
||
| - The HTML representation is *rigid*, for backwards compatibility reasons. | ||
| This rigidity makes it difficult to represent new pieces of metadata, | ||
| and PEPs that attempt to do so typically need to shoehorn their changes | ||
| into ``<meta>`` tags or ``data-`` attributes to avoid interfering with | ||
| assumptions that existing consumers make about the structure of the HTML. | ||
|
|
||
| This shoehorning process also requires PEPs that modify the HTML index | ||
| to invent syntax for encoding structured data. For example, :pep:`792` | ||
| adds meta tags named ``pypi:project-status`` and | ||
| ``pypi:project-status-reason``, effectively flattening an object | ||
| representation that appears naturally in the JSON representation. | ||
|
|
||
| Similarly, the HTML representation's rigidity makes it an optimization | ||
| barrier: :pep:`658` allows indices to serve distribution metadata via | ||
| the simple repository API, but the absence of a straightforward and | ||
| backwards-compatible way to encode that metadata within the HTML | ||
| representation means that installers must incur an additional HTTP round-trip | ||
| to fetch relatively small amounts of information. :pep:`740` adopts a | ||
| similar approach, with similar overhead repercussions. | ||
|
|
||
| In practice, some index PEPs have chosen not to modify the HTML representation | ||
| at all, and instead focus solely on the JSON representation. :pep:`700`, | ||
| for example, introduces both per-distribution metadata *and* a top-level | ||
| ``versions`` key to the JSON representation, but does not modify the HTML | ||
| representation. The original rationale for this was that HTML consumers | ||
| would be unlikely to need the new metadata. | ||
|
|
||
| - Relatedly, third-party consumption of the HTML representation is often | ||
| *brittle*: even syntactically valid, non-semantic changes to PyPI's HTML | ||
| representation are | ||
| `known to cause breakage <https://github.com/pypi/warehouse/issues/18275>`__ | ||
| due to unsound assumptions about the exact structure of the HTML, including | ||
| its whitespace. | ||
|
|
||
| Consumption of the JSON representation, by contrast, is more robust to | ||
| non-semantic changes thanks to the prevalence of robust JSON parsing | ||
| libraries. Robust handling of HTML is naturally possible, but consumers | ||
| are often *tempted* to avoid the perceived complexity and generality | ||
| of HTML parsing in favor of brittle approaches involving regular expressions | ||
| and similar ad-hoc parsing techniques. | ||
|
|
||
| - In practice, *adoption* of incremental improvements to the HTML representation | ||
| is limited: PyPI itself typically adopts new features, but third-party | ||
| indices (particularly those sold as corporate offerings) frequently provide | ||
| only the absolute minimum representation originally defined in :pep:`503`. | ||
|
|
||
| As a result, *even when* the HTML representation is improved, many consumers | ||
| do not benefit from those improvements. | ||
|
|
||
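The flattening and brittleness described in the list above can be illustrated with a minimal sketch. It assumes the two meta tag names that :pep:`792` adds, a JSON ``project-status`` object with ``status`` and ``reason`` keys, and a default status of ``active`` when no marker is present; the page contents and the helper names (``MetaCollector``, ``project_status_from_html``) are hypothetical. It uses the standard library's ``html.parser`` rather than regular expressions, so attribute order and whitespace do not affect the result.

```python
import json
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs, ignoring layout details."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.meta[name] = attributes.get("content")


def project_status_from_html(page):
    """Rebuild the nested status object from the two flat meta tags."""
    collector = MetaCollector()
    collector.feed(page)
    collector.close()
    # Assumption: a missing status marker is treated as "active".
    status = {"status": collector.meta.get("pypi:project-status", "active")}
    reason = collector.meta.get("pypi:project-status-reason")
    if reason is not None:
        status["reason"] = reason
    return status


# Hypothetical responses for the same project in both representations.
# The HTML deliberately uses odd whitespace and attribute order.
HTML_PAGE = """
<html><head>
  <meta   content="archived"
          name="pypi:project-status">
  <meta name="pypi:project-status-reason" content="no longer maintained">
</head><body></body></html>
"""
JSON_PAGE = (
    '{"project-status": '
    '{"status": "archived", "reason": "no longer maintained"}}'
)

html_status = project_status_from_html(HTML_PAGE)
json_status = json.loads(JSON_PAGE)["project-status"]
print(html_status == json_status)
```

The HTML consumer has to know that two separately named tags belong together, whereas the JSON consumer reads one object directly.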
| Put together, these limitations mean that the HTML representation is (1) | ||
| often difficult to extend in a robust way, and (2) *de facto* frozen with | ||
| respect to how many consumers interact with Python packaging, even | ||
| when standards processes work to modernize it. | ||
|
|
||
| The purpose of this PEP is to formalize this status quo. | ||
|
|
||
| Specification | ||
| ============= | ||
|
|
||
| The HTML representation of the simple repository API is frozen | ||
| for the purposes of Python packaging standards processes. Future | ||
| Python packaging PEPs **SHOULD NOT** modify the HTML representation of the | ||
| simple repository API, and **MUST** instead modify the JSON representation. | ||
|
|
||
| This PEP does not alter the status of the HTML representation on PyPI | ||
| and does not prescribe any behavioral changes for installers. | ||
|
|
||
| One functional consequence of this freeze is that future changes | ||
| to the simple repository API will be | ||
| :ref:`versioned <packaging:simple-repository-api-versioning>` as they are | ||
| currently, but only the JSON representation will receive changes | ||
| to its versioning marker. For example, if a future PEP introduces | ||
| version 1.5 of the simple repository API, the HTML representation will retain | ||
| the following versioning marker: | ||
|
|
||
| .. code-block:: html | ||
|
|
||
| <meta name="pypi:repository-version" content="1.4"> | ||
|
|
||
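As a rough illustration of the versioning consequence described above, the following sketch reads the version marker from each representation. It assumes the JSON representation carries its version under ``meta`` and ``api-version``, as in :pep:`691`; the two responses, the project name, and the helper names are hypothetical, and version 1.5 is the same hypothetical future version used in the text.

```python
import json
from html.parser import HTMLParser


class RepositoryVersionParser(HTMLParser):
    """Find the pypi:repository-version meta tag in an HTML index page."""

    def __init__(self):
        super().__init__()
        self.version = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "pypi:repository-version":
            self.version = attributes.get("content")


def html_api_version(page):
    parser = RepositoryVersionParser()
    parser.feed(page)
    parser.close()
    return parser.version


def json_api_version(page):
    return json.loads(page)["meta"]["api-version"]


# Hypothetical responses from one index after a future PEP introduces 1.5.
HTML_PAGE = (
    '<html><head><meta name="pypi:repository-version" content="1.4">'
    "</head><body></body></html>"
)
JSON_PAGE = '{"meta": {"api-version": "1.5"}, "name": "example", "files": []}'

html_version = html_api_version(HTML_PAGE)
json_version = json_api_version(JSON_PAGE)
print(html_version, json_version)
```

A consumer that supports both representations would therefore see different version markers from the same index, and would need to interpret each marker relative to the representation it came from.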
| Future Considerations | ||
| ===================== | ||
|
|
||
| This PEP does not stipulate any changes to how indices and installers should | ||
| handle the HTML representation. | ||
|
|
||
| As of April 2026, the prospect of *fully* removing support for the HTML | ||
| representation from either indices or installers is unrealistic: it is simply | ||
| too critical to the ecosystem, and efforts to remove it would be extremely | ||
| and unreasonably disruptive. | ||
|
|
||
| However, it is not *inconceivable* that the HTML representation could be | ||
| fully removed (or relegated to legacy/default-disabled flows) in the future. | ||
| This PEP does not preclude such a future, but does not propose it either. | ||
|
|
||
| The Python packaging community has made several valuable observations | ||
| around behaviors that make outright removal of the HTML representation | ||
| difficult or infeasible, including: | ||
|
|
||
| - By virtue of being the default, the HTML representation is extremely | ||
| easy to adopt internally: it doesn't require any (explicit) content | ||
| negotiation, and can often be served trivially by a CDN or a minimal | ||
| HTTP server (like ``python -m http.server``). | ||
|
|
||
| The JSON representation does not technically require content negotiation | ||
| either, but in practice clients that consume it expect to perform | ||
| explicit content negotiation due to the assumption that the same URL | ||
| provides both representations. Consequently, any future efforts to remove the | ||
| HTML representation will likely require a simpler adoption story for the JSON | ||
| representation. | ||
|
|
||
| - The HTML representation is currently easier for installers like pip | ||
| to parse incrementally, as the Python standard library includes | ||
| ``html.parser`` for incremental HTML parsing. This helps mitigate | ||
| the memory overhead of large HTML index responses, e.g. detail responses | ||
| for packages that have hundreds or thousands of distributions. | ||
|
|
||
| By contrast, Python's standard library currently lacks an incremental | ||
| JSON parser. Incremental JSON parsing is not impractical (and is strictly | ||
| less complex than incremental HTML parsing), but the absence of a | ||
| standard library solution presents an adoption barrier. | ||
| Future efforts to remove the HTML representation will likely require a robust | ||
| standard library (or acceptably vendorable third-party) solution for | ||
| incremental JSON parsing within pip. | ||
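The incremental parsing mentioned in the last observation can be sketched with ``html.parser`` as follows. This is an illustration only, not pip's actual implementation: the ``LinkCollector`` class, the ``parse_in_chunks`` helper, and the file names are hypothetical. The chunks stand in for pieces of a response body read from the network, and one of them deliberately ends in the middle of a tag.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Record the href of every anchor as soon as its start tag is complete."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def parse_in_chunks(chunks):
    """Feed the parser piece by piece instead of buffering the whole page."""
    parser = LinkCollector()
    for chunk in chunks:
        # The parser keeps any incomplete trailing markup until more arrives.
        parser.feed(chunk)
    parser.close()
    return parser.hrefs


# Hypothetical detail page, split so that the second tag spans two chunks.
CHUNKS = [
    '<html><body><a href="example-1.0.tar.gz">example-1.0.tar.gz</a>\n<a hre',
    'f="example-1.1.tar.gz">example-1.1.tar.gz</a></body></html>',
]

hrefs = parse_in_chunks(CHUNKS)
print(hrefs)
```

The standard library's ``json`` module has no equivalent of ``feed``: ``json.loads`` needs the complete document before it returns anything, which is the gap the observation above describes.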
|
|
||
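Returning to the first observation above, explicit content negotiation might look like the following sketch. It assumes the media types defined in :pep:`691`; the index URL, the quality values, and the helper names are illustrative assumptions. The request is only constructed, never sent.

```python
from urllib.request import Request

JSON_TYPE = "application/vnd.pypi.simple.v1+json"
HTML_TYPE = "application/vnd.pypi.simple.v1+html"


def build_index_request(url):
    """Prefer JSON, but accept HTML from indices that serve nothing else."""
    accept = ", ".join([JSON_TYPE, f"{HTML_TYPE};q=0.2", "text/html;q=0.01"])
    return Request(url, headers={"Accept": accept})


def representation_for(content_type):
    """Decide how to parse a response from its Content-Type header."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == JSON_TYPE:
        return "json"
    if media_type in (HTML_TYPE, "text/html"):
        return "html"
    raise ValueError(f"unsupported index content type: {content_type!r}")


# Hypothetical index URL, used only to construct the request object.
request = build_index_request("https://index.example/simple/example/")
print(request.get_header("Accept"))
print(representation_for("text/html; charset=utf-8"))
```

A static file server such as ``python -m http.server`` ignores the ``Accept`` header and always returns the same file for a given URL, which is why the HTML representation is the easier one to serve from such a setup.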
| Security Implications | ||
| ===================== | ||
|
|
||
| This PEP does not identify any positive or negative security implications | ||
| associated with freezing the HTML representation of the simple repository | ||
| API. | ||
|
|
||
| How to Teach This | ||
| ================= | ||
|
|
||
| Because this PEP only freezes the HTML representation of the simple repository | ||
| API for the purposes of Python packaging standards processes, the end user | ||
| implications of this PEP are limited. | ||
|
|
||
| However, for third-party indices that wish to modernize their index | ||
| representations, this PEP proposes the following if accepted: | ||
|
|
||
| - The authors of this PEP will coordinate with the maintainers | ||
| of PyPI on appropriate public-facing documentation and communication, | ||
| including an announcement on the `PyPI blog <https://blog.pypi.org>`__ | ||
| if deemed appropriate. | ||
|
|
||
| - The authors of this PEP will make appropriate changes to the | ||
| :ref:`living standard <packaging:simple-repository-api>` for the simple | ||
| repository API, including admonitions and callouts where appropriate | ||
| to indicate that the HTML representation will not receive future updates. | ||
|
|
||
| Rejected Ideas | ||
| ============== | ||
|
|
||
| Doing nothing | ||
| ------------- | ||
|
|
||
| Doing nothing is always an option. As noted above, this would be a continuation | ||
| of the status quo, wherein the HTML representation is updated on paper | ||
| (and on PyPI), but is frozen in practice in third-party settings. | ||
|
|
||
| The authors of this PEP believe that being explicit about the status | ||
| of the HTML representation is valuable, and would benefit future standards | ||
| efforts by diverting design effort away from shoehorning new features | ||
| into the HTML representation. | ||
|
|
||
|
|
||
| Aggressively removing the HTML representation | ||
| --------------------------------------------- | ||
|
|
||
| Encouraging indices and installers to aggressively remove support for the HTML | ||
| representation is another option. However, as noted above, this is unrealistic | ||
| in the near term, and would be disruptive to the ecosystem. | ||
|
|
||
| The authors of this PEP believe that freezing is a more gradual and | ||
| pragmatic approach that better reflects the ecosystem's reality. | ||
|
|
||
| Copyright | ||
| ========= | ||
|
|
||
| This document is placed in the public domain or under the CC0-1.0-Universal | ||
| license, whichever is more permissive. | ||