loader: implement package maps#62239
Conversation
|
Review requested:
|
|
I like the idea, it would greatly reduce the amount of filesystem operations that pnpm has to do in order to create an isolated node_modules layout using symlinks. I also suggested arcanis to possibly go one layer deeper and allow to map the individual files of packages. This would allow to map node_modules directly from a content-addressable store (that consists of package files). Of course, that would increase the size of the file several times but it would also make installation even faster. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #62239 +/- ##
==========================================
+ Coverage 90.34% 90.36% +0.02%
==========================================
Files 732 733 +1
Lines 236676 237055 +379
Branches 44591 44655 +64
==========================================
+ Hits 213827 214222 +395
+ Misses 14569 14547 -22
- Partials 8280 8286 +6
🚀 New features to boost your workflow:
|
|
This seems quite close to the |
|
I did, but felt that the semantics were too different; import maps have two fields:
Neither of those match the semantics we need, and reusing them just for their name but with different semantics would have been imo misleading for third-party resolver implementors. |
|
Love it. I'll try to give a detailed review on the flight home today if the in flight wifi treats me kindly. |
|
What happens if two packages define the exact same path? An error, presumably, given that the algorithm relies on being able to determine for each path which package it belongs to? Needs a test in any case. |
I'll follow-up with a separate improvement to key module instances per both their path and their package IDs (it's key to solve the peer dependency problem I mentioned in the PR description), but for this iteration only one package ID per path is supported. I updated the code to throw an error accordingly. |
|
@guybedford @avivkeller can I get another review? once this land I'll start looking at a prototype for generating package maps in package managers, and the package ID follow-up |
|
|
||
| In the example above both `lib-old` and `lib-new` use the same `./lib` folder to | ||
| store their sources, the only difference being in which version of `react` they'll | ||
| access when performing `require` calls or using `import`. |
There was a problem hiding this comment.
How does this work given the module cache? Wont the first one to load’s react version “win”?
There was a problem hiding this comment.
Yes, that's why it's not implemented in this PR (it throws should this case happen), and will be implemented as a follow-up: #62239 (comment) (still I want it to be documented as third-party tools reading package maps better know about this early in the implementation)
|
What happens when there's symlinks in the specified paths? From the current implementation it looks like it basically breaks. You could sort of fix this by resolving symlinks in the package map (at the cost of a lot of file system operations during startup), but then you have the problem that the "longest path containing a file" heuristic is now ambiguous: you could have a file {
"packages": {
"my-app": {
"path": "./src",
"dependencies": {
"foo": "foo",
"baz": "baz"
}
},
"foo": {
"path": "/foo/bar"
},
"baz": {
"path": "/baz"
}
}
}and then I don't have a good intuition for what the right thing to do here is. |
It depends where the symlink is:
|
There was a problem hiding this comment.
Is it possible for @arcanis or someone else familiar with the work to come to the next TSC meeting to talk a little about this implementation? I have two concerns with landing this at the moment:
- Awareness: like mentioned in the "Why?" section of its description, this PR is introducing a complete new alternative to the more-than-a-decade old module resolution system, have APM folks been looped in? were the other adjacents WG consulted? @nodejs/package-maintenance @nodejs/tooling
- Import maps: Is the current implementation going to create an irremediable compatibility with that standard? Can this use some help from the @nodejs/wintercg folks to help standardize this so that other runtimes may also implement support in the future?
That said, thanks @arcanis for taking the time to work on it! I feel like the runtime definitely needs to innovate on this and sorry to be potentially slowing things down in the immediate time but hopefully it's to make sure we land on a solution that will last! ✌️
|
I have been looking for someone who wanted to take over and help get #49443 over the finish line. Maybe this would be a good opportunity to start a thread in with @nodejs/wintercg, come up with the technical and UX goals we have for this feature. I think we need to decide if we need spec changes (to pursue via the standards process) or if we can find workarounds that maintain spec compliance. @arcanis happy to work with you on this if you are interested. |
Sure, happy to, please invite me to the next meeting. I also discussed it a little while ago with folks at the WinterTC and it wasn't clear at the time whether it was worth a multi-vendor discussion at that stage. So I figured that since this is behind
The designs serve different goals. The way import maps are keyed makes it impossible to support very common dependency graphs, causing package managers to generate semantically incorrect installs or rely on features like inject or virtual paths which add complexity and come with their own trade-offs. That's not to say import maps can't be implemented in Node.js as well, but their design limits them to solve more web-specific use cases that wouldn't satisfy the practical problems our local package managers set out to address. |
|
Using the example in "How it works" as base, some compatibility/security questions:
Context: I'm looking at it from the experience of having to map the dependencies for LavaMoat policy files and these are some of the reasons we were forced to invent canonical names that represent the package as the shortest path in dependency tree leading to it, and we use that as keys in a flat structure. |
Separate arbitrary IDs. The graph is stored flat in {
"packages": {
"my-app": {
"path": "./src",
"dependencies": {
"lodash": "lodash@1",
"react": "react"
}
},
"lodash@1": {
"path": "./node_modules/lodash"
},
"lodash@2": {
"path": "./node_modules/react/node_modules/lodash"
},
"react": {
"path": "./node_modules/react",
"dependencies": {
"lodash": "lodash@2"
}
}
}
}
Separate IDs as well; from my experience with Yarn we give different identities to packages that come from different sources (git, npm, http, etc). As an example we also don't deduplicate them during hoisting.
I'd treat bundleDependencies as their own source, so supporting package managers would give them their own ID. |
|
Should this still stay on the TSC agenda? If the idea is to get it on the radar of the TSC, the tsc-agenda label may have run its course after being in the agenda for over a month. I will remove the label for now to avoid piling items that won't receive discussions in the agenda, if there are any updates, feel free to add the label back. |
On my end I consider this PR as ready to be merged; I think it got more than enough reviews, plenty of attention, and I want to start progressing on integrating package maps in our tools. It just needs someone to help me rerun flaky CI tests and merge, I don't have permissions to do that. Some folks may also be interested in this blog post I wrote explaining in more details why import maps aren't a compatible solutions to the problems I set out to solve. I hope that this background can help establish a general consensus, if not unanimity. |
|
What's the next step here? What's required to move this forward? |
There was a problem hiding this comment.
I'm in favor of the proposal. I think the runtime should be package manager agnostic and leave to each package manager the freedom of choosing how to organize the dependency tree and where to store it. I think there are still edge cases, for example I think the 'no type stripping in node_modules' rule, but I'd be happy to see this land and be iterated on.
Please drop the merge commit and rebase
|
Just wanted to put in a major kudos here for the work on this. In an age where supply chain attacks + forks for unsupported/EOL projects is increasingly common, a simple hook like this is incredibly useful in so many cases. Still coming up to speed, but happy to help contribute to the effort if helpful |
4d21301 to
dc615c7
Compare
Signed-off-by: Mael Nison <nison.mael@gmail.com>
dc615c7 to
9c25918
Compare
|
there is still the block from @wesleytodd |
|
Ping @wesleytodd |
|
I caught up on the TSC stream but was not aware this was going to be discussed today until too late. I am happy to hear @arcanis's update on compatibility across the specs and that it was discussed on tc55. I absolutely still stand by my comment above and believe import maps can solve for the same use cases that package maps solve for. I am happy to try and attend a discussion if I am aware of it in advance if folks wanted, but I do not have time or energy right now to engage on this topic and so will retract my intent to block this. It won't be the first or the last time the project ships something I disagree with and that is good. I just hope this conflict with the web spec does not turn into another ESM or EDIT: looks like I don't have permission to dismiss my review. Can someone take care of that for me? |
Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
This PR adds a new
--experimental-package-map=<path>flag letting Node.js resolve packages using a static JSON file instead of walkingnode_modulesdirectories.Why?
The
node_modulesresolution algorithm predates npm and its clear definition of the concept of packages. It works well enough and is widely supported, but has known issues:Phantom dependencies - packages can accidentally import things they don't declare, because hoisting makes transitive dependencies visible
Peer dependency resolution is broken in monorepos - if
website-v1usesreact@18andwebsite-v2usesreact@19, and both use a sharedcomponent-libwith React as a peer dep, there's nonode_moduleslayout that resolves correctly. The shared lib always gets whichever React was hoisted.Hoisting is lossy - runtimes can't tell if an import is legitimate or accidental
Resolution requires I/O - you have to hit the filesystem to resolve packages, and that quickly adds up. Even more so in agentic worlds where working with multiple git trees become a common pattern.
Package managers have tried workarounds (pnpm symlinks, Yarn PnP), but are either limited by what the filesystem itself can offer (like symlinks) or by their complexity and lack of standardization (like Yarn PnP). This PR offers a mechanism for such tools to solve the problems listed above in tandem with Node.js.
How it works
A
package-map.jsondeclares packages, their locations (relative to the package map), and what each can import:{ "packages": { "my-app": { "path": "./src", "dependencies": { "lodash": "lodash", "react": "react" } }, "lodash": { "path": "./node_modules/lodash" }, "react": { "path": "./node_modules/react" } } }When resolving a bare specifier:
dependenciespathERR_PACKAGE_MAP_ACCESS_DENIEDMODULE_NOT_FOUNDCompatibility
An important aspect of the package maps feature that separates it from competing options like Yarn PnP is its builtin compatibility with
node_modulesinstalls. Package managers can generate bothnode_modulesfolders ANDpackage-map.jsonfiles, with the later referencing paths from the former.Tools that know how to leverage
package-map.jsoncan then use this pattern for both static package resolution and strict dependency checks (with optional fallbacks to hoisting if they just wish to use the package map information to emit warnings rather than strict errors), whereas tools that don't will fallback to the classicalnode_modulesresolution.Differences with import maps
Issue #49443 requested to implement import maps. In practice these aren't a good fit for runtimes like Node.js for reasons described here and which can be summarized as: import maps take full ownership of the resolution pipeline by spec, thus preventing implementing additional runtime-specific behaviours such as
exportsorimportsfields.This PR comes as close from implementing import maps as possible but with a very light difference in design making it possible to stay compatible with other Node.js resolution features.
Why not a loader?
The ecosystem now has to deal with a variety of third-party resolvers, most of them not implementing the loader API for many different reasons: too complex, turing-complete, or dependent on a JS runtime.
After I've been following this path for more than six years I can confidently say that loaders would work for Node.js itself but wouldn't be standard enough to be included in at least some of those popular third-party tools.
Package manager integration
Package maps are meant to be generated by package managers. For example you could imagine Yarn or pnpm generating a
node_modulesdirectory containing anode_modules/.package-map.jsonfile should the feature be enabled (it would likely remain opt-in while experimental).Then all
yarn nodecalls would inject the flag into the environment when spawning Node.js:Questions
--experimental-strict-package-mapsis set? Or via astrictfield inpackage-map.json.