Skip to content

Comments

feat: upgrade existing filesystem #264

Open
upils wants to merge 55 commits intocanonical:mainfrom
upils:recut
Open

feat: upgrade existing filesystem #264
upils wants to merge 55 commits intocanonical:mainfrom
upils:recut

Conversation

@upils
Copy link
Collaborator

@upils upils commented Jan 29, 2026

  • Have you signed the CLA?

This commit enables Chisel detecting the target directory contains the result
of a previous execution to then operate an upgrade of the content. This initial
simple implementation has the the limitation that content (files, symlinks) is
systematically replaced, even if identical. As a consequences:

  • user modifications on chisel-managed content is overridden.
  • in the context of a OCI image build, this "new" content is identified as
    different and thus the new OCI layer contains duplicated files.

@github-actions
Copy link

github-actions bot commented Jan 29, 2026

Command Mean [s] Min [s] Max [s] Relative
BASE 13.382 ± 0.054 13.304 13.505 1.02 ± 0.01
HEAD 13.060 ± 0.041 13.009 13.148 1.00

@upils upils changed the title feat: update existing filesystem feat: upgrade existing filesystem Feb 2, 2026
@upils upils requested a review from letFunny February 2, 2026 14:35
@upils upils marked this pull request as ready for review February 2, 2026 14:35
Copy link
Collaborator

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First review. I am missing a couple of files but I wanted to publish it as quickly as possible so we have more time to iterate. This is looking great Paul, many things are looking dead simple which is always an extremely good sign. I have written some comments about how to make the PR shorter and how to make it, hopefully, even simpler. Let me know what you think.

I see some of the comments are similar to past discussions we had in upils#1. We spent some effort there, so please make sure all the comments were carried over to this one.

}
}

mfest, err := slicer.SelectValidManifest(cmd.RootDir, release)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general our philosophy is that we should be able to cut a release from main at any point in time. That is, if we ever do a bug fix we should be able to release a point version. This is changing how Chisel works in folders with a manifest and it is not the correct behavior (yet) so I would prefer if we could gate this functionality over a environmental variable or a debug command or any other mechanism.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. See a259d1d

targetDir = filepath.Join(dir, targetDir)
}

var originalTargetDir string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A different way of doing it is to extract what used to be the Run function into another one and have Run call into that with different target dir and then upgrade (see extract.go for Run vs extractData). Do you think it will be clearer this way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong opinion for one or the other. It is unclear to me what lead extractData to be split into it's own function. Can you share the reason?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was already like that when I joined :). IMO it is better to split than to carry more data around and "fake" it bypassing the interface by overriding target and copying original.

It is also a cleaner abstraction as it doesn't need the rest of the code and will make the function shorter (even if marginally).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See fad2cb7

mfestSchema := db.Schema()
if mfestSchema != Schema {
return nil, fmt.Errorf("unknown schema version %q", mfestSchema)
return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Note to reviewer]: It looks a bit off to me but I did not want to break compatibility since this error is part of the public API.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.

There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.

@upils upils requested a review from letFunny February 11, 2026 14:49
Copy link
Collaborator

@letFunny letFunny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this Paul. Most of the comments are minor which should tell you that the code is almost ready, kudos for that. I have several important comments that I think will affect the correctness of the solution so please read those carefully.

Also be sure to capture our discussions in 1:1 in a comment in Github or something so that we don't lose track. I wrote a comment stating a part we discussed that was not here, I think that was all of it.

Comment on lines 171 to 174
// TODO: Detect if existing content is a file. ErrExist is also returned
// if a file exists at this path, so returning nil here creates an
// inconsistency between our view of the content and the real content on
// disk.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you state in a comment in the PR that this is a bug that we found while implementing this and will be fixed in another PR? And can you also add the bug for symlinks I told you about the other day in the MM chat?

Last thing, in the comment you wrote can you say explicitly "bug". This is not an enhancement it is a bug that leads to an inconsistency as you correctly point out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See 10a1744

},
noMatch: true,
}, {
summary: "Unknown schema error ignored",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMPORTANT: I am not seeing any test about what we discussed this the other day so let's make sure we don't lose the context (I may have missed it). I don't think this is correct. I think this should fail for the reasons I stated the other day:

  • case 1, we find a valid and an optional invalid manifest. This means we upgraded the format and we produced two versions for compatibility.
  • case 2, we find only invalid manifests. This means the new versions of Chisel didn't produce backwards compatible manifests. You are ignoring the error, meaning old Chisel will do a cut which is a mistake given that there was a previous one. Imagine when rockcraft updates its chiselled base layer.
  • case 3, we find no manifests. This is clearly a first cut, proceed normally.

We are losing information about the previous cut by grouping case 2 and case 3 into the same one, and to me they are NOT the same.

Lastly, this gives us the option in the future to either break compatibility or not. In the current code we either force old versions to ignore re-cut or we are compatible. I prefer the error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that indeed. I agree with the proposed strategy, see 46f872c. I may revisit that with a clearer mind.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion but I think having a flag that is set after os.Open will make it more clear and less specific. Something like found set to true if one call to os.Open succeeds. And it definitely needs a comment in the implementation.

Lastly, a nipick but I would reword the error message to say "all manifests" which includes also 1 (in my opinion) and several. This makes it clear we checked all.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion but I think having a flag that is set after os.Open will make it more clear and less specific. Something like found set to true if one call to os.Open succeeds. And it definitely needs a comment in the implementation.

Do you mean instead of foundUnknownSchema? I think your proposition would solve a different problem, and not deal with schema version compatibility. What case would that handle?

Lastly, a nipick but I would reword the error message to say "all manifests" which includes also 1 (in my opinion) and several. This makes it clear we checked all.

Good point, I will fix it.

mfestSchema := db.Schema()
if mfestSchema != Schema {
return nil, fmt.Errorf("unknown schema version %q", mfestSchema)
return nil, fmt.Errorf("%w %q", ErrUnknownSchema, mfestSchema)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have to maintain compatibility on the message contents, no consumer should depend on that. I know about hyrum law and what not. But we are not Go standard library and we cannot fix error messages forever. If we did that then all errors messages are fixed forever, here and in internal.

There are multiple ways to create named errors in Go, this is one of them. I think you can make it work with a nice message.

test -s ${ROOTFS}/etc/debian_version
test -s ${ROOTFS}/etc/issue
cat ${ROOTFS}/etc/foo | grep "qux"
test -f ${ROOTFS}/chisel/manifest.wall
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could check qux being included in the manifest if you wanted to. I don't have a strong opinion on how extensive the spread tests should be.

if !os.IsExist(err) {
return err
}
err = os.RemoveAll(dstPath)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Note to reviewer]: This can silently remove a directory full of user content. This is inline with the simple strategy implemented in this PR but it should likely evolve when a more fine-grained strategy is implemented to "spare" user content.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants