Support image tar, without accessing Docker daemon by vulyon · Pull Request #256 · goldmann/docker-squash

vulyon · 2025-07-04T07:43:51Z

Enable Docker Daemon-Free Image Squashing

This PR directly addresses and resolves Issue #24: "Make it possible to run squashing without accessing Docker daemon" by introducing the ability for docker-squash to directly process Docker images from tar files. This eliminates the need for a running Docker daemon, significantly enhancing flexibility for image optimization in CI/CD pipelines, air-gapped environments, and systems without Docker installed.

Key Benefits

No Docker Daemon Required: Squash images anywhere you can save them as a tar file.
Ideal for Restricted Environments: Works seamlessly in CI/CD, air-gapped setups, or when Docker isn't running.
OCI Format Support: The tool automatically detects and processes images from OCI-formatted input tar files. Please note: Currently, testing has been focused solely on OCI format images.
Preserves Layer History: Ensures compatibility and traceability for your squashed images.

How to Use

Export the Image:

$ docker save -o source.tar jboss/wildfly:latest

Squash from Tar:
```
$ python -m docker_squash.cli --input-tar source.tar --tag jboss/wildfly:squashed -f 8 --output-path squashed.tar --load-image false
```
- Use --input-tar for your source image file.
- --tag is recommended for the new image name.
- --output-path specifies where to save the squashed tar file.
- --load-image false prevents the tool from attempting to load the image directly into a Docker daemon.
Load into Docker (Optional):
```
$ docker load -i squashed.tar
```

This enhancement significantly broadens docker-squash's utility, making image size optimization more accessible across diverse development and deployment scenarios.

vulyon · 2025-07-04T08:13:04Z

Hello @goldmann @rnc, it appears there is an issue with the runner environment. The CI/CD checks are failing because of the Ubuntu 20.xx runner deprecation (as per the error message). This doesn't seem to be related to my code changes.

rnc · 2025-07-04T14:52:28Z

@lyon-v I've fixed the runners and rebased. However, I think you need to run ./support/run_formatter.py to fix up the formatting as well.

vulyon · 2025-07-06T09:54:53Z

@rnc Hi there! Apologies for the delayed response. I've fixed the code formatting and it has passed the checks now.

vulyon · 2025-07-07T02:44:48Z

@rnc @goldmann Do I need to squash these two commits into a single one?

rnc

In general, I think this is a great idea. However, there are some areas I have questions or comments on, and tests are also required, please. Thanks very much for the PR!

rnc · 2025-07-14T09:58:03Z

README.rst

+
+::
+
+    $ python -m docker_squash.cli --input-tar source.tar --tag jboss/wildfly:squashed -f 8 --output-path squashed.tar --load-image false


I think this should be run without the -f parameter, as the log output below shows the squashed image being larger than the original, which is a confusing result for a README. Also, if both the docker squash and tar squash examples show the same result, IMHO it's more intuitive.

That is because the jboss/wildfly:latest image has changed:

(base) root@master:~# docker pull jboss/wildfly:latest
latest: Pulling from jboss/wildfly
f87ff222252e: Pull complete
8116b2f7ca5a: Pull complete
0b43aea4eeb1: Pull complete
13776e8da872: Pull complete
f26d32e28c29: Pull complete
Digest: sha256:35320abafdec6d360559b411aff466514d5741c3c527221445f48246350fdfe5
Status: Downloaded newer image for jboss/wildfly:latest
docker.io/jboss/wildfly:latest

(base) root@master:~# docker history jboss/wildfly:latest
IMAGE          CREATED       CREATED BY                                      SIZE   COMMENT
35320abafdec   3 years ago   /bin/sh -c #(nop) CMD ["/opt/jboss/wildfly/…    0B
<missing>      3 years ago   /bin/sh -c #(nop) EXPOSE 8080                   0B
<missing>      3 years ago   /bin/sh -c #(nop) USER jboss                    0B
<missing>      3 years ago   /bin/sh -c #(nop) ENV LAUNCH_JBOSS_IN_BACKG…    0B
<missing>      3 years ago   /bin/sh -c cd $HOME && curl -L -O https:…       270MB
<missing>      3 years ago   /bin/sh -c #(nop) USER root                     0B
<missing>      3 years ago   /bin/sh -c #(nop) ENV JBOSS_HOME=/opt/jboss…    0B
<missing>      3 years ago   /bin/sh -c #(nop) ENV WILDFLY_SHA1=238e67f4…    0B
<missing>      3 years ago   /bin/sh -c #(nop) ENV WILDFLY_VERSION=25.0.…    0B
<missing>      4 years ago   /bin/sh -c #(nop) ENV JAVA_HOME=/usr/lib/jv…    0B
<missing>      4 years ago   /bin/sh -c #(nop) USER jboss                    0B
<missing>      4 years ago   /bin/sh -c yum -y install java-11-openjdk-de…   239MB
<missing>      4 years ago   /bin/sh -c #(nop) USER root                     0B
<missing>      4 years ago   /bin/sh -c #(nop) MAINTAINER Marek Goldmann…    0B
<missing>      4 years ago   /bin/sh -c #(nop) USER jboss                    0B
<missing>      4 years ago   /bin/sh -c #(nop) WORKDIR /opt/jboss            0B
<missing>      4 years ago   /bin/sh -c groupadd -r jboss -g 1000 && user…   406kB
<missing>      4 years ago   /bin/sh -c yum update -y && yum -y install x…   33.5MB
<missing>      4 years ago   /bin/sh -c #(nop) MAINTAINER Marek Goldmann…    0B
<missing>      5 years ago   /bin/sh -c #(nop) CMD ["/bin/bash"]             0B
<missing>      5 years ago   /bin/sh -c #(nop) LABEL org.label-schema.sc…    0B
<missing>      5 years ago   /bin/sh -c #(nop) ADD file:61908381d3142ffba…   222MB

I will fix README.rst.

rnc · 2025-07-14T10:02:58Z

docker_squash/cli.py

+        parser.add_argument(
+            "--input-tar",
+            help="Path to tar file created by 'docker save'. Process tar file directly without requiring Docker daemon.",
+        )


I think we should investigate using mutually exclusive groups in argparse, as that has built-in support for accepting either the --input-tar or the image option and would avoid the manual checks below.

Also, I think it's valid for output-path to be the same as input-tar (?). In tar mode, should this be the default?

Great! I have made the code changes.
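
For reference, a minimal sketch of the mutually exclusive group idea. It assumes the positional image argument is made optional with nargs="?", since argparse only accepts optional arguments in such a group. The parser layout and the fallback for --output-path are illustrative assumptions based on the questions above, not the code in this PR.

```python
import argparse


def build_parser():
    """Build a parser where the image source is exactly one of two options."""
    parser = argparse.ArgumentParser(prog="docker-squash")
    # A required mutually exclusive group rejects "both" and "neither".
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("image", nargs="?", help="Image to squash (daemon mode)")
    source.add_argument(
        "--input-tar", help="Path to tar file created by 'docker save'"
    )
    parser.add_argument("--output-path", help="Where to write the squashed tar")
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    # Hypothetical default raised in review: in tar mode, write back to the
    # input tar when no --output-path is given.
    if args.input_tar and args.output_path is None:
        args.output_path = args.input_tar
    return args
```

With this layout, passing both an image name and --input-tar, or neither, makes argparse exit with a usage error, so no manual validation is needed.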

rnc · 2025-07-14T10:15:38Z

docker_squash/tar_image.py

+
+    def __init__(
+        self, log, tar_path, from_layer=None, tmp_dir=None, tag=None, comment=""
+    ):


TarImage derives from Image (which is good) but isn't calling super().__init__(). Further, I think it duplicates some code from image.py (and potentially v2_image). Could more be done to normalise the code and avoid duplication?

Yes, I will fix this.
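
A rough sketch of what calling the base initialiser could look like. The Image class below is a hypothetical stand-in written only so the example runs on its own; its parameters and attributes are assumptions and do not reflect the real signature in image.py.

```python
class Image:
    """Hypothetical stand-in for the shared base class (not the real one)."""

    def __init__(self, log, from_layer=None, tmp_dir=None, tag=None, comment=""):
        self.log = log
        self.from_layer = from_layer
        self.tmp_dir = tmp_dir
        self.tag = tag
        self.comment = comment


class TarImage(Image):
    """Tar-backed image that reuses the shared setup from the base class."""

    def __init__(
        self, log, tar_path, from_layer=None, tmp_dir=None, tag=None, comment=""
    ):
        # Delegate common state to the base class instead of duplicating it.
        super().__init__(
            log, from_layer=from_layer, tmp_dir=tmp_dir, tag=tag, comment=comment
        )
        # Only the tar-specific state is set here.
        self.tar_path = tar_path
```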

rnc · 2025-07-14T10:18:13Z

README.rst

+- Works in CI/CD pipelines and restricted environments  
+- Supports both Docker format and OCI format images
+- Maintains complete layer history compatibility
+- Can process images on systems where Docker is not installed


I would imagine that it's helpful when working with Podman as well.

Absolutely! That's a great point. The --input-tar feature is indeed very helpful for Podman users.

Since Podman uses podman save to export images in the same tar format as docker save, users can now:

    # Export image with Podman
    podman save myimage:latest -o image.tar

    # Squash with docker-squash (no Docker daemon required)
    docker-squash --input-tar image.tar --tag myimage:squashed --output-path squashed.tar

    # Import back to Podman
    podman load -i squashed.tar

This workflow is particularly valuable in environments where:

Only Podman is available (no Docker daemon)

Running in CI/CD pipelines with Podman

Working in rootless containers or restricted environments

Processing images offline without any container runtime

Should I add a Podman example to the documentation to highlight this use case?

rnc · 2025-07-14T10:24:08Z

docker_squash/tar_image.py

+            self.log.info("Detected Docker format image")
+            self.oci_format = False
+        else:
+            raise SquashError("Unable to detect image format - missing manifest files")


Is this duplicating v2_image::_get_manifest ?

You're absolutely right! There is indeed duplication with v2_image::_get_manifest. Both methods:

Check for index.json to detect OCI format

Set self.oci_format = True/False

Handle manifest file reading

I should refactor this to reuse the existing logic. A few options:

Option 1: Extract common logic to base class

    # In Image base class
    def detect_image_format(self):
        if os.path.exists(os.path.join(self.old_image_dir, "index.json")):
            self.oci_format = True
            return "oci"
        elif os.path.exists(os.path.join(self.old_image_dir, "manifest.json")):
            self.oci_format = False
            return "docker"
        else:
            raise SquashError("Unable to detect image format")

Option 2: Have TarImage reuse v2_image's get_manifest

    # In TarImage
    def detect_image_format(self):
        try:
            # This will set self.oci_format as a side effect
            self.manifest = self.get_manifest()  # Inherit from v2_image logic
        except SquashError:
            raise SquashError("Unable to detect image format")

I lean toward Option 1 as it's cleaner separation of concerns. What do you think?
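
As a standalone illustration of the same check, here is a sketch that inspects the tar archive directly rather than an extracted directory. It is a variation on Option 1, not the code in this PR: the function takes a path instead of using self.old_image_dir, and SquashError is redefined locally as a stand-in so the snippet runs by itself.

```python
import tarfile


class SquashError(Exception):
    """Local stand-in for docker-squash's error type."""


def detect_image_format(tar_path):
    """Return "oci" or "docker" based on marker files at the archive root."""
    with tarfile.open(tar_path) as tar:
        # Some tools store root entries with a leading "./"; normalise that.
        names = {n[2:] if n.startswith("./") else n for n in tar.getnames()}
    # Same precedence as Option 1: index.json first, then manifest.json.
    if "index.json" in names:
        return "oci"
    if "manifest.json" in names:
        return "docker"
    raise SquashError("Unable to detect image format - missing manifest files")
```

Checking index.json first keeps the same order as the snippets above, so an archive that carries both files is treated as OCI.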

rnc · 2025-07-14T10:43:45Z

docker_squash/cli.py

+            self.log.info(
+                "💡 Tip: Consider using --tag to specify a name for your squashed image"
+            )
+            self.log.info("   Example: --tag myimage:squashed")


Does a tag make sense for an output tar? It is probably only relevant if --load-image has been specified?

I respectfully disagree with this assessment. The --tag parameter is meaningful for output tar files regardless of the --load-image setting. Here's why:

Tag is part of image metadata in tar format:

Docker/Podman tar format stores tags in manifest.json under RepoTags field

This metadata becomes part of the squashed tar file

Tag is useful in all scenarios:

--load-image true: Image gets loaded with the specified tag

--load-image false + --output-path: The output tar contains tag metadata, so when someone later runs docker load -i squashed.tar, the image will have the proper tag

Distribution: Tagged tar files are more useful when shared with others

Without --tag, the consequences are significant:

    # Without tag - image loads but has no name
    $ docker load -i squashed.tar
    Loaded image ID: sha256:abc123...
    $ docker images
    REPOSITORY   TAG      IMAGE ID
    <none>       <none>   sha256:abc123...   # Hard to identify!

    # With tag - much more usable
    $ docker load -i squashed.tar
    Loaded image: myapp:squashed
    $ docker images
    REPOSITORY   TAG        IMAGE ID
    myapp        squashed   sha256:abc123...   # Clear identification

The tip message encourages good practices for tar-based workflows, not just --load-image scenarios. The tag becomes part of the portable tar artifact.
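
To check which tags a tar actually carries without loading it, something like the following sketch could be used. It assumes a Docker-format archive whose root manifest.json is a list of entries with a RepoTags field, as described above; the helper name read_repo_tags is made up for this example.

```python
import json
import tarfile


def read_repo_tags(tar_path):
    """Return all RepoTags recorded in a Docker-format tar's manifest.json."""
    with tarfile.open(tar_path) as tar:
        # extractfile raises KeyError if the archive has no manifest.json.
        with tar.extractfile("manifest.json") as fh:
            manifest = json.load(fh)
    tags = []
    for entry in manifest:
        # RepoTags can be null for an untagged image, so fall back to [].
        tags.extend(entry.get("RepoTags") or [])
    return tags
```

An empty result here corresponds to the untagged case above, where the loaded image shows up without a repository name or tag.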

rnc · 2025-08-05T08:49:05Z

@lyon-v Did you wish to discuss any of the comments?

vulyon · 2025-08-20T02:57:06Z

My apologies for the slow response. I've been swamped with work lately, but I'll reply to or fix these issues shortly.

support image tar

f04a699

Fix: Apply code formatting

c5feb62

Merge branch 'main' into support-image-tar

de60f37

rnc requested changes Jul 14, 2025

wuliang added 2 commits August 20, 2025 08:08

fix code

ff5a571

fix imported but unused

f6e7b56

vulyon requested a review from rnc August 20, 2025 08:22

fix manifest & readme

465fcba

vulyon closed this by deleting the head repository Nov 20, 2025

