Makefile: add link-time optimization support #420

Open
pjonsson wants to merge 2 commits into davidfrantz:develop from pjonsson:docker-enable-lto

Conversation

@pjonsson
Contributor

Add flags for enabling link-time
optimization during compilation
by calling "make LTO=yes" and use
this in the Dockerfile.

I cannot find a definition of
LDFLAGS, so just chuck it into
GDAL_FLAGS which should be used
for the binaries that do heavy
computation.
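
For readers who want to see the mechanism in isolation, here is a minimal sketch of how a `make LTO=yes` switch can append `-flto` to a flags variable. It writes a throwaway Makefile and is not the actual FORCE Makefile: the `show` target, the `demo_lto_flags` helper, and the `-O3` baseline are assumptions for illustration, and the `ifeq` conditional requires GNU make.

```shell
# Hypothetical demo: a throwaway Makefile, not FORCE's real one.
demo_lto_flags() {
    tmp=$(mktemp -d) || return 1
    # The quoted 'EOF' keeps the shell from expanding $(...) inside the Makefile.
    cat > "$tmp/Makefile" <<'EOF'
GDAL_FLAGS = -O3
ifeq ($(LTO),yes)
GDAL_FLAGS += -flto
endif
show: ; @echo "GDAL_FLAGS=$(GDAL_FLAGS)"
EOF
    # Extra arguments (e.g. LTO=yes) are passed to make as command-line variables.
    make -s -f "$tmp/Makefile" show "$@"
    rm -rf "$tmp"
}
```

Calling `demo_lto_flags` prints the baseline flags, and `demo_lto_flags LTO=yes` prints them with `-flto` appended. Because `-flto` has to reach both the compile and the link step, putting it in a variable that is used for both is a simple way to get there when no separate LDFLAGS exists.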

I don't have any performance
comparisons, but compiling with LTO
makes the binary noticeably smaller,
before:

$ ls -l `which force-l2ps`
-rwxr-xr-x 1 root root 2668408 Mar 16 22:02 /usr/local/bin/force/force-l2ps

and after:

$ ls -l `which force-l2ps`
-rwxr-xr-x 1 root root 1986848 Mar 16 22:14 /usr/local/bin/force/force-l2ps

Reducing the binary size is likely
to improve the instruction cache
hit rate even if we pessimistically
assume that LTO didn't manage to make
any other optimization.
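
To put a number on "noticeably smaller", the two sizes from the listings can be turned into a percentage. This is only a small sketch; the helper name `size_reduction_pct` is made up for illustration.

```shell
# Percentage by which a file shrank, given its size in bytes before and after.
size_reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Sizes taken from the two ls -l listings of force-l2ps.
size_reduction_pct 2668408 1986848   # prints 25.5
```

For binaries on disk, the byte counts could be read with `wc -c < file` instead of being typed in by hand.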

@pjonsson
Contributor Author

The sample size is 1, but here are some rough measurements of the CPU-minutes required to create an L2A product from a Sentinel-2 L1C product with force-l2ps on a 16-core AMD Ryzen 9950X3D:

| Compilation flags | CPU usage (CPU-minutes) |
|---|---|
| -O3 | 42:48 |
| -O3 -flto | 42:04 |
| -O2 -flto | 40:28 |

It's my desktop machine, so web browsers and other things were running in the background at the same time, but the FORCE container's window had focus throughout.
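
As a rough aid for reading the measurements, the sketch below converts the timings to seconds and expresses each LTO build as a saving relative to plain `-O3`. It assumes the values are minutes:seconds of CPU time, and the helper names `to_seconds` and `cpu_saving_pct` are made up for illustration. With a single run on a busy desktop, differences this small may well be noise.

```shell
# Assumption: the timings in the table are minutes:seconds of CPU time.
to_seconds() {
    # "42:48" -> 2568
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Percentage of CPU time saved by "new" relative to "base".
cpu_saving_pct() {
    base=$(to_seconds "$1")
    new=$(to_seconds "$2")
    awk -v b="$base" -v n="$new" 'BEGIN { printf "%.1f\n", (b - n) / b * 100 }'
}

cpu_saving_pct 42:48 42:04   # -O3 vs -O3 -flto, prints 1.7
cpu_saving_pct 42:48 40:28   # -O3 vs -O2 -flto, prints 5.5
```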

@davidfrantz
Owner

This seems interesting. I will run some tests before merging, though.
