Some builds are too hungry for login nodes #362

@dougiesquire

We've recently had builds fail on the Gadi login nodes because they use too many resources (memory, I think).

For example, see this ACCESS-OM3 prerelease, which has particularly high memory usage due to LTO flags. This failed with:

ifx: error #10106: Fatal error in /apps/intel-tools/.packages/2025.2.0.575/compiler/2025.2/bin/compiler/ld.lld, terminated by kill signal

while building access3. I can reproduce the failure using my own spack instance on a login node, but can build successfully on a Sapphire-Rapids compute node (104 cpus, 496G mem).

Some options have been floated on Zulip for preventing these failures:

  • Have the option to run particularly meaty builds on compute nodes. One complication of this is that Gadi compute nodes do not have network access, so sources would need to be mirrored on the login node first. I've typically done this in the past with something like:

    [login-node] $ spack env activate <env>
    [login-node] $ spack concretize -f --fresh
    [login-node] $ spack mirror create -d sources -a
    
    <Start interactive job>
    
    [compute-node] $ spack env activate <env>
    [compute-node] $ spack install
    
  • From @aidanheerdegen

    We could ask NCI if they could increase the allowed memory for the service user ... it would be a lot simpler.

  • From @aidanheerdegen

    We might be able to have a simpler qsub version of a build by using a persistent session tunnel to access the source code, so making it more transparent to spack.

  • From Angus Gibson

    As far as I can tell there's a 4GB limit of resident memory (I'm not sure if per session or per process), and after that it'll start spilling into swap. There's only 16GB swap and that's shared across all users of the node. On e.g. gadi-login-06 it's currently almost all used:

    $ free -h
                  total        used        free      shared  buff/cache   available
    Mem:          250Gi       108Gi        36Gi        13Gi       104Gi       126Gi
    Swap:          15Gi        15Gi       245M
    

    In fact, my test got killed after allocating only 2.8GB there... But on gadi-login-03 there's about 11Gi free and I could get a lot more. So there's a lot of non-determinism around where the builder lands, particularly if LTO is a bit hungry (or just a complex compile...)
    ...
    Seems to dip into swap at around 4GB combined. It's probably actually per-user, because the cgroup is /sys/fs/cgroup/memory/user.slice/user-$(id -u).slice/memory.limit_in_bytes (= 4294967296)

    We could potentially try to target nodes that aren't under heavy load?
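
To make the "target nodes that aren't under heavy load" idea concrete, here is a minimal sketch of a pre-build check. It reads the per-user cgroup (v1) memory limit from the path quoted above and the free swap from `/proc/meminfo`. The function names and the 1 GiB free-swap threshold are hypothetical, chosen only for illustration, and the cgroup path is assumed to use the numeric UID from `id -u`. Nothing here is part of the existing build tooling.

```shell
#!/bin/sh
# Sketch: report the per-user memory cap and free swap on the current node.
# Function names and the swap threshold are hypothetical.

# Convert a byte count to whole GiB (integer division).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Print the per-user cgroup (v1) memory limit in bytes, if readable.
# Default path is the one quoted in this issue; other cgroup layouts differ.
user_mem_limit_bytes() {
    limit_file="${1:-/sys/fs/cgroup/memory/user.slice/user-$(id -u).slice/memory.limit_in_bytes}"
    [ -r "$limit_file" ] || return 1
    cat "$limit_file"
}

# Print free swap in kB, parsed from a /proc/meminfo-style file.
swap_free_kb() {
    meminfo_file="${1:-/proc/meminfo}"
    [ -r "$meminfo_file" ] || return 1
    awk '/^SwapFree:/ {print $2}' "$meminfo_file"
}

# Succeed only if at least $1 kB of swap is free on this node.
node_has_headroom() {
    min_swap_kb="$1"
    free_kb="$(swap_free_kb "${2:-/proc/meminfo}")"
    [ -n "$free_kb" ] && [ "$free_kb" -ge "$min_swap_kb" ]
}

# Report for the current node; each step degrades gracefully if a file is missing.
limit="$(user_mem_limit_bytes)"
if [ -n "$limit" ]; then
    echo "per-user memory limit: $(bytes_to_gib "$limit") GiB"
else
    echo "per-user memory limit: not readable on this node"
fi

# 1048576 kB = 1 GiB of free swap (arbitrary threshold for illustration).
if node_has_headroom 1048576; then
    echo "swap headroom: ok"
else
    echo "swap headroom: low, consider another node"
fi
```

With the `free -h` output shown for gadi-login-06 (roughly 245M of swap free), the headroom check would fail under this threshold. Note that free swap is only a rough proxy for load: the 4GB per-user cap applies regardless of which node the builder lands on.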
