MGMT-22546: Fix TNA and TNF dummy ip for ipv6#369

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from giladravid16:tna_ipv6
Jan 9, 2026

Conversation

@giladravid16
Contributor

@giladravid16 giladravid16 commented Sep 8, 2025

A bit of background:
When installing TNA/TNF clusters using assisted service, one of the master nodes acts as the bootstrap.
So during the installation there will only be one master node, but we need two in order to configure keepalived.
We cannot wait until the bootstrap finishes and becomes a master, because then no node will have the API VIP.
To circumvent that, we temporarily add a dummy IP to the list of nodes.
After the bootstrap becomes a master node, its IP replaces the dummy IP in the list.

What does this PR do:
Right now the dummy IP is always 0.0.0.0, but that doesn't work for clusters that use IPv6.
This PR fixes that: if the VIP is an IPv4 address, the dummy IP will be 0.0.0.0; otherwise the dummy IP will be ::

@openshift-ci-robot
Contributor

@giladravid16: This pull request explicitly references no jira issue.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 8, 2025
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 8, 2025
@openshift-ci
Contributor

openshift-ci Bot commented Sep 8, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 7, 2025
@giladravid16 giladravid16 changed the title NO-ISSUE: Fix TNA and TNF dummy ip for ipv6 MGMT-22546: Fix TNA and TNF dummy ip for ipv6 Dec 24, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Dec 24, 2025

@giladravid16: This pull request references MGMT-22546 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.


@giladravid16 giladravid16 marked this pull request as ready for review December 24, 2025 13:31
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 24, 2025
@openshift-ci openshift-ci Bot requested review from emy and mkowalski December 24, 2025 13:32
@giladravid16
Contributor Author

After this is merged, can it also be backported all the way to 4.20?

@mkowalski
Contributor

@giladravid16 yes, it can, but you are responsible for Jira hygiene. You need a bug opened with the Target Version field set to 4.22.0; only then can we go ahead with the backport.

@mkowalski
Contributor

/test ?

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: The following commands are available to trigger required jobs:

/test e2e-metal-ipi-ovn-ipv6
/test gofmt
/test govet
/test images
/test okd-scos-images
/test security
/test unit
/test verify-deps

The following commands are available to trigger optional jobs:

/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-ipv4
/test e2e-openstack
/test okd-scos-e2e-aws-ovn

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-baremetal-runtimecfg-main-e2e-metal-ipi-ovn-ipv6
pull-ci-openshift-baremetal-runtimecfg-main-gofmt
pull-ci-openshift-baremetal-runtimecfg-main-govet
pull-ci-openshift-baremetal-runtimecfg-main-images
pull-ci-openshift-baremetal-runtimecfg-main-okd-scos-images
pull-ci-openshift-baremetal-runtimecfg-main-security
pull-ci-openshift-baremetal-runtimecfg-main-unit
pull-ci-openshift-baremetal-runtimecfg-main-verify-deps

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mkowalski
Contributor

/payload ?

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@mkowalski
Contributor

/payload periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-dualstack

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@mkowalski
Contributor

/payload-job periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-dualstack

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-dualstack

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e8b9efc0-ebab-11f0-816d-1074d99e701d-0

@mkowalski
Contributor

/payload-job periodic-ci-openshift-release-master-nightly-4.22-e2e-metal-ipi-ovn-dualstack-techpreview

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.22-e2e-metal-ipi-ovn-dualstack-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/08c0f110-ebac-11f0-830b-52dbe5cfa958-0

@mkowalski
Contributor

/approve
/lgtm

/hold
Waiting for payload jobs to succeed

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 7, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5a664740-ebac-11f0-8322-31ccbdeeac4f-0

@mkowalski
Contributor

/lgtm cancel

@giladravid16, even with your patch the e2e-agent-ovn-two-node-arbiter-ipv6 job failed. Please look at https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-baremetal-runtimecfg-369-nightly-4.22-e2e-agent-ovn-two-node-arbiter-ipv6/2008835257893130240 and figure out what went wrong.

Did you manually test this patch and confirm that it works, or is it just an attempted fix?

@openshift-ci openshift-ci Bot removed lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 7, 2026
@giladravid16
Contributor Author

@mkowalski I tested it with Assisted's CI in openshift/release#72884. I used a custom release image of OCP 4.20 and this PR.

@mkowalski
Contributor

> I tested it with Assisted's CI in openshift/release#72884

Do you mean the ci/rehearse/openshift/assisted-service/master/edge-e2e-metal-assisted-kube-api-tna-4-19 test or some other? If some other, can I please get a link to the passing Prow job? I am trying to see something that was IPv6 and succeeded.

/payload-job periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-ipv6

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.22-e2e-agent-ovn-two-node-arbiter-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/56d17600-ebd3-11f0-8308-6f803e4d321f-0

@mkowalski
Contributor

/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-agent-ovn-two-node-arbiter-ipv6

@openshift-ci
Contributor

openshift-ci Bot commented Jan 7, 2026

@mkowalski: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.21-e2e-agent-ovn-two-node-arbiter-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/76af1d60-ebd3-11f0-9c0c-89ae09dab6f5-0

@giladravid16
Contributor Author

@mkowalski yes, that's the job.
The e2e-agent-ovn-two-node-arbiter-ipv6 job failed before the installation started; it failed during preparing-for-installation.
The reason is that the arbiter node was unable to pull an image, even though the masters were able to.
I'm pretty sure the issue is that the arbiter node doesn't have enough RAM: during this phase each host's filesystem should be half of its RAM.
The arbiter has 8GB of RAM, so its filesystem is 4GB, and the image it fails to pull is 2GB.

@giladravid16
Contributor Author

The job ci/rehearse/openshift/assisted-service/master/edge-e2e-metal-assisted-kube-api-tna-4-19 installs 2 clusters - one uses ipv4 and the other ipv6.
The files that belong to the ipv6 cluster have assisted-spoke-cluster-f62795d5 in their names.
For example here's a node in the cluster, and the agent cluster install (where you can see the vips).

And as I said in my previous comment, the jobs you ran fail before the installation starts.
You can see it in the job's logs - the arbiter can't pull quay-proxy.ci.openshift.org/openshift/ci@sha256:aea3543b56f95f21fd574aff73c2ae7baffca24a77a7f75c26617be2e424a678 and I think it's because it doesn't have enough space for it.
You can compare it to the periodic job's logs where the installation does start, but gets stuck on waiting-for-bootkube.

@mkowalski
Contributor

/approved
/lgtm
/verified by @giladravid16

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jan 8, 2026
@openshift-ci-robot
Contributor

@mkowalski: This PR has been marked as verified by @giladravid16.


@mkowalski
Contributor

/hold cancel

@openshift-ci openshift-ci Bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jan 8, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Jan 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giladravid16, mkowalski

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2026
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 6894aa7 and 2 for PR HEAD 11ead2f in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 159b0dd and 1 for PR HEAD 11ead2f in total

@openshift-ci
Contributor

openshift-ci Bot commented Jan 9, 2026

@giladravid16: all tests passed!

Full PR test history. Your PR dashboard.


@openshift-merge-bot openshift-merge-bot Bot merged commit e5f1bad into openshift:main Jan 9, 2026
9 checks passed
@giladravid16
Contributor Author

/cherry-pick release-4.21

@openshift-cherrypick-robot

@giladravid16: new pull request created: #378


@giladravid16
Contributor Author

/cherry-pick release-4.20

@openshift-cherrypick-robot

@giladravid16: new pull request created: #379


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. verified Signifies that the PR passed pre-merge verification criteria

5 participants