Skip to content

Introduce v1alpha2 API with string-based duration fields and conversion#2136

Draft
Joeavaikath wants to merge 10 commits intoopenshift:oadp-devfrom
Joeavaikath:oadp-3471
Draft

Introduce v1alpha2 API with string-based duration fields and conversion#2136
Joeavaikath wants to merge 10 commits intoopenshift:oadp-devfrom
Joeavaikath:oadp-3471

Conversation

@Joeavaikath
Copy link
Copy Markdown
Contributor

Why the changes were made

Fixes oadp-3471

How to test the changes made

Should be able to create a DPA with the new API version.
This allows the user to specify the timout in the string format (metav1.Duration), and the conversion tech helps translate it to the existing version avoiding issues.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 26, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Excluded labels (none allowed) (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e55f7e32-14f4-4817-b3aa-0c540afc8d98

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

Adds a new v1alpha2 DataProtectionApplication API (types, deepcopy, scheme), bidirectional conversion with v1alpha1 (hub) including duration-type translation, conversion webhook registration and manifests, manager/webhook wiring, unit tests, design docs, and kustomize/OLM bundle updates.

Changes

Cohort / File(s) Summary
v1alpha2 API & deepcopy
api/v1alpha2/dataprotectionapplication_types.go, api/v1alpha2/groupversion_info.go, api/v1alpha2/zz_generated.deepcopy.go
Adds full v1alpha2 CRD types, enums/constants, helper methods (BackupImages, GetDisableInformerCache, AutoCorrect), scheme registration, and autogenerated deepcopy implementations.
v1alpha2 Conversion Implementation & Tests
api/v1alpha2/dpa_conversion.go, api/v1alpha2/dpa_conversion_test.go
Implements ConvertTo/ConvertFrom between v1alpha2↔v1alpha1 with field-by-field translation (including *metav1.Duration ↔ *time.Duration conversions) and comprehensive round-trip unit tests covering edge cases.
v1alpha2 Webhook & webhook wiring
api/v1alpha2/dpa_webhook.go, cmd/main.go
Adds SetupWebhookWithManager for v1alpha2 and wires webhook setup/registration into operator main, registers v1alpha2 scheme, and adjusts namespace handling for operator/webhook setup.
v1alpha1 Hub marker
api/v1alpha1/dataprotectionapplication_types.go, api/v1alpha1/dpa_conversion.go
Marks v1alpha1 as storage/hub (//+kubebuilder:storageversion) and adds Hub() marker method for conversion.
CRD conversion webhook kustomize & webhook manifests
config/crd/kustomization.yaml, config/crd/patches/webhook_in_dataprotectionapplications.yaml, config/webhook/kustomization.yaml, config/webhook/service.yaml, config/default/kustomization.yaml, config/default/manager_webhook_patch.yaml
Enables CRD conversion webhook patches, adds webhook Service and manager deployment patch exposing port 9443 and serving cert mount, and includes webhook resources in default kustomization.
OLM / bundle updates
bundle/manifests/oadp-operator.clusterserviceversion.yaml, bundle/manifests/openshift-adp-webhook-service_v1_service.yaml, config/manifests/bases/oadp-operator.clusterserviceversion.yaml
Adds v1alpha2 to CSV owned CRDs, adds ConversionWebhook definitions, updates manager container ports/volumes, adds webhook Service manifest, and flips install-mode support (OwnNamespace↔AllNamespaces).
Project config & Makefile
PROJECT, Makefile
Declares v1alpha2 API in PROJECT with conversion webhook enabled; simplifies OperatorGroup YAML in Makefile.
Design & docs
docs/design/oadp-3471-args-duration-type_design.md, docs/design/allnamespaces-install-mode_design.md
Adds design documents describing duration-type change, conversion webhook approach, and AllNamespaces install-mode rationale/implications.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from kaovilai and mrnold March 26, 2026 18:16
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Joeavaikath
Once this PR has been reviewed and has the lgtm label, please assign rayfordj for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Joeavaikath Joeavaikath marked this pull request as draft March 26, 2026 18:24
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2026
@Joeavaikath
Copy link
Copy Markdown
Contributor Author

/hold this approach fails because:

  • Conversion webhooks are cluster-scoped — because CRDs are cluster-scoped, the API server calls the webhook for any namespace
    • OLM requires AllNamespaces install mode when the CSV declares conversionCRDs
    • OADP only supports OwnNamespace — its RBAC, NonAdmin feature, and controller logic all assume single-namespace operation

Since only the main namespace is being watched, maybe changing this isn't the worst thing, but I'm guessing we need discussion around this

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 26, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
api/v1alpha2/dpa_conversion.go (1)

64-310: Keep the handwritten conversion logic as small as the actual API delta.

This file now mirrors almost the entire DPA schema even though the version bump only changes the 10 ServerFlags duration fields. That creates a large drift surface where the next spec/status addition can be missed in one direction and silently disappear at runtime. Prefer generated/shared conversion for the structurally identical fields and keep the custom logic around the duration overrides only.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

Also applies to: 312-803

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1alpha2/dpa_conversion.go` around lines 64 - 310, The conversion
functions (convertSpecToHub, convertSpecFromHub, convertAppConfigToHub,
convertAppConfigFromHub) currently reimplement nearly the entire DPA schema
instead of delegating structurally-identical fields to the generated/shared
conversion helpers; reduce the handwritten logic to only the real delta (the
ServerFlags duration fields) by: invoke the corresponding autogenerated
conversion functions (e.g.
Convert_v1alpha2_DataProtectionApplicationSpec_To_v1alpha1_DataProtectionApplicationSpec
and the reverse) at the top of convertSpecToHub/convertSpecFromHub and for
AppConfig use the generated ApplicationConfig converters, then apply only the
custom overrides for the ServerFlags duration fields (update those specific
fields on dst after the generated conversion); remove the duplicated
field-by-field copying in the listed functions so only the duration overrides
remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/v1alpha2/dataprotectionapplication_types.go`:
- Around line 997-1003: Guard accesses to the nested pointer fields before
dereferencing: update DataProtectionApplication.GetDisableInformerCache and
AutoCorrect to first nil-check dpa.Spec, dpa.Spec.Configuration and
dpa.Spec.Configuration.Velero (ApplicationConfig.Velero) and return safe
defaults (e.g., false for DisableInformerCache) or initialize
dpa.Spec.Configuration.Velero as needed before mutating or reading its fields;
specifically, in GetDisableInformerCache() return false if any of dpa.Spec,
dpa.Spec.Configuration, or dpa.Spec.Configuration.Velero is nil, and in
AutoCorrect() ensure you nil-check and create/assign a Velero struct when you
plan to access/modify its fields so lines that reference
dpa.Spec.Configuration.Velero do not panic.
- Around line 142-145: Update the field documentation for all metav1.Duration
fields (e.g., DefaultBackupTTL) to use only Go duration units (ns, us, ms, s, m,
h) — replace invalid examples like "30d" with an equivalent valid duration such
as "720h" — and ensure examples that disable behavior by specifying zero are
shown as a quoted string or a valid duration (e.g., "0" or "0s") so YAML does
not treat them as integers; apply these changes to DefaultBackupTTL and the
other Duration fields referenced in the comments.

In `@config/default/kustomization.yaml`:
- Around line 21-23: The default kustomization currently enables the conversion
webhook without cert provisioning; update config/default/kustomization.yaml to
either remove the webhook component lines (the two occurrences toggling
../webhook) or enable cert-manager/CA injection in this overlay so the webhook
certs are provisioned; ensure the change aligns with the related patches that
reference webhook-server-cert in manager_webhook_patch.yaml (the mount that
expects the secret) and the CRD conversion patch in
config/crd/patches/webhook_in_dataprotectionapplications.yaml (which switches
the CRD to webhook conversion) so the manager pod can start and TLS for
conversion is satisfied.

---

Nitpick comments:
In `@api/v1alpha2/dpa_conversion.go`:
- Around line 64-310: The conversion functions (convertSpecToHub,
convertSpecFromHub, convertAppConfigToHub, convertAppConfigFromHub) currently
reimplement nearly the entire DPA schema instead of delegating
structurally-identical fields to the generated/shared conversion helpers; reduce
the handwritten logic to only the real delta (the ServerFlags duration fields)
by: invoke the corresponding autogenerated conversion functions (e.g.
Convert_v1alpha2_DataProtectionApplicationSpec_To_v1alpha1_DataProtectionApplicationSpec
and the reverse) at the top of convertSpecToHub/convertSpecFromHub and for
AppConfig use the generated ApplicationConfig converters, then apply only the
custom overrides for the ServerFlags duration fields (update those specific
fields on dst after the generated conversion); remove the duplicated
field-by-field copying in the listed functions so only the duration overrides
remain.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fd36f273-f1ad-4442-b79d-efe475462d0b

📥 Commits

Reviewing files that changed from the base of the PR and between f2f8c5e and fbde3f4.

📒 Files selected for processing (18)
  • PROJECT
  • api/v1alpha1/dataprotectionapplication_types.go
  • api/v1alpha1/dpa_conversion.go
  • api/v1alpha2/dataprotectionapplication_types.go
  • api/v1alpha2/dpa_conversion.go
  • api/v1alpha2/dpa_conversion_test.go
  • api/v1alpha2/dpa_webhook.go
  • api/v1alpha2/groupversion_info.go
  • api/v1alpha2/zz_generated.deepcopy.go
  • cmd/main.go
  • config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml
  • config/crd/kustomization.yaml
  • config/crd/patches/webhook_in_dataprotectionapplications.yaml
  • config/default/kustomization.yaml
  • config/default/manager_webhook_patch.yaml
  • config/webhook/kustomization.yaml
  • config/webhook/service.yaml
  • docs/design/oadp-3471-args-duration-type_design.md

Comment on lines +142 to +145
// How long to wait by default before backups can be garbage collected. Specify as a Go duration string (e.g. "720h", "30d"). (default is 720 hours)
// +optional
DefaultBackupTTL *metav1.Duration `json:"default-backup-ttl,omitempty"`
// How often to verify if the storage is valid. Optional. Set this to "0" to disable sync. Specify as a Go duration string (e.g. "1m", "5m"). (default is 1 minute)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Check the actual file content at the mentioned lines
cd /root && find . -name "dataprotectionapplication_types.go" -path "*/v1alpha2/*" | head -1

Repository: openshift/oadp-operator

Length of output: 122


🏁 Script executed:

# Read the file to verify the actual lines mentioned
head -n 150 api/v1alpha2/dataprotectionapplication_types.go | tail -n 20

Repository: openshift/oadp-operator

Length of output: 2718


🏁 Script executed:

# Check lines around 641, 723, 761
wc -l api/v1alpha2/dataprotectionapplication_types.go

Repository: openshift/oadp-operator

Length of output: 119


🏁 Script executed:

# Get the specific line ranges mentioned in the review
sed -n '142,145p' api/v1alpha2/dataprotectionapplication_types.go
sed -n '641,645p' api/v1alpha2/dataprotectionapplication_types.go
sed -n '723,731p' api/v1alpha2/dataprotectionapplication_types.go
sed -n '761,771p' api/v1alpha2/dataprotectionapplication_types.go

Repository: openshift/oadp-operator

Length of output: 1695


🏁 Script executed:

# Verify Go's time.ParseDuration behavior with "30d"
python3 << 'EOF'
# Go's time.ParseDuration does NOT accept "30d" format
# It only accepts: ns, us, ms, s, m, h (nanoseconds through hours)
# "30d" would cause: "time: invalid duration \"30d\""
# Valid formats: "30h", "720h", "0s", "1m", "5m", etc.

# Let's verify the Go documentation
valid_units = ["ns", "us", "ms", "s", "m", "h"]
test_cases = [
    ("720h", True),   # 720 hours - valid
    ("30d", False),   # days - NOT valid
    ("1m", True),     # 1 minute - valid
    ("5m", True),     # 5 minutes - valid
    ("0s", True),     # 0 seconds - valid
    ("0", False),     # bare 0 without unit - NOT valid as Go duration
]

print("Go time.ParseDuration accepted units: ", valid_units)
print("\nTest case validities:")
for duration_str, is_valid in test_cases:
    print(f"  {duration_str:10} -> {is_valid}")
EOF

Repository: openshift/oadp-operator

Length of output: 290


🏁 Script executed:

# Let's also verify YAML parsing behavior
python3 << 'EOF'
import yaml

# Test how YAML parses bare 0 vs "0"
yaml_tests = [
    'duration: 0',        # bare 0 -> parsed as int
    'duration: "0"',      # quoted "0" -> parsed as string
    'duration: "0s"',     # quoted "0s" -> parsed as string
]

print("YAML parsing behavior:\n")
for yaml_str in yaml_tests:
    parsed = yaml.safe_load(yaml_str)
    duration_val = parsed['duration']
    print(f"  Input:  {yaml_str:20} -> Type: {type(duration_val).__name__:5} Value: {repr(duration_val)}")
EOF

Repository: openshift/oadp-operator

Length of output: 260


Fix invalid metav1.Duration examples in field documentation

Go's time.ParseDuration only accepts units: ns, us, ms, s, m, h. The line 142 example "30d" (days) is invalid and will be rejected at runtime. Additionally, lines 641, 723, and 761 document disabling via bare 0, which YAML parses as an integer—users need a quoted string like "0" or "0s" for the API to accept it. These comments populate CRD documentation, so invalid examples will direct users toward manifests the API rejects.

📝 Suggested doc fixes
- // How long to wait by default before backups can be garbage collected. Specify as a Go duration string (e.g. "720h", "30d"). (default is 720 hours)
+ // How long to wait by default before backups can be garbage collected. Specify as a Go duration string (e.g. "720h"). (default is 720 hours)

- // backupSyncPeriod defines how frequently to sync backup API objects from object storage. A value of 0 disables sync.
+ // backupSyncPeriod defines how frequently to sync backup API objects from object storage. A value of "0" (or "0s") disables sync.

- // ValidationFrequency defines how frequently to validate the corresponding object storage. A value of 0 disables validation.
+ // ValidationFrequency defines how frequently to validate the corresponding object storage. A value of "0" (or "0s") disables validation.

- // GarbageCollectionPeriod defines how frequently to look for possible leftover non admin related objects in OADP namespace.
- // A value of 0 disables garbage collection.
+ // GarbageCollectionPeriod defines how frequently to look for possible leftover non admin related objects in OADP namespace.
+ // A value of "0" (or "0s") disables garbage collection.

- // BackupSyncPeriod specifies the interval at which backups from the OADP namespace are synchronized with non-admin namespaces.
- // A value of 0 disables sync.
+ // BackupSyncPeriod specifies the interval at which backups from the OADP namespace are synchronized with non-admin namespaces.
+ // A value of "0" (or "0s") disables sync.

Also applies to: 641-645, 723-731, 761-771

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1alpha2/dataprotectionapplication_types.go` around lines 142 - 145,
Update the field documentation for all metav1.Duration fields (e.g.,
DefaultBackupTTL) to use only Go duration units (ns, us, ms, s, m, h) — replace
invalid examples like "30d" with an equivalent valid duration such as "720h" —
and ensure examples that disable behavior by specifying zero are shown as a
quoted string or a valid duration (e.g., "0" or "0s") so YAML does not treat
them as integers; apply these changes to DefaultBackupTTL and the other Duration
fields referenced in the comments.

Comment on lines +997 to +1003
// Default DisableInformerCache behavior when nil to false
func (dpa *DataProtectionApplication) GetDisableInformerCache() bool {
if dpa.Spec.Configuration.Velero.DisableInformerCache == nil {
return false
}
return *dpa.Spec.Configuration.Velero.DisableInformerCache
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard spec.configuration.velero before using these helpers

GetDisableInformerCache() and AutoCorrect() both dereference dpa.Spec.Configuration.Velero unconditionally. The check at Line 1032 is too late because Lines 1019, 1022, 1028, 1045, 1046, and 1051 already assume the nested pointers exist. Since ApplicationConfig.Velero is a pointer field, a partially populated object will panic here instead of defaulting safely.

🛡️ Suggested nil-guarding
 func (dpa *DataProtectionApplication) GetDisableInformerCache() bool {
-	if dpa.Spec.Configuration.Velero.DisableInformerCache == nil {
+	if dpa.Spec.Configuration == nil || dpa.Spec.Configuration.Velero == nil ||
+		dpa.Spec.Configuration.Velero.DisableInformerCache == nil {
 		return false
 	}
 	return *dpa.Spec.Configuration.Velero.DisableInformerCache
 }
 
 func (dpa *DataProtectionApplication) AutoCorrect() {
+	if dpa.Spec.Configuration == nil {
+		return
+	}
+
+	veleroConfig := dpa.Spec.Configuration.Velero
+
 	//check if CSI plugin is added in spec
-	if hasCSIPlugin(dpa.Spec.Configuration.Velero.DefaultPlugins) {
-		dpa.Spec.Configuration.Velero.FeatureFlags = append(dpa.Spec.Configuration.Velero.FeatureFlags, velero.CSIFeatureFlag)
+	if veleroConfig != nil && hasCSIPlugin(veleroConfig.DefaultPlugins) {
+		veleroConfig.FeatureFlags = append(veleroConfig.FeatureFlags, velero.CSIFeatureFlag)
 	}
-	if dpa.Spec.Configuration.Velero.RestoreResourcesVersionPriority != "" {
+	if veleroConfig != nil && veleroConfig.RestoreResourcesVersionPriority != "" {
 		// if the RestoreResourcesVersionPriority is specified then ensure feature flag is enabled for enableApiGroupVersions
 		// duplicate feature flag checks are done in ReconcileVeleroDeployment
-		dpa.Spec.Configuration.Velero.FeatureFlags = append(dpa.Spec.Configuration.Velero.FeatureFlags, velero.APIGroupVersionsFeatureFlag)
+		veleroConfig.FeatureFlags = append(veleroConfig.FeatureFlags, velero.APIGroupVersionsFeatureFlag)
 	}
 
-	if dpa.Spec.Configuration.Velero.Args != nil {
+	if veleroConfig != nil && veleroConfig.Args != nil {
 		...
 	}
 
-	dpa.Spec.Configuration.Velero.DefaultPlugins = common.RemoveDuplicateValues(dpa.Spec.Configuration.Velero.DefaultPlugins)
-	dpa.Spec.Configuration.Velero.FeatureFlags = common.RemoveDuplicateValues(dpa.Spec.Configuration.Velero.FeatureFlags)
+	if veleroConfig != nil {
+		veleroConfig.DefaultPlugins = common.RemoveDuplicateValues(veleroConfig.DefaultPlugins)
+		veleroConfig.FeatureFlags = common.RemoveDuplicateValues(veleroConfig.FeatureFlags)
+	}
 
 	if dpa.Spec.Configuration.NodeAgent != nil && dpa.Spec.Configuration.NodeAgent.PodConfig != nil && dpa.Spec.Configuration.NodeAgent.PodConfig.NodeSelector != nil {
 		...
 	}
 }

Also applies to: 1017-1064

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1alpha2/dataprotectionapplication_types.go` around lines 997 - 1003,
Guard accesses to the nested pointer fields before dereferencing: update
DataProtectionApplication.GetDisableInformerCache and AutoCorrect to first
nil-check dpa.Spec, dpa.Spec.Configuration and dpa.Spec.Configuration.Velero
(ApplicationConfig.Velero) and return safe defaults (e.g., false for
DisableInformerCache) or initialize dpa.Spec.Configuration.Velero as needed
before mutating or reading its fields; specifically, in
GetDisableInformerCache() return false if any of dpa.Spec,
dpa.Spec.Configuration, or dpa.Spec.Configuration.Velero is nil, and in
AutoCorrect() ensure you nil-check and create/assign a Velero struct when you
plan to access/modify its fields so lines that reference
dpa.Spec.Configuration.Velero do not panic.

Comment on lines +21 to 23
# [WEBHOOK] Conversion webhook for DPA CRD (v1alpha1 <-> v1alpha2)
- ../webhook
# [CERTMANAGER] To enable cert-manager, uncomment all sections with 'CERTMANAGER'. 'WEBHOOK' components are required.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

config/default now enables a conversion webhook without provisioning certs.

Lines 21-23 and 34-35 turn on the webhook resources, but this overlay still leaves cert-manager and CA injection disabled. Cross-file: config/default/manager_webhook_patch.yaml, Lines 18-23, mounts webhook-server-cert, and config/crd/patches/webhook_in_dataprotectionapplications.yaml, Lines 7-16, switches the CRD to webhook conversion. A config/default deploy can therefore fail to start the manager pod or fail webhook TLS, which breaks v1alpha2 conversion traffic.

Either enable the cert provisioning/CA injection path in this overlay as part of the same change, or keep the webhook wiring out of config/default.

Also applies to: 34-35

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/default/kustomization.yaml` around lines 21 - 23, The default
kustomization currently enables the conversion webhook without cert
provisioning; update config/default/kustomization.yaml to either remove the
webhook component lines (the two occurrences toggling ../webhook) or enable
cert-manager/CA injection in this overlay so the webhook certs are provisioned;
ensure the change aligns with the related patches that reference
webhook-server-cert in manager_webhook_patch.yaml (the mount that expects the
secret) and the CRD conversion patch in
config/crd/patches/webhook_in_dataprotectionapplications.yaml (which switches
the CRD to webhook conversion) so the manager pod can start and TLS for
conversion is satisfied.

@kaovilai
Copy link
Copy Markdown
Member

@weshayutin Reminder on why we need the new v1alpha2 API:

  1. OLMv1 on OCP 5 will require CRD upgrade safety — our CRDs will need to comply with the OLMv1 CRD upgrade safety requirements. Introducing v1alpha2 now lets us evolve the API without violating those constraints.

  2. Fewer breaking changes — by adding a new API version alongside v1alpha1 rather than modifying it in place, existing users are not broken.

  3. Follows Kubernetes CRD versioning best practices — this aligns with the official Kubernetes CRD versioning/upgrade guide. We can also add a migrator in the future when we're ready to drop v1alpha1, making the eventual deprecation path clean and well-defined.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2026
@Joeavaikath Joeavaikath marked this pull request as ready for review March 26, 2026 21:02
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2026
@openshift-ci openshift-ci bot requested a review from sseago March 26, 2026 21:02
Documents the rationale for introducing a v1alpha2 API version to
change ServerFlags duration fields from *time.Duration (integer
nanoseconds) to *metav1.Duration (human-readable strings).

Signed-off-by: Joseph <jvaikath@redhat.com>
Add +kubebuilder:storageversion marker to the DPA type and implement
the empty Hub() method required by the conversion.Hub interface.
This designates v1alpha1 as the version stored in etcd.

Signed-off-by: Joseph <jvaikath@redhat.com>
Introduce api/v1alpha2/ package with DPA types that change the 10
ServerFlags duration fields from *time.Duration to *metav1.Duration.
This allows users to specify values like "2h" or "120m" instead of
nanosecond integers.

Includes ConvertTo/ConvertFrom methods for lossless v1alpha1<->v1alpha2
conversion, webhook registration, and round-trip conversion tests.

Signed-off-by: Joseph <jvaikath@redhat.com>
Add webhook service, CRD conversion patch, and manager TLS volume
mount. Enable webhook sections in crd and default kustomizations.

Signed-off-by: Joseph <jvaikath@redhat.com>
Add v1alpha2 scheme registration, webhook setup in main.go, and
v1alpha2 DPA resource entry in PROJECT file.

Signed-off-by: Joseph <jvaikath@redhat.com>
Output of make generate and make manifests. The DPA CRD now includes
both v1alpha1 (storage, integer durations) and v1alpha2 (string
durations) versions.

Signed-off-by: Joseph <jvaikath@redhat.com>
OLM requires AllNamespaces install mode for conversion webhooks,
but OADP only supports OwnNamespace. This needs team discussion
before the bundle can be generated.

Signed-off-by: Joseph <jvaikath@redhat.com>
OLM requires AllNamespaces as the only supported install mode when
conversionCRDs are present. Change OwnNamespace to false, remove
targetNamespaces from the deploy-olm OperatorGroup, and add
getOperatorNamespace() fallback in main.go so the operator resolves
its own namespace when WATCH_NAMESPACE is empty.

Signed-off-by: Joseph <jvaikath@redhat.com>
Output of make bundle. Includes webhook service, v1alpha2 CRD schema,
and updated CSV with webhookdefinitions.

Signed-off-by: Joseph <jvaikath@redhat.com>
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/main.go (1)

337-342: ⚠️ Potential issue | 🔴 Critical

The second VMDPDownloadSetup can abort manager startup.

In the new AllNamespaces path watchNamespace is empty, so this registers a runnable with Namespace/OperatorNamespace set to "". internal/controller/vmdp_download_controller.go:31-70 immediately does a Get for the operator Deployment in OperatorNamespace, so startup fails before the manager becomes ready.

Suggested fix
-		if err := mgr.Add(&controller.VMDPDownloadSetup{
-			Client:            mgr.GetClient(),
-			Namespace:         watchNamespace,
-			OperatorName:      "openshift-adp-controller-manager",
-			OperatorNamespace: watchNamespace,
-		}); err != nil {
-			setupLog.Error(err, "unable to add VMDP download setup")
-			os.Exit(1)
-		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/main.go` around lines 337 - 342, The second VMDPDownloadSetup
registration (mgr.Add(&controller.VMDPDownloadSetup{...})) uses an empty
watchNamespace for the AllNamespaces path which results in OperatorNamespace==""
and causes the controller (internal/controller/vmdp_download_controller.go) to
fail when it immediately Get()s the operator Deployment; fix by guarding the
registration so you only call mgr.Add for that VMDPDownloadSetup when
watchNamespace is non-empty (or otherwise supply the correct operator
namespace), i.e., check watchNamespace != "" before adding the runnable so
OperatorNamespace is never registered as an empty string.
♻️ Duplicate comments (1)
api/v1alpha2/dataprotectionapplication_types.go (1)

997-1003: ⚠️ Potential issue | 🟠 Major

Guard optional spec.configuration.velero before using these helpers.

Line 999 and the dereferences from Line 1019 onward assume spec.configuration.velero is always set, but ApplicationConfig.Velero is optional in this API. A DPA that only sets nodeAgent or other non-Velero fields will panic here instead of defaulting safely.

Also applies to: 1017-1064

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1alpha2/dataprotectionapplication_types.go` around lines 997 - 1003,
GetDisableInformerCache (and the other helper getters in the 1017–1064 range on
DataProtectionApplication) assume dpa.Spec.Configuration and
dpa.Spec.Configuration.Velero are non-nil and will panic; update these helpers
to nil-check dpa.Spec, dpa.Spec.Configuration and dpa.Spec.Configuration.Velero
before dereferencing and return the appropriate safe default (false or zero
value) when any of those are nil. Specifically, modify GetDisableInformerCache
and the sibling getters to first guard on dpa.Spec != nil &&
dpa.Spec.Configuration != nil && dpa.Spec.Configuration.Velero != nil, then
return the dereferenced value only if present, otherwise return the documented
default. Ensure you update all helper functions in that block (e.g.,
GetDisableInformerCache and the methods covering lines ~1017–1064) consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/manifests/bases/oadp-operator.clusterserviceversion.yaml`:
- Around line 479-486: The CSV is advertising AllNamespaces while the binary
(cmd/main.go) still pins the manager to the operator namespace and behaves like
a single-namespace controller; update the manifest installModes entries so they
accurately reflect supported modes (set type: OwnNamespace and type:
SingleNamespace to supported: true and set type: AllNamespaces and type:
MultiNamespace to supported: false) OR alternatively implement true cluster-wide
behavior in the runtime (remove the manager pinning logic in cmd/main.go and
handle an empty watched-namespace correctly); pick one approach and make the CSV
and code consistent (refer to the install mode YAML keys OwnNamespace,
SingleNamespace, MultiNamespace, AllNamespaces and the manager pinning in
cmd/main.go).

In `@docs/design/oadp-3471-args-duration-type_design.md`:
- Around line 164-165: Update the Compatibility section to mark the install-mode
conflict between OLM-managed webhook lifecycle and operator install-modes as a
release blocker: explicitly change the wording around "OLM" (lines referencing
OLM-managed webhook lifecycle) from neutral/open issue to a gating requirement
and clearly state that the AllNamespaces vs OwnNamespace install-mode conflict
(references to AllNamespaces and OwnNamespace) must be resolved before release;
adjust the text at the paragraphs currently mentioning OLM (around the
webhook/certs sentence) and the later lines that discuss AllNamespaces vs
OwnNamespace so they state this is a release blocker and describe the required
resolution outcome.
- Line 4: Update the Jira reference in the document header so it points to
OADP-3471 instead of OADP-3379: find the line containing "Jira:
[OADP-3379](https://redhat.atlassian.net/browse/OADP-3379)" and replace the
ticket ID and URL to "OADP-3471" (e.g., "Jira:
[OADP-3471](https://redhat.atlassian.net/browse/OADP-3471)") to restore correct
traceability for this design/PR.
- Around line 155-158: Update the Security section in the conversion webhook
description to remove the blanket "No new attack surface" claim and explicitly
state that this webhook is invoked by the API server for a cluster-scoped CRD
(i.e., it may receive requests for all namespaces), enumerate RBAC expectations
(which service account(s) should be allowed to call the webhook and required
roles/permissions), and list request-hardening assumptions to be met (TLS/mutual
TLS or API server client cert validation, admission review request verification,
strict timeouts and rate limiting, input validation/sanitization, and structured
audit logging); reference the terms "conversion webhook", "cluster-scoped CRD",
and "API server" so readers can locate the relevant paragraph and update wording
accordingly.

---

Outside diff comments:
In `@cmd/main.go`:
- Around line 337-342: The second VMDPDownloadSetup registration
(mgr.Add(&controller.VMDPDownloadSetup{...})) uses an empty watchNamespace for
the AllNamespaces path which results in OperatorNamespace=="" and causes the
controller (internal/controller/vmdp_download_controller.go) to fail when it
immediately Get()s the operator Deployment; fix by guarding the registration so
you only call mgr.Add for that VMDPDownloadSetup when watchNamespace is
non-empty (or otherwise supply the correct operator namespace), i.e., check
watchNamespace != "" before adding the runnable so OperatorNamespace is never
registered as an empty string.

---

Duplicate comments:
In `@api/v1alpha2/dataprotectionapplication_types.go`:
- Around line 997-1003: GetDisableInformerCache (and the other helper getters in
the 1017–1064 range on DataProtectionApplication) assume dpa.Spec.Configuration
and dpa.Spec.Configuration.Velero are non-nil and will panic; update these
helpers to nil-check dpa.Spec, dpa.Spec.Configuration and
dpa.Spec.Configuration.Velero before dereferencing and return the appropriate
safe default (false or zero value) when any of those are nil. Specifically,
modify GetDisableInformerCache and the sibling getters to first guard on
dpa.Spec != nil && dpa.Spec.Configuration != nil &&
dpa.Spec.Configuration.Velero != nil, then return the dereferenced value only if
present, otherwise return the documented default. Ensure you update all helper
functions in that block (e.g., GetDisableInformerCache and the methods covering
lines ~1017–1064) consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ba6c8990-fd23-4564-9086-591444f06d43

📥 Commits

Reviewing files that changed from the base of the PR and between fbde3f4 and 8af0c83.

📒 Files selected for processing (23)
  • Makefile
  • PROJECT
  • api/v1alpha1/dataprotectionapplication_types.go
  • api/v1alpha1/dpa_conversion.go
  • api/v1alpha2/dataprotectionapplication_types.go
  • api/v1alpha2/dpa_conversion.go
  • api/v1alpha2/dpa_conversion_test.go
  • api/v1alpha2/dpa_webhook.go
  • api/v1alpha2/groupversion_info.go
  • api/v1alpha2/zz_generated.deepcopy.go
  • bundle/manifests/oadp-operator.clusterserviceversion.yaml
  • bundle/manifests/oadp.openshift.io_dataprotectionapplications.yaml
  • bundle/manifests/openshift-adp-webhook-service_v1_service.yaml
  • cmd/main.go
  • config/crd/bases/oadp.openshift.io_dataprotectionapplications.yaml
  • config/crd/kustomization.yaml
  • config/crd/patches/webhook_in_dataprotectionapplications.yaml
  • config/default/kustomization.yaml
  • config/default/manager_webhook_patch.yaml
  • config/manifests/bases/oadp-operator.clusterserviceversion.yaml
  • config/webhook/kustomization.yaml
  • config/webhook/service.yaml
  • docs/design/oadp-3471-args-duration-type_design.md
✅ Files skipped from review due to trivial changes (8)
  • config/webhook/kustomization.yaml
  • api/v1alpha1/dpa_conversion.go
  • config/crd/kustomization.yaml
  • config/crd/patches/webhook_in_dataprotectionapplications.yaml
  • bundle/manifests/openshift-adp-webhook-service_v1_service.yaml
  • config/webhook/service.yaml
  • api/v1alpha2/groupversion_info.go
  • api/v1alpha2/zz_generated.deepcopy.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • api/v1alpha2/dpa_webhook.go
  • config/default/kustomization.yaml
  • api/v1alpha1/dataprotectionapplication_types.go
  • api/v1alpha2/dpa_conversion.go

Comment on lines +479 to 486
- supported: false
type: OwnNamespace
- supported: false
type: SingleNamespace
- supported: false
type: MultiNamespace
- supported: false
- supported: true
type: AllNamespaces
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The CSV now advertises an install mode the controller still doesn't support.

Disabling OwnNamespace and leaving AllNamespaces as the only supported mode means OLM will drive the operator with an empty watched-namespace annotation, but the runtime still behaves like a single-namespace install (cmd/main.go pins the manager to the operator namespace and still has startup logic that assumes a concrete watched namespace). This bundle metadata now promises a mode the binary does not actually satisfy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/manifests/bases/oadp-operator.clusterserviceversion.yaml` around lines
479 - 486, The CSV is advertising AllNamespaces while the binary (cmd/main.go)
still pins the manager to the operator namespace and behaves like a
single-namespace controller; update the manifest installModes entries so they
accurately reflect supported modes (set type: OwnNamespace and type:
SingleNamespace to supported: true and set type: AllNamespaces and type:
MultiNamespace to supported: false) OR alternatively implement true cluster-wide
behavior in the runtime (remove the manager pinning logic in cmd/main.go and
handle an empty watched-namespace correctly); pick one approach and make the CSV
and code consistent (refer to the install mode YAML keys OwnNamespace,
SingleNamespace, MultiNamespace, AllNamespaces and the manager pinning in
cmd/main.go).

# Velero Server Args: Duration Field Type Fix

Date: 2026-03-26
Jira: [OADP-3379](https://redhat.atlassian.net/browse/OADP-3379)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix the Jira reference mismatch for traceability.

Line 4 points to OADP-3379, but this design/PR is for OADP-3471. This will break issue traceability in release notes and audits.

Proposed doc fix
-Jira: [OADP-3379](https://redhat.atlassian.net/browse/OADP-3379)
+Jira: [OADP-3471](https://redhat.atlassian.net/browse/OADP-3471)

As per coding guidelines, "**: Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Jira: [OADP-3379](https://redhat.atlassian.net/browse/OADP-3379)
Jira: [OADP-3471](https://redhat.atlassian.net/browse/OADP-3471)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/oadp-3471-args-duration-type_design.md` at line 4, Update the
Jira reference in the document header so it points to OADP-3471 instead of
OADP-3379: find the line containing "Jira:
[OADP-3379](https://redhat.atlassian.net/browse/OADP-3379)" and replace the
ticket ID and URL to "OADP-3471" (e.g., "Jira:
[OADP-3471](https://redhat.atlassian.net/browse/OADP-3471)") to restore correct
traceability for this design/PR.

Comment on lines +155 to +158
No security impact.
The conversion webhook runs within the operator pod and handles only type conversion of duration values.
No new attack surface is introduced.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Security section understates the webhook risk profile.

The “No new attack surface” claim is inaccurate for a conversion webhook on a cluster-scoped CRD. The API server can call it for requests across namespaces, so the doc should explicitly call out scope, RBAC expectations, and request hardening assumptions.

Proposed doc fix
-## Security Considerations
-
-No security impact.
-The conversion webhook runs within the operator pod and handles only type conversion of duration values.
-No new attack surface is introduced.
+## Security Considerations
+
+This introduces a conversion webhook for a cluster-scoped CRD, so webhook endpoints are part of the API-serving path for conversion requests.
+While conversion logic is type-only (duration translation), the webhook must be treated as security-sensitive control-plane traffic:
+- enforce least-privilege RBAC for webhook/service resources,
+- keep TLS/cert rotation managed and monitored,
+- validate/defensively handle unexpected payloads and conversion failures.
+Operationally, this also intersects with install-mode constraints documented below.

As per coding guidelines, "**: Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
No security impact.
The conversion webhook runs within the operator pod and handles only type conversion of duration values.
No new attack surface is introduced.
## Security Considerations
This introduces a conversion webhook for a cluster-scoped CRD, so webhook endpoints are part of the API-serving path for conversion requests.
While conversion logic is type-only (duration translation), the webhook must be treated as security-sensitive control-plane traffic:
- enforce least-privilege RBAC for webhook/service resources,
- keep TLS/cert rotation managed and monitored,
- validate/defensively handle unexpected payloads and conversion failures.
Operationally, this also intersects with install-mode constraints documented below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/oadp-3471-args-duration-type_design.md` around lines 155 - 158,
Update the Security section in the conversion webhook description to remove the
blanket "No new attack surface" claim and explicitly state that this webhook is
invoked by the API server for a cluster-scoped CRD (i.e., it may receive
requests for all namespaces), enumerate RBAC expectations (which service
account(s) should be allowed to call the webhook and required
roles/permissions), and list request-hardening assumptions to be met (TLS/mutual
TLS or API server client cert validation, admission review request verification,
strict timeouts and rate limiting, input validation/sanitization, and structured
audit logging); reference the terms "conversion webhook", "cluster-scoped CRD",
and "API server" so readers can locate the relevant paragraph and update wording
accordingly.

Comment on lines +164 to +165
- **OLM**: Webhook lifecycle (certs, registration) is managed by OLM on OpenShift.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Compatibility section should mark install-mode conflict as a release blocker.

Line 164-165 implies OLM handling is straightforward, but Line 178-179 already identifies an AllNamespaces vs OwnNamespace conflict. This should be elevated from “open issue” wording to explicit gating criteria, otherwise the design reads as deployable when it may not be.

Proposed doc fix
-- **OLM**: Webhook lifecycle (certs, registration) is managed by OLM on OpenShift.
+- **OLM**: Webhook lifecycle (certs, registration) is managed by OLM on OpenShift, but conversion CRDs introduce install-mode constraints that must be resolved before release.
@@
-- **OLM install mode conflict**: OLM requires `AllNamespaces` install mode when the CSV declares `conversionCRDs` in `webhookdefinitions`. OADP currently only supports `OwnNamespace`. Switching to `AllNamespaces` doesn't change the operator's runtime behavior (it still watches a single namespace via `watchNamespace`), but it changes what OLM allows users to select at install time. This needs team discussion before proceeding — it may affect how the operator is presented in the OpenShift console and could have implications for multi-tenancy expectations.
+- **Release blocker — OLM install mode conflict**: OLM requires `AllNamespaces` install mode when the CSV declares `conversionCRDs` in `webhookdefinitions`. OADP currently supports `OwnNamespace` assumptions in RBAC/NonAdmin/controller behavior. A concrete resolution and rollout plan (including UX and tenancy impact) is required before shipping v1alpha2 conversion.

As per coding guidelines, "**: Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

Also applies to: 178-179

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/oadp-3471-args-duration-type_design.md` around lines 164 - 165,
Update the Compatibility section to mark the install-mode conflict between
OLM-managed webhook lifecycle and operator install-modes as a release blocker:
explicitly change the wording around "OLM" (lines referencing OLM-managed
webhook lifecycle) from neutral/open issue to a gating requirement and clearly
state that the AllNamespaces vs OwnNamespace install-mode conflict (references
to AllNamespaces and OwnNamespace) must be resolved before release; adjust the
text at the paragraphs currently mentioning OLM (around the webhook/certs
sentence) and the later lines that discuss AllNamespaces vs OwnNamespace so they
state this is a release blocker and describe the required resolution outcome.

@Joeavaikath Joeavaikath marked this pull request as draft March 27, 2026 12:38
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 27, 2026
Documents why conversion webhooks require AllNamespaces mode,
what changes and what doesn't, pros/cons, and alternatives
considered. Intended for team discussion before proceeding.

Signed-off-by: Joseph <jvaikath@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants