ENG-2537: Add property_ids to DatasetConfig and ManualTaskConfig #7361

Merged
JadeCara merged 17 commits into main from ENG-2537/assign-properties-to-datasets on Feb 13, 2026

@JadeCara JadeCara commented Feb 11, 2026

Ticket ENG-2537

Description Of Changes

Add property_ids ARRAY(String) column to datasetconfig and manual_task_config tables, enabling datasets to be tagged with properties for future property-based DAG filtering (Ticket 2). This is a data model + schema change only — no runtime DSR behavior changes.

Key design decisions:

  • ARRAY(String) column (not join table) — low cardinality, matches existing patterns (Client.scopes, StagedResource.classifications)
  • server_default='{}' — empty array = universal access, backward compatible, no bootstrap needed
  • GIN index on datasetconfig.property_ids for efficient array containment queries
  • DatasetConfigCtlDataset schema accepts optional property_ids for PATCH pass-through
  • Property IDs validated against plus_property table on PATCH — invalid IDs rejected with clear error
  • Existing update paths (e.g. SaaS dataset config updates) do not touch property_ids — only fields in the data dict are set
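
The "empty array = universal access" rule can be illustrated with a small sketch. This is only an assumption about how the future property-based filtering (Ticket 2) might read the column; `DatasetRow` and `visible_for_property` are hypothetical names that do not appear in this PR, and the property IDs are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRow:
    """Hypothetical stand-in for a datasetconfig row (not the real model)."""
    fides_key: str
    property_ids: List[str] = field(default_factory=list)


def visible_for_property(rows: List[DatasetRow], property_id: str) -> List[str]:
    """Return fides_keys of rows that are untagged (universal) or tagged with property_id.

    A comparable Postgres predicate (an assumption, not taken from the PR) would be:
        property_ids = '{}' OR property_ids @> ARRAY['<property_id>']::varchar[]
    The second half is the array containment check that a GIN index can serve.
    """
    return [
        row.fides_key
        for row in rows
        if not row.property_ids or property_id in row.property_ids
    ]


rows = [
    DatasetRow("legacy_dataset"),                  # existing row, column defaults to []
    DatasetRow("site_a_dataset", ["FDS-AAAAAA"]),  # made-up property IDs
    DatasetRow("site_b_dataset", ["FDS-BBBBBB"]),
]
print(visible_for_property(rows, "FDS-AAAAAA"))
```

Under this reading, rows that existed before the migration keep appearing for every property, which is why no bootstrap step is needed.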

Note: Companion Fidesplus PR fidesplus#3087 adds bulk-assign/remove API endpoints, property deletion cascade, and tests.

Code Changes

  • src/fides/api/models/datasetconfig.py - Add property_ids ARRAY(String) column + GIN index
  • src/fides/api/models/manual_task/manual_task.py - Add property_ids ARRAY(String) column
  • src/fides/api/schemas/dataset.py - Add optional property_ids to DatasetConfigCtlDataset
  • src/fides/service/dataset/dataset_config_service.py - Pass through property_ids when provided in PATCH, validate IDs exist
  • src/fides/api/alembic/migrations/versions/xx_2026_02_11_1840_...py - Auto-generated migration

Steps to Confirm

All API calls can be made via the FastAPI docs page at http://localhost:8080/docs.
Please test with the companion Fidesplus PR, ENG-2537/assign-properties-to-datasets. Run Fidesplus from that PR, pointed at this Fides PR.

Prerequisites: Create test data

  1. Create a connection config:

    PATCH /api/v1/connection

    [{"name": "Verify Props Connection", "key": "verify_props_conn", "connection_type": "postgres", "access": "read"}]
  2. Create a ctl_dataset:

    POST /api/v1/dataset

    {
      "fides_key": "verify_props_dataset",
      "organization_fides_key": "default_organization",
      "name": "Verify Props Dataset",
      "description": "Test dataset for property verification",
      "collections": [{
        "name": "users",
        "fields": [{"name": "id", "data_categories": ["system.operations"], "fides_meta": {"primary_key": true}}]
      }]
    }
  3. Create a property (for step 5 — requires Fidesplus):

    POST /api/v1/plus/property

    {"name": "Verify Test Property", "type": "website"}

    Note the returned id (e.g. FDS-XXXXXX) for step 5.

1. Verify upgrade migration

SELECT column_name, data_type, column_default
FROM information_schema.columns
WHERE table_name = 'datasetconfig' AND column_name = 'property_ids';
-- Expect: property_ids | ARRAY | '{}'::character varying[]

SELECT column_name, data_type, column_default
FROM information_schema.columns
WHERE table_name = 'manual_task_config' AND column_name = 'property_ids';
-- Expect: property_ids | ARRAY | '{}'::character varying[]

2. Verify GIN index

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'datasetconfig' AND indexname = 'ix_datasetconfig_property_ids_gin';
-- Expect: one row with USING gin (property_ids)

3. PATCH without property_ids — existing behavior unchanged

PATCH /api/v1/connection/verify_props_conn/datasetconfig

[{"fides_key": "verify_props_dataset", "ctl_dataset_fides_key": "verify_props_dataset"}]

Expect: 200 with dataset in succeeded array. property_ids defaults to [] and is not modified.
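
The reason property_ids stays untouched here is that update paths only set the fields present in the data dict. A minimal sketch of that behavior, assuming a simple attribute-setting update; `DatasetConfigSketch` and `apply_update` are hypothetical names, not the service's actual code:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DatasetConfigSketch:
    """Hypothetical stand-in for a DatasetConfig record."""
    fides_key: str
    ctl_dataset_fides_key: str
    property_ids: List[str] = field(default_factory=list)


def apply_update(config: DatasetConfigSketch, data: Dict[str, Any]) -> DatasetConfigSketch:
    """Set only the attributes named in data; anything omitted keeps its current value."""
    for key, value in data.items():
        setattr(config, key, value)
    return config


existing = DatasetConfigSketch(
    "verify_props_dataset", "verify_props_dataset", ["FDS-XXXXXX"]
)

# A PATCH body without property_ids leaves the stored tags alone.
apply_update(existing, {"ctl_dataset_fides_key": "verify_props_dataset"})
print(existing.property_ids)

# A PATCH body that includes property_ids replaces them.
apply_update(existing, {"property_ids": []})
print(existing.property_ids)
```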

4. PATCH with invalid property ID — rejected

PATCH /api/v1/connection/verify_props_conn/datasetconfig

[{"fides_key": "verify_props_dataset", "ctl_dataset_fides_key": "verify_props_dataset", "property_ids": ["FDS-FAKE123"]}]

Expect: 200 response with dataset in failed array, message containing "Unknown property IDs: ['FDS-FAKE123']".
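
The rejection in this step can be pictured as a set difference between the requested IDs and the IDs that exist. The sketch below is an illustration only: `validate_property_ids` and the local `ValidationError` class are hypothetical, and the set of known IDs stands in for whatever lookup the service performs against the plus_property table.

```python
from typing import Iterable, List, Set


class ValidationError(ValueError):
    """Local stand-in for the project's ValidationError (assumed, not imported)."""


def validate_property_ids(requested: Iterable[str], known_ids: Set[str]) -> List[str]:
    """Return the requested IDs unchanged, or raise if any of them is unknown."""
    requested_list = list(requested)
    unknown = sorted(set(requested_list) - known_ids)
    if unknown:
        raise ValidationError(f"Unknown property IDs: {unknown}")
    return requested_list


known = {"FDS-AAAAAA"}  # made-up ID standing in for rows in plus_property

try:
    validate_property_ids(["FDS-FAKE123"], known)
except ValidationError as exc:
    # In the bulk PATCH flow, an error like this is what lands in the failed array.
    print(exc)
```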

5. PATCH with valid property ID — persisted

PATCH /api/v1/connection/verify_props_conn/datasetconfig

[{"fides_key": "verify_props_dataset", "ctl_dataset_fides_key": "verify_props_dataset", "property_ids": ["<PROPERTY_ID_FROM_STEP_3>"]}]

Expect: 200 with dataset in succeeded array.

Verify in the database:

SELECT fides_key, property_ids FROM datasetconfig WHERE fides_key = 'verify_props_dataset';
-- Expect: property_ids = {<PROPERTY_ID>}
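
The succeeded/failed expectations in steps 3 to 5 can be checked with a couple of small helpers. This is a sketch under an assumed response shape: a JSON object with `succeeded` and `failed` arrays, where each failed entry carries a `message`. The sample payloads are hand-written for illustration, not captured API output.

```python
from typing import Any, Dict, List


def succeeded_keys(bulk_response: Dict[str, Any]) -> List[str]:
    """Collect the fides_key of every dataset reported as succeeded."""
    return [item["fides_key"] for item in bulk_response.get("succeeded", [])]


def failed_messages(bulk_response: Dict[str, Any]) -> List[str]:
    """Collect the error message of every dataset reported as failed."""
    return [item.get("message", "") for item in bulk_response.get("failed", [])]


# Hand-written examples of what steps 4 and 5 are expected to return.
step_4_response = {
    "succeeded": [],
    "failed": [
        {
            "message": "Unknown property IDs: ['FDS-FAKE123']",
            "data": {"fides_key": "verify_props_dataset"},
        }
    ],
}
step_5_response = {
    "succeeded": [{"fides_key": "verify_props_dataset"}],
    "failed": [],
}

print(failed_messages(step_4_response))
print(succeeded_keys(step_5_response))
```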

6. Downgrade migration

# Inside the fides container:
cd /fides/src/fides/api/alembic && python -m alembic downgrade f85bd4c08401

Then verify in the database:

SELECT column_name FROM information_schema.columns
WHERE table_name = 'datasetconfig' AND column_name = 'property_ids';
-- Expect: 0 rows

SELECT column_name FROM information_schema.columns
WHERE table_name = 'manual_task_config' AND column_name = 'property_ids';
-- Expect: 0 rows

SELECT indexname FROM pg_indexes
WHERE tablename = 'datasetconfig' AND indexname = 'ix_datasetconfig_property_ids_gin';
-- Expect: 0 rows

Note: Step 5 (valid property ID) requires Fidesplus for the property creation endpoint. If testing with Fides OSS only, step 4 still validates that invalid IDs are rejected.

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • No UX review needed
  • Followup issues:
    • Followup issues created (ENG-2537 Ticket 2: Property-Based DAG Filtering)
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
  • Documentation:
    • No documentation updates required

Add ARRAY(String) property_ids column to datasetconfig and
manual_task_config tables with GIN index for efficient array queries.
Update DatasetConfigCtlDataset schema and service to pass through
property_ids on PATCH. Empty array = universal access (backward
compatible, no bootstrap needed).

Co-authored-by: Cursor <cursoragent@cursor.com>

Jade Wibbels and others added 3 commits February 11, 2026 12:11
@JadeCara JadeCara marked this pull request as ready for review February 11, 2026 22:18
@JadeCara JadeCara requested a review from a team as a code owner February 11, 2026 22:18
@JadeCara JadeCara requested review from galvana and removed request for a team February 11, 2026 22:18
@greptile-apps

greptile-apps bot commented Feb 11, 2026

Greptile Overview

Greptile Summary

This PR adds property_ids ARRAY(String) columns to datasetconfig and manual_task_config tables to enable property-based dataset scoping for future DAG filtering.

Key changes:

  • Added property_ids column to both tables with server_default='{}' for backward compatibility
  • Created GIN index on datasetconfig.property_ids for efficient array containment queries
  • Added optional property_ids field to DatasetConfigCtlDataset schema for PATCH pass-through
  • Service layer properly passes through property_ids when provided
  • Updated database dataset metadata and changelog with db-migration label

Implementation follows existing patterns:

  • Matches Client.scopes and StagedResource.classifications patterns for ARRAY(String) columns
  • Consistent with PrivacyRequest.property_id validation approach (validation in Fidesplus)
  • Uses nullable=False, server_default='{}', default=dict for safe defaults

This is a data model change only with no runtime behavior changes.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Schema-only change following established patterns. Migration has correct downrev, proper defaults ensure backward compatibility, GIN index enables future performance optimization, and validation will be handled in companion Fidesplus PR. No runtime behavior changes.
  • No files require special attention

Important Files Changed

Filename - Overview

  • src/fides/api/alembic/migrations/versions/xx_2026_02_11_1840_c0dc13ad2a05_add_property_ids_to_datasetconfig_and_.py - Adds property_ids ARRAY(String) column to datasetconfig and manual_task_config tables with GIN index on datasetconfig
  • src/fides/api/models/datasetconfig.py - Adds property_ids column and GIN index to DatasetConfig model, following existing patterns for array columns
  • src/fides/api/models/manual_task/manual_task.py - Adds property_ids column to ManualTaskConfig model matching DatasetConfig implementation
  • src/fides/service/dataset/dataset_config_service.py - Adds logic to pass through property_ids from schema to model when provided in PATCH requests


@greptile-apps greptile-apps bot left a comment

7 files reviewed, no comments

Jade Wibbels and others added 3 commits February 11, 2026 15:49
Prevents storing invalid property IDs through the dataset-configs
PATCH endpoint. Uses existing ValidationError handling to surface
errors in the bulk response.

Co-authored-by: Cursor <cursoragent@cursor.com>

@galvana galvana left a comment

Just some minor typing updates

@galvana galvana self-requested a review February 13, 2026 19:36
@JadeCara JadeCara enabled auto-merge February 13, 2026 19:41
@JadeCara JadeCara added this pull request to the merge queue Feb 13, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 13, 2026
@JadeCara JadeCara added this pull request to the merge queue Feb 13, 2026
Merged via the queue into main with commit 1e54331 Feb 13, 2026
54 checks passed
@JadeCara JadeCara deleted the ENG-2537/assign-properties-to-datasets branch February 13, 2026 21:49