chore: merge upstream datahub-project/master into acryldata/master (76 commits) #388
Co-authored-by: Mihai Ciocirdel <mihai.ciocirdel1@swisscom.com> Co-authored-by: Devashish Chandra <devashish2203@users.noreply.github.com> Co-authored-by: mihai103 <mihai103@noreply.com>
…ustom artifact server (datahub-project#17381) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
datahub-project#17398) Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…hub-project#17403) Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Jay <jayacryl@users.noreply.github.com>
…project#17410) Co-authored-by: Claude <noreply@anthropic.com>
…project#17408) Co-authored-by: Claude <noreply@anthropic.com>
…eps (datahub-project#17422) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…llback When updating labels on an existing issue, Linear rejects the mutation if the new ref label (e.g. v1.0.1-cloud) is in the same label group as an existing ref label (e.g. v1.0.0-cloud). On collision, create/reuse a team-scoped label with a -sec suffix to avoid the workspace group constraint. Protects both the existing-issue update path and the new-issue create path. Co-authored-by: Cursor <cursoragent@cursor.com>
…estionSourceSchedule (datahub-project#17416)
datahub-project#17433) Co-authored-by: Cursor <cursoragent@cursor.com>
datahub-project#17431) Co-authored-by: Cursor <cursoragent@cursor.com>
…ectors and field-level transform support (datahub-project#16515)
…Snowflake, Databricks, BigQuery, Redshift, and Postgres (datahub-project#17396) Co-authored-by: Claude <noreply@anthropic.com>
…lumn-path.spec) test (datahub-project#17438)
datahub-project#17430) Co-authored-by: Cursor <cursoragent@cursor.com>
datahub-project#17493) Co-authored-by: Cursor <cursoragent@cursor.com>
Brings in 76 upstream commits from datahub-project/datahub master.

Conflict resolution:
- .github/workflows/docker-unified.yml: kept Acryl's `on: workflow_dispatch` override (the explicit "DO NOT OVERWRITE THIS CHANGE IN MERGES" banner). Upstream's new `playwright_shard_count` input was dropped from the inputs declaration; the one reference to it uses `|| '4'` as a default, so the Playwright matrix step still works on manual dispatch.

All other upstream additions to that file (~285 lines: Playwright E2E job, smoke profile resolution from PR labels, DEPOT_PROJECT_ID parameterization) auto-merged and were preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| - name: Build Playwright shard matrix | ||
| id: set-playwright-matrix | ||
| if: ${{ github.event_name != 'pull_request' && (steps.ci-optimize.outputs.backend-change == 'true' || steps.ci-optimize.outputs.frontend-change == 'true' || steps.ci-optimize.outputs.playwright-change == 'true') }} | ||
| run: | |
🚫 [actionlint] reported by reviewdog 🐶
property "playwright_shard_count" is not defined in object type {} [expression]
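The actionlint finding follows from the merge: the `playwright_shard_count` input is no longer declared, so the linter sees an empty inputs object, while at runtime GitHub Actions evaluates a missing property as an empty value and `||` falls through to `'4'`. A minimal Python sketch of that fallback, assuming a hypothetical matrix shape (the real step's script is not shown here, and `resolve_shard_count` and `build_shard_matrix` are illustrative names):

```python
import json


def resolve_shard_count(inputs: dict, default: str = "4") -> int:
    """Model `github.event.inputs.playwright_shard_count || '4'`.

    A missing or empty input is falsy, so the default is used instead.
    """
    return int(inputs.get("playwright_shard_count") or default)


def build_shard_matrix(shard_count: int) -> str:
    """Return a JSON matrix with one entry per shard.

    The {"include": [...]} layout is an assumption for illustration only.
    """
    shards = [{"shard": i, "total": shard_count} for i in range(1, shard_count + 1)]
    return json.dumps({"include": shards})


# Manual dispatch without the input declared: the fallback yields 4 shards.
print(build_shard_matrix(resolve_shard_count({})))
```

The lint error and the runtime behavior are therefore separate concerns: the workflow still runs, but the expression stays flagged until the input is declared again or the reference is removed.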
| """Return ``(topic_id, partition_count, default_content_type_id)`` or None.""" | ||
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f"SELECT id, partition_count, default_content_type_id FROM {self._topic} WHERE topic_name = %s", |
Potential SQL injection via string-based query concatenation - critical severity
SQL injection might be possible in these locations, especially if the strings being concatenated are controlled via user input.
Remediation: If possible, rebuild the query to use prepared statements or an ORM. If that is not possible, make sure the user input is verified or sanitized. As an added layer of protection, we also recommend installing a WAF that blocks SQL injection attacks.
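In the flagged queries the values are already bound through `%s` placeholders; what the f-string interpolates is a table name, which cannot be passed as a bound parameter. If those names come from configuration rather than user input, one possible mitigation is to validate them against a strict identifier pattern before interpolation. A minimal sketch, assuming plain or schema-qualified unquoted identifiers (`safe_table_name` and `topic_lookup_query` are hypothetical helpers, not part of the PR):

```python
import re

# Assumed allowlist: letters, digits, and underscores, not starting with a digit.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_table_name(name: str) -> str:
    """Return ``name`` unchanged if it is ``table`` or ``schema.table``.

    Raises ValueError for anything else (quotes, spaces, semicolons, ...).
    """
    parts = name.split(".")
    if not 1 <= len(parts) <= 2 or not all(_IDENT.fullmatch(p) for p in parts):
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def topic_lookup_query(topic_table: str) -> str:
    # Only the validated identifier is interpolated; the value stays a %s placeholder.
    return f"SELECT id FROM {safe_table_name(topic_table)} WHERE topic_name = %s"


print(topic_lookup_query("pgqueue.topic"))
```

If the driver is psycopg2 or psycopg 3, its `sql.Identifier` composable is an alternative that quotes identifiers instead of rejecting them.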
| def _ensure_mime_registered(self, conn: PGConnection, mime: str) -> int: | ||
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f"INSERT INTO {self._content_type} (mime) VALUES (%s) ON CONFLICT (mime) DO NOTHING", |
Potential SQL injection via string-based query concatenation - critical severity
| (mime,), | ||
| ) | ||
| cur.execute( | ||
| f"SELECT id FROM {self._content_type} WHERE mime = %s", |
Potential SQL injection via string-based query concatenation - critical severity
| ), | ||
| ) | ||
| cur.execute( | ||
| f"SELECT id FROM {self._topic} WHERE topic_name = %s", |
Potential SQL injection via string-based query concatenation - critical severity
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f""" | ||
| SELECT offset_value FROM {self._consumer_offset} |
Potential SQL injection via string-based query concatenation - critical severity
| cur.execute( | ||
| f""" | ||
| SELECT m.id, m.enqueued_at, m.topic_id, m.partition_id, m.enqueue_seq | ||
| FROM {self._message} m |
Potential SQL injection via string-based query concatenation - critical severity
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f""" | ||
| INSERT INTO {lease} |
Potential SQL injection via string-based query concatenation - critical severity
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f""" | ||
| SELECT m.priority, m.payload, {ctype_expr} AS content_type, |
Potential SQL injection via string-based query concatenation - critical severity
| delete_params.extend([h.id, h.enqueued_at]) | ||
| cur.execute( | ||
| f""" | ||
| DELETE FROM {self._lease} |
Potential SQL injection via string-based query concatenation - critical severity
| for part_id, seq in pmap.items(): | ||
| cur.execute( | ||
| f""" | ||
| INSERT INTO {self._consumer_offset} AS co |
Potential SQL injection via string-based query concatenation - critical severity
| params.extend([consumer_group, lock_owner]) | ||
| cur.execute( | ||
| f""" | ||
| UPDATE {self._lease} AS l |
Potential SQL injection via string-based query concatenation - critical severity
| with conn.cursor() as cur: | ||
| cur.execute( | ||
| f""" | ||
| INSERT INTO {self._topic} AS ptopic |
Potential SQL injection via string-based query concatenation - critical severity
| cur.execute( | ||
| f""" | ||
| SELECT partition_id, COALESCE(MAX(enqueue_seq), 0) | ||
| FROM {self._message} |
Potential SQL injection via string-based query concatenation - critical severity
Bundle Report
Changes will increase total bundle size by 77.08kB (0.34%) ⬆️. This is within the configured threshold ✅
Affected bundle: datahub-react-web-esm
david-leifker left a comment
Workflow changes around profile name seem ok to me. I'd let others speak for playwright, etc
Summary
Routine upstream sync: merging 76 commits from datahub-project/datahub master into acryldata/datahub master.

Conflict resolution (`.github/workflows/docker-unified.yml`)
- Kept Acryl's `on: workflow_dispatch` override (the explicit "DO NOT OVERWRITE THIS CHANGE IN MERGES" banner). Dropped upstream's new `playwright_shard_count` input declaration; the single reference to it (`github.event.inputs.playwright_shard_count || '4'`) has a working fallback.
- Preserved the other upstream additions: Playwright E2E job, smoke profile resolution from `smoke:<configKey>` PR labels, `DEPOT_PROJECT_ID` parameterization via `vars.DEPOT_PROJECT_ID`.
- … `on:` trigger gutting and the `DOCKER_REGISTRY: "acryldata-do-not-publish"` env block).

Latent issue flagged (pre-existing, not introduced here)
Line ~645 in `docker-unified.yml` references `env.PROFILE_NAME`, but Acryl's existing override deleted that env var, so the value is empty on manual dispatch. This is the same behavior as before this merge; it is worth a follow-up for whoever owns the workflow override.

Upstream commits included
Highlights from the 76 commits (full list via `git log master..sync/upstream-merge-2026-05-19`):
- feat(upgrade): re-apply retention policies on system-update (datahub-project/datahub#17493)
- fix(ingest/snowflake): allow overriding Snowsight base URL for private link (datahub-project/datahub#17502)
- fix(security): home page templates/module scope hijacking (datahub-project/datahub#17487)
- fix(security): validate URLs before rendering links (datahub-project/datahub#17492)
- fix(ui): directing users to SSO from signup link (datahub-project/datahub#17492)
- feat(ingestion/airbyte): Airbyte connector (datahub-project/datahub#13217)
- feat(ingest/unity): opt-in Databricks Unity Catalog Metric View support
- feat(pgqueue): metadata-ingestion sink and actions pg_queue event source
- feat(k8s): KEDA-aware scaling support for KubernetesController
- ci(playwright): integrate Playwright E2E tests into docker-unified pipeline (datahub-project/datahub#17361)
- feat(ci): drive unified smoke builds from `smoke:<configKey>` PR labels (datahub-project/datahub#17470)

Test plan
- `workflow_dispatch` of `docker-unified.yml` (if needed for verification) still behaves as before

🤖 Generated with Claude Code