chore: merge upstream datahub-project/master into acryldata/master (76 commits) #388

Merged
treff7es merged 77 commits into master from sync/upstream-merge-2026-05-19 on May 19, 2026

@treff7es
Collaborator

Summary

Routine upstream sync that merges 76 commits from datahub-project/datahub master into acryldata/datahub master.

  • Conflicts: 1 (.github/workflows/docker-unified.yml)
  • Conflict resolution: Kept Acryl's on: workflow_dispatch override (the explicit "DO NOT OVERWRITE THIS CHANGE IN MERGES" banner). Dropped upstream's new playwright_shard_count input declaration; the single reference to it (github.event.inputs.playwright_shard_count || '4') has a working fallback.
  • Auto-merged in the same file (~285 lines, preserved): Playwright E2E test job + shard matrix, smoke-test profile resolution from smoke:<configKey> PR labels, DEPOT_PROJECT_ID parameterization via vars.DEPOT_PROJECT_ID.
  • Both Acryl overrides are intact: the pared-down on: trigger and the DOCKER_REGISTRY: "acryldata-do-not-publish" env block.
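
The fallback mentioned in the conflict-resolution bullet can be modeled in a few lines. This is an illustrative Python sketch, not the workflow's actual evaluator. It assumes that a reference to an undeclared input resolves to an empty value and that `||` yields its right operand when the left one is falsy. The helper name `resolve_shard_count` is hypothetical.

```python
from typing import Mapping, Optional


def resolve_shard_count(inputs: Mapping[str, Optional[str]], default: str = "4") -> str:
    """Model `github.event.inputs.playwright_shard_count || '4'`.

    Hypothetical helper: `inputs` stands in for github.event.inputs.
    A missing key or an empty string falls through to the default,
    mirroring how `||` skips a falsy left operand.
    """
    value = inputs.get("playwright_shard_count")
    return value if value else default


if __name__ == "__main__":
    # Input declaration dropped (this merge): the key never appears.
    print(resolve_shard_count({}))
    # Input declared and supplied, as in upstream's workflow.
    print(resolve_shard_count({"playwright_shard_count": "8"}))
```

Under these assumptions the Playwright matrix step receives a usable shard count on manual dispatch even though the input is no longer declared.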

Latent issue flagged (pre-existing, not introduced here)

Line ~645 in docker-unified.yml references env.PROFILE_NAME, but Acryl's existing override deleted that env var, so the reference resolves to an empty value on manual dispatch. The behavior is the same as before this merge, but it is worth a follow-up for whoever owns the workflow override.

Upstream commits included

Highlights from the 76 commits (full list via git log master..sync/upstream-merge-2026-05-19):

Test plan

  • CI passes on this branch
  • Spot-check that Acryl-specific docker-unified.yml overrides still appear in the workflow file at HEAD
  • Manual workflow_dispatch of docker-unified.yml (if needed for verification) still behaves as before
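
The spot-check in the test plan could be scripted roughly as follows. This is a sketch, not part of the PR. The marker strings are taken from the summary above, while the function name `missing_overrides` and the choice to match on raw text rather than parse the YAML are assumptions.

```python
from pathlib import Path
from typing import List

# Marker strings quoted in the PR summary; matching on raw text is an assumption.
OVERRIDE_MARKERS = [
    "workflow_dispatch",
    "DO NOT OVERWRITE THIS CHANGE IN MERGES",
    "acryldata-do-not-publish",
]


def missing_overrides(workflow_text: str) -> List[str]:
    """Return the markers that do not appear in the workflow text."""
    return [marker for marker in OVERRIDE_MARKERS if marker not in workflow_text]


if __name__ == "__main__":
    path = Path(".github/workflows/docker-unified.yml")
    if path.exists():
        missing = missing_overrides(path.read_text(encoding="utf-8"))
        print("missing overrides:", missing or "none")
    else:
        print(f"{path} not found; run from the repository root")
```

A text match only shows that the markers are present somewhere in the file. It does not confirm that they sit in the right YAML keys, so a manual read of the `on:` and `env:` blocks is still advisable.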

🤖 Generated with Claude Code

askumar27 and others added 30 commits May 12, 2026 09:10
Co-authored-by: Mihai Ciocirdel <mihai.ciocirdel1@swisscom.com>
Co-authored-by: Devashish Chandra <devashish2203@users.noreply.github.com>
Co-authored-by: mihai103 <mihai103@noreply.com>
…ustom artifact server (datahub-project#17381)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…hub-project#17403)

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Jay <jayacryl@users.noreply.github.com>
…eps (datahub-project#17422)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…llback

When updating labels on an existing issue, Linear rejects the mutation if
the new ref label (e.g. v1.0.1-cloud) is in the same label group as an
existing ref label (e.g. v1.0.0-cloud). On collision, create/reuse a
team-scoped label with a -sec suffix to avoid the workspace group constraint.

Protects both the existing-issue update path and the new-issue create path.

Co-authored-by: Cursor <cursoragent@cursor.com>
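
The collision fallback described in the commit message above can be sketched as a pure function. This is a hypothetical model that calls no Linear API: the function name `pick_ref_label`, the `group_of` mapping, and the choice to append the suffix to the new label's name are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def pick_ref_label(
    new_label: str,
    existing_labels: Iterable[str],
    group_of: Dict[str, Optional[str]],
    suffix: str = "-sec",
) -> str:
    """Return the label name to attach, using a suffixed name on collision.

    Hypothetical model: `group_of` maps a label name to its label-group
    name (None when the label is ungrouped).
    """
    new_group = group_of.get(new_label)
    if new_group is None:
        # Ungrouped labels cannot collide within a group.
        return new_label
    for label in existing_labels:
        if label != new_label and group_of.get(label) == new_group:
            # Another label from the same group is already on the issue.
            return new_label + suffix
    return new_label


if __name__ == "__main__":
    groups = {"v1.0.0-cloud": "ref", "v1.0.1-cloud": "ref"}
    print(pick_ref_label("v1.0.1-cloud", ["v1.0.0-cloud"], groups))
    print(pick_ref_label("v1.0.1-cloud", [], groups))
```

Keeping the decision separate from the API calls would let the same check serve both the update path and the create path that the commit message mentions.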
…Snowflake, Databricks, BigQuery, Redshift, and Postgres (datahub-project#17396)

Co-authored-by: Claude <noreply@anthropic.com>
lakshay-nasa and others added 7 commits May 19, 2026 18:51
Brings in 76 upstream commits from datahub-project/datahub master.

Conflict resolution:
- .github/workflows/docker-unified.yml: kept Acryl's `on: workflow_dispatch`
  override (the explicit "DO NOT OVERWRITE THIS CHANGE IN MERGES" banner).
  Upstream's new `playwright_shard_count` input was dropped from the
  inputs declaration; the one reference to it uses `|| '4'` as a default,
  so the Playwright matrix step still works on manual dispatch.

All other upstream additions to that file (~285 lines: Playwright E2E
job, smoke profile resolution from PR labels, DEPOT_PROJECT_ID
parameterization) auto-merged and were preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- name: Build Playwright shard matrix
  id: set-playwright-matrix
  if: ${{ github.event_name != 'pull_request' && (steps.ci-optimize.outputs.backend-change == 'true' || steps.ci-optimize.outputs.frontend-change == 'true' || steps.ci-optimize.outputs.playwright-change == 'true') }}
  run: |

🚫 [actionlint] reported by reviewdog 🐶
property "playwright_shard_count" is not defined in object type {} [expression]

"""Return ``(topic_id, partition_count, default_content_type_id)`` or None."""
with conn.cursor() as cur:
    cur.execute(
        f"SELECT id, partition_count, default_content_type_id FROM {self._topic} WHERE topic_name = %s",

Potential SQL injection via string-based query concatenation - critical severity
SQL injection might be possible in these locations, especially if the strings being concatenated are controlled via user input.

Remediation: If possible, rebuild the query to use prepared statements or an ORM. If that is not possible, make sure the user input is verified or sanitized. As an added layer of protection, we also recommend installing a WAF that blocks SQL injection attacks.

Reply @AikidoSec ignore: [REASON] to ignore this issue.
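
The flagged queries already pass values such as `topic_name` through `%s` placeholders. What the scanner reacts to is the table name interpolated by the f-string. Whether that is exploitable depends on where `self._topic` and the other table attributes come from, which this page does not show. One possible mitigation, sketched below with standard-library code only, is to validate and quote identifiers before interpolating them. The helper `quote_ident`, the function `topic_lookup_sql`, and the allowed-name pattern are assumptions for illustration, not code from the PR. Drivers such as psycopg also provide `sql.Identifier` for the same purpose.

```python
import re

# Assumption: table and schema names are plain, unquoted-style identifiers.
_IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_ident(name: str) -> str:
    """Validate a possibly schema-qualified identifier and double-quote each part."""
    parts = name.split(".")
    for part in parts:
        if not _IDENT_RE.fullmatch(part):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return ".".join(f'"{part}"' for part in parts)


def topic_lookup_sql(topic_table: str) -> str:
    """Build the lookup query; the topic name itself stays a bound parameter."""
    return (
        "SELECT id, partition_count, default_content_type_id "
        f"FROM {quote_ident(topic_table)} WHERE topic_name = %s"
    )


if __name__ == "__main__":
    print(topic_lookup_sql("queue.topic"))
    try:
        topic_lookup_sql("topic; DROP TABLE topic")
    except ValueError as exc:
        print("rejected:", exc)
```

Double-quoting makes PostgreSQL treat the name case-sensitively, so this approach only suits tables whose stored names match the configured case exactly.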

def _ensure_mime_registered(self, conn: PGConnection, mime: str) -> int:
    with conn.cursor() as cur:
        cur.execute(
            f"INSERT INTO {self._content_type} (mime) VALUES (%s) ON CONFLICT (mime) DO NOTHING",

Potential SQL injection via string-based query concatenation - critical severity

    (mime,),
)
cur.execute(
    f"SELECT id FROM {self._content_type} WHERE mime = %s",

Potential SQL injection via string-based query concatenation - critical severity

    ),
)
cur.execute(
    f"SELECT id FROM {self._topic} WHERE topic_name = %s",

Potential SQL injection via string-based query concatenation - critical severity

with conn.cursor() as cur:
    cur.execute(
        f"""
        SELECT offset_value FROM {self._consumer_offset}

Potential SQL injection via string-based query concatenation - critical severity

cur.execute(
    f"""
    SELECT m.id, m.enqueued_at, m.topic_id, m.partition_id, m.enqueue_seq
    FROM {self._message} m

Potential SQL injection via string-based query concatenation - critical severity

with conn.cursor() as cur:
    cur.execute(
        f"""
        INSERT INTO {lease}

Potential SQL injection via string-based query concatenation - critical severity

with conn.cursor() as cur:
    cur.execute(
        f"""
        SELECT m.priority, m.payload, {ctype_expr} AS content_type,

Potential SQL injection via string-based query concatenation - critical severity

delete_params.extend([h.id, h.enqueued_at])
cur.execute(
    f"""
    DELETE FROM {self._lease}

Potential SQL injection via string-based query concatenation - critical severity

for part_id, seq in pmap.items():
    cur.execute(
        f"""
        INSERT INTO {self._consumer_offset} AS co

Potential SQL injection via string-based query concatenation - critical severity

params.extend([consumer_group, lock_owner])
cur.execute(
    f"""
    UPDATE {self._lease} AS l

Potential SQL injection via string-based query concatenation - critical severity

with conn.cursor() as cur:
    cur.execute(
        f"""
        INSERT INTO {self._topic} AS ptopic

Potential SQL injection via string-based query concatenation - critical severity

cur.execute(
    f"""
    SELECT partition_id, COALESCE(MAX(enqueue_seq), 0)
    FROM {self._message}

Potential SQL injection via string-based query concatenation - critical severity

codecov Bot commented May 19, 2026

Bundle Report

Changes will increase total bundle size by 77.08kB (0.34%) ⬆️. This is within the configured threshold ✅

Detailed changes

| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 23.06MB | 77.08kB (0.34%) ⬆️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | 15.2kB | 8.66MB | 0.18% |
| assets/matillionlogo-*.png (New) | 37.34kB | 37.34kB | 100.0% 🚀 |
| assets/airbytelogo-*.png (New) | 24.54kB | 24.54kB | 100.0% 🚀 |

@david-leifker left a comment

Workflow changes around profile name seem ok to me. I'd let others speak for playwright, etc

@treff7es treff7es merged commit dbadffc into master May 19, 2026
84 checks passed