Conversation
ignore bq_inserted_at column of the binlog bq table
Redact the values of some columns before syncing to bq binlogs
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 6 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
| int numWorkers = this.workers.size(); | ||
| TableSchema tableSchema = getTableSchema(tableName); | ||
| // Create and start workers | ||
| for (int i = 0; i < Math.max(1, numWorkers); i++) { |
There was a problem hiding this comment.
startWorkers always creates exactly one worker
High Severity
startWorkers reads numWorkers from this.workers.size(), but this.workers was just initialized as new ArrayList<>(bigqueryThreads) — which sets capacity, not size. So numWorkers is always 0, and Math.max(1, numWorkers) is always 1. The bigqueryThreads configuration is effectively ignored and only a single worker is ever started.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
| this.attemptBatch(this.appendContext); | ||
| this.appendContext = null; | ||
| } | ||
| } |
There was a problem hiding this comment.
Race condition on appendContext outside synchronized block
High Severity
this.appendContext is read and modified at lines 370–373 without holding the lock, while the scheduleAttempt scheduled task concurrently calls attemptBatch(this.appendContext) and sets this.appendContext = null inside the lock on a different thread. This creates a race where the worker thread can dereference a null appendContext, causing a NullPointerException, or concurrently mutate the same AppendContext (e.g., addRow vs append).
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
| return; | ||
| } catch (Exception e) { | ||
| System.out.format("Failed to retry append: %s\n", e); | ||
| System.out.format("Worker {} Failed to retry append: %s\n", parent.getWorkerId(), e); |
There was a problem hiding this comment.
System.out.format uses wrong format placeholders
Low Severity
System.out.format("Worker {} Failed to retry append: %s\n", parent.getWorkerId(), e) mixes SLF4J-style {} with printf-style %s. Since System.out.format uses printf conventions, {} is literal text, %s consumes parent.getWorkerId(), and the exception e is silently ignored. The error message will be misleading and the exception details will be lost.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
| && tar -xzf /tmp/async-profiler.tar.gz -C /opt \ | ||
| && rm /tmp/async-profiler.tar.gz | ||
| ENV ASYNC_PROFILER_HOME=/opt/async-profiler-${ASYNC_PROFILER_VERSION}-linux-x64 | ||
| ENV PATH="$PATH:${ASYNC_PROFILER_HOME}" |
There was a problem hiding this comment.
Dockerfile runs as root with debug tools installed
Medium Severity
The non-root USER 1000 directive is commented out, and development/debug tools (htop, procps, python3-pip, wget, async-profiler) are installed in the production image. The MAXWELL_VERSION variable used in echo "$MAXWELL_VERSION" > /REVISION is not defined in this build stage (only in the builder stage), so /REVISION will be empty. These appear to be development-time changes not intended for production.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
| try { | ||
| synchronized (this.getLock()) { | ||
| this.attemptBatch(this.appendContext); | ||
| this.appendContext = null; // Nullify after attempting via scheduler |
There was a problem hiding this comment.
Scheduled task references field instead of captured parameter
High Severity
Inside scheduleAttempt, the scheduled lambda uses this.appendContext (the instance field) instead of the captured appendContext parameter. When the task fires after 1 minute, the field may have already been flushed and set to null (causing NPE in attemptBatch), or it may point to a completely different AppendContext that's still being built — causing a premature or duplicate flush of the wrong batch.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
|
|
||
| RUN useradd -u 1000 maxwell -d /app | ||
| RUN chown 1000:1000 /app | ||
| RUN echo "$MAXWELL_VERSION" > /REVISION |
There was a problem hiding this comment.
Dockerfile writes empty REVISION in second stage
Low Severity
RUN echo "$MAXWELL_VERSION" > /REVISION runs in the second build stage where MAXWELL_VERSION is not defined (it's an ENV only in the builder stage). This writes an empty string to /REVISION, whereas the original COPY --from=builder /REVISION /REVISION correctly propagated the version.
Reviewed by Cursor Bugbot for commit 5a0cc82. Configure here.
|
You have used all of your free Bugbot PR reviews. To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial. |


Note
High Risk
High risk because it substantially refactors the BigQuery producer’s concurrency/batching/acknowledgement behavior and changes the Docker runtime base image and packaging, which can affect throughput, ordering, and operational stability.
Overview
BigQuery producer behavior is overhauled to support configurable
bigquery_threads, multi-worker ingestion, and batch appends (up toBATCH_SIZE/ approximate byte limits) with a scheduled flush; callbacks now complete per-row within a batch and retry at the batch level.BigQuery output handling is extended by filtering
bq_inserted_atfrom the write schema, adding support for BigQueryJSONfield type conversion, skipping oversized records, and optionally clearing configured columns viacleared_columnsenv var.Build/runtime changes update the default Kafka client version to
3.7.2and switch the Docker runtime image toeclipse-temurin:23-jre-noble, adding debugging/profiling tooling (apt utilities +async-profiler) and changing how/REVISIONis produced (no longer copied from builder).Reviewed by Cursor Bugbot for commit 5a0cc82. Bugbot is set up for automated code reviews on this repo. Configure here.