feat: add database scalability labs with interactive visualizer#42
Conversation
Add 4 labs for Module 10 (Databases) covering replication, ACID transactions, indexing, and schema design across MySQL, MongoDB, and Cassandra. Main lab: interactive web visualizer connected to a live MySQL primary-replica cluster with animated SVG diagrams, real-time latency measurement, built-in SQL console, and EC2 CloudFormation support. Three tabs: Replication, Consistency (ACID), Schema & Indexing. Optional CLI labs: MySQL (GTID replication, ACID, EXPLAIN), MongoDB (replica set, read/write concerns, denormalization), Cassandra (multi-node ring, tunable consistency, partition keys). Each lab includes docker-compose.yml, setup/cleanup scripts, CloudFormation template for EC2 deployment, and step-by-step instructions following the 13-section lab template.
📝 WalkthroughWalkthroughAdds Lab 10 “Database Scalability”: MySQL, MongoDB, Cassandra labs plus an interactive visualizer. Introduces Docker Compose stacks, setup/cleanup scripts, CloudFormation templates, SQL/CQL/JS seed/init files, frontend assets (HTML/CSS/JS), and a Python multithreaded server exposing JSON APIs for visualizer interactions. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User (Browser)
participant FrontEnd as Visualizer Frontend
participant Server as Python Server
participant Primary as MySQL Primary
participant Replica as MySQL Replica
User->>FrontEnd: Click "Write & Replicate"
FrontEnd->>Server: POST /api/replication/write
Server->>Primary: INSERT student record
Primary-->>Server: Insert OK
Server->>Replica: SELECT by student_id
alt Row present
Replica-->>Server: Row found
Server->>Replica: SHOW REPLICA STATUS
Replica-->>Server: Seconds_Behind_Source
else Not yet replicated
Replica-->>Server: No row
Server-->>Server: mark lagging
end
Server-->>FrontEnd: steps + lag
FrontEnd->>User: Animate & display result
sequenceDiagram
participant User as User (Browser)
participant FrontEnd as Visualizer Frontend
participant Server as Python Server
participant Primary as MySQL Primary
User->>FrontEnd: Click "ACID Transfer"
FrontEnd->>Server: POST /api/consistency/transfer
Server->>Primary: BEGIN
Primary-->>Server: OK
Server->>Primary: DELETE (origin)
Primary-->>Server: rows deleted
Server->>Primary: INSERT (destination)
Primary-->>Server: insert OK
alt Constraint violation
Server->>Primary: COMMIT -> error
Primary-->>Server: constraint error
Server->>Primary: ROLLBACK
Primary-->>Server: rolled back
else Success
Server->>Primary: COMMIT
Primary-->>Server: committed
end
Server-->>FrontEnd: step trace & outcome
sequenceDiagram
participant User as User (Browser)
participant FrontEnd as Visualizer Frontend
participant Server as Python Server
participant Primary as MySQL Primary
User->>FrontEnd: Click "Analyze Query"
FrontEnd->>Server: POST /api/schema/explain (params)
Server->>Primary: EXPLAIN SELECT ...
Primary-->>Server: plan
Server->>Primary: SELECT COUNT(*) (timed)
Primary-->>Server: rows scanned
Server-->>FrontEnd: plan + row count
User->>FrontEnd: Click "Add Index"
FrontEnd->>Server: POST /api/schema/add-index
Server->>Primary: CREATE INDEX ...
Primary-->>Server: index created
User->>FrontEnd: Re-run "Analyze Query"
FrontEnd->>Server: POST /api/schema/explain
Server->>Primary: EXPLAIN SELECT ...
Primary-->>Server: optimized plan
Server-->>FrontEnd: improved plan + reduced rows scanned
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
- Use contextlib.suppress instead of try/except/pass (ruff SIM105) - Use strict equality (!==) instead of loose (!=) (eslint eqeqeq) - Exclude MongoDB shell scripts from ESLint (not browser JS)
There was a problem hiding this comment.
Pull request overview
This PR adds Module 10 “Databases” lab content, centered on an interactive MySQL primary/replica visualizer (web UI + Python API bridge), plus optional CLI labs for MySQL, MongoDB, and Cassandra, along with updated module docs and supporting Docker/CloudFormation scaffolding.
Changes:
- Add an interactive browser-based visualizer backed by a live MySQL primary/replica cluster (Docker Compose + Python API + HTML/JS/CSS).
- Add optional CLI-based labs for MySQL, MongoDB, and Cassandra (compose files, setup/cleanup scripts, CloudFormation templates, and walkthrough docs).
- Update Module 10 README/LAB index and adjust pre-commit YAML validation exclusions and secrets baseline.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| 10-databases/visualizer/style.css | Visualizer UI styling (dark theme layout/components). |
| 10-databases/visualizer/index.html | Visualizer UI structure (tabs, SVG diagram, SQL console, sidebar). |
| 10-databases/visualizer/app.js | Front-end logic: animations, polling, event log, SQL console. |
| 10-databases/visualizer/server.py | Python HTTP server + API routes bridging UI actions to MySQL. |
| 10-databases/visualizer/Dockerfile | Container image for the Python API server. |
| 10-databases/visualizer/docker-compose.yml | Compose stack for MySQL primary/replica + visualizer. |
| 10-databases/visualizer/setup.sh | Starts compose stack and configures MySQL replication. |
| 10-databases/visualizer/cleanup.sh | Tears down visualizer stack and volumes. |
| 10-databases/visualizer/cloudformation.yaml | EC2 deployment template for running the visualizer remotely. |
| 10-databases/visualizer/LAB-VISUALIZER.md | Main lab instructions for the interactive visualizer. |
| 10-databases/mysql/docker-compose.yml | Optional MySQL CLI lab compose stack (primary/replica). |
| 10-databases/mysql/init/primary-init.sql | MySQL schema + seed data (incl. access_log for indexing demo). |
| 10-databases/mysql/setup.sh | Optional MySQL lab setup incl. replication configuration. |
| 10-databases/mysql/cleanup.sh | Optional MySQL lab teardown. |
| 10-databases/mysql/cloudformation.yaml | EC2 template for MySQL CLI lab. |
| 10-databases/mysql/LAB-MYSQL.md | Optional MySQL CLI walkthrough. |
| 10-databases/mongodb/docker-compose.yml | Optional MongoDB replica set compose stack. |
| 10-databases/mongodb/init/rs-init.js | Replica set init + seed data for MongoDB lab. |
| 10-databases/mongodb/setup.sh | MongoDB lab setup (start nodes, init RS, seed). |
| 10-databases/mongodb/cleanup.sh | MongoDB lab teardown. |
| 10-databases/mongodb/cloudformation.yaml | EC2 template for MongoDB lab. |
| 10-databases/mongodb/LAB-MONGODB.md | Optional MongoDB walkthrough. |
| 10-databases/cassandra/docker-compose.yml | Optional Cassandra 3-node cluster compose stack. |
| 10-databases/cassandra/init/schema.cql | Cassandra schema + seed data for partition-key exercises. |
| 10-databases/cassandra/setup.sh | Cassandra lab setup (wait for ring, load schema). |
| 10-databases/cassandra/cleanup.sh | Cassandra lab teardown. |
| 10-databases/cassandra/cloudformation.yaml | EC2 template for Cassandra lab. |
| 10-databases/cassandra/LAB-CASSANDRA.md | Optional Cassandra walkthrough. |
| 10-databases/README.md | Module 10 README rewritten to match the new lab structure. |
| 10-databases/LAB.md | Module 10 lab index + comparison table across DB paradigms. |
| .pre-commit-config.yaml | Adds check-yaml exclusions for Module 10 CloudFormation files. |
| .secrets.baseline | Updates detect-secrets baseline for new lab files/strings. |
- Replace deprecated isMaster with hello command (MongoDB) - Fail loudly on replica set init error instead of ignoring - Add Docker Compose version check to visualizer setup.sh - Remove SET/USE from SQL console allowlist (prevents state changes) - Fix conn.autocommit usage (method call, not property assignment) - Wrap JSON body parsing in try/except for proper error response
There was a problem hiding this comment.
Actionable comments posted: 11
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (11)
10-databases/mysql/cleanup.sh-10-14 (1)
10-14:⚠️ Potential issue | 🟡 MinorDon't silently report a successful cleanup.
Line 11 and Line 14 discard stderr and force success, so the script still prints
Cleanup completewhen Docker is missing or the resources were not removed. Keep the cleanup idempotent, but surface a warning on failure so students know the lab may not be fully reset.
⚠️ Proposed fixecho "Stopping containers..." -docker compose down -v --remove-orphans 2>/dev/null || true +if ! docker compose down -v --remove-orphans; then + echo "Warning: failed to stop/remove MySQL lab containers." >&2 +fi echo "Removing volumes..." -docker volume rm mysql_primary-data mysql_replica-data 2>/dev/null || true +if ! docker volume rm mysql_primary-data mysql_replica-data >/dev/null 2>&1; then + echo "Warning: failed to remove one or more MySQL lab volumes." >&2 +fiAs per coding guidelines, "Review for shell scripting best practices, proper error handling, and shellcheck/shellharden compliance."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/mysql/cleanup.sh` around lines 10 - 14, The cleanup script currently swallows errors for the commands "docker compose down -v --remove-orphans" and "docker volume rm mysql_primary-data mysql_replica-data" by redirecting stderr and using "|| true", causing a misleading "Cleanup complete" message; change each command to preserve idempotency but detect failures (remove "2>/dev/null || true"), capture the exit status, and if non‑zero emit a clear warning (e.g., "Warning: docker compose down failed: <error>" / "Warning: docker volume rm failed: <error>") while still allowing the script to continue, so students see when resources weren’t removed; reference the two commands above when making the changes.10-databases/mongodb/init/rs-init.js-31-32 (1)
31-32:⚠️ Potential issue | 🟡 MinorUse direct assignment to switch databases instead of shadowing with
const.Line 32 shadows the global
dbwith aconst dbdeclaration. While this does not trigger a temporal dead zone error (the right-hand side resolves to the globaldbsuccessfully), shadowing the globaldbis not a recommended pattern. Since subsequent code only needs the "university" database, use direct assignment instead:db = db.getSiblingDB("university"). This is the pattern recommended by MongoDB documentation for scripts and avoids shadowing the global object.🛠️ Recommended fix
-const db = db.getSiblingDB("university"); +db = db.getSiblingGB("university");Alternatively, if working with multiple databases simultaneously, use a different variable name:
const universityDb = db.getSiblingDB("university")and update references accordingly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/mongodb/init/rs-init.js` around lines 31 - 32, The current line declares a new const db that shadows the global db ("const db = db.getSiblingDB('university')"); change this to assign to the existing global db (use db = db.getSiblingDB("university")) or, if you need both databases, rename the variable to a distinct identifier (e.g., universityDb = db.getSiblingDB("university")) and update subsequent references; update the line using the getSiblingDB call accordingly to avoid shadowing.10-databases/cassandra/setup.sh-65-67 (1)
65-67:⚠️ Potential issue | 🟡 MinorKeep
cqlshstderr visible during schema load.If this step fails,
set -ewill stop the script, but2>/dev/nullhides the only useful error message.📝 Suggested fix
docker cp init/schema.cql cass1:/tmp/schema.cql -docker exec cass1 cqlsh -f /tmp/schema.cql 2>/dev/null +docker exec cass1 cqlsh -f /tmp/schema.cql🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/cassandra/setup.sh` around lines 65 - 67, The schema load hides cqlsh errors by redirecting stderr in the command 'docker exec cass1 cqlsh -f /tmp/schema.cql 2>/dev/null'; remove the '2>/dev/null' redirection (or alternatively redirect only stdout) so cqlsh stderr is printed during 'docker exec cass1 cqlsh -f /tmp/schema.cql', ensuring any failure messages are visible when the script stops under set -e.10-databases/mongodb/LAB-MONGODB.md-299-299 (1)
299-299:⚠️ Potential issue | 🟡 MinorPoint students to the current PRIMARY, not always
mongo1.After an election,
mongo1may be secondary or unavailable, so this troubleshooting row can send users to the wrong node.📝 Suggested fix
-| `MongoServerError: not primary` | Writing to a secondary | Connect to mongo1 (the initial primary) | +| `MongoServerError: not primary` | Writing to a secondary | Run `rs.status()` and connect to the current PRIMARY |🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/mongodb/LAB-MONGODB.md` at line 299, The troubleshooting row that tells users to "Connect to mongo1 (the initial primary)" is unsafe because after an election mongo1 may not be primary; update the guidance for the "MongoServerError: not primary" entry to instruct users to identify and connect to the current primary (e.g., using rs.status()/rs.isMaster()/isWritablePrimary checks or the mongo shell's primary discovery) instead of hardcoding "mongo1", and replace the literal reference to "mongo1" with a note to connect to the node reported as PRIMARY by the replica set status.10-databases/mongodb/LAB-MONGODB.md-115-120 (1)
115-120:⚠️ Potential issue | 🟡 MinorHandle replication lag in the secondary-read example.
If Step 2.2 runs before
mongo2has applied the insert,findOne()returnsnullandzoe.namethrows. Add a null guard so the step teaches lag instead of crashing.💡 Suggested fix
docker exec mongo2 mongosh --quiet --eval " db.getMongo().setReadPref('secondary'); use('university'); const zoe = db.students.findOne({ name: 'Zoe Adams' }); -print('Found on secondary: ' + zoe.name + ' (' + zoe.major + ')'); +if (zoe) { + print('Found on secondary: ' + zoe.name + ' (' + zoe.major + ')'); +} else { + print('Not visible on mongo2 yet. Wait a second and rerun this step.'); +} "🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/mongodb/LAB-MONGODB.md` around lines 115 - 120, The secondary-read example can crash if the replica hasn't applied the insert because zoe will be null; update the mongosh snippet that calls db.getMongo().setReadPref('secondary') and db.students.findOne({ name: 'Zoe Adams' }) to guard the result: check the zoe variable for null/undefined and only access zoe.name and zoe.major when present, otherwise print a clear message indicating the document wasn't found on the secondary (replication lag) so the step demonstrates lag instead of throwing.10-databases/README.md-21-30 (1)
21-30:⚠️ Potential issue | 🟡 MinorUse repo-relative paths in the runnable steps.
cd visualizerandvisualizer/cloudformation.yamlonly work if the reader has already changed into10-databases. Point to10-databases/visualizer/...explicitly so the instructions work from the repo root as written. As per coding guidelines,Directory paths in instructions must match actual directory names using kebab-case with NN- prefix (e.g., \03-load-balancing-haproxy`)`.Also applies to: 121-124, 154-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/README.md` around lines 21 - 30, Update the repo-relative paths so the runnable steps work from the repository root: replace the `cd visualizer` step with `cd 10-databases/visualizer` (and similarly update the `./setup.sh` invocation to reference the setup script under 10-databases/visualizer), change the `visualizer/cloudformation.yaml` reference to `10-databases/visualizer/cloudformation.yaml`, and update the `LAB.md` link to point to the 10-databases copy; also make the same fixes for the other occurrences noted (the other references to `visualizer/...` in the file) so all path references are repo-root-relative.10-databases/visualizer/index.html-271-273 (1)
271-273:⚠️ Potential issue | 🟡 MinorAdd an explicit accessible name to the SQL query field.
This input is currently placeholder-only, so once the placeholder disappears the field has no reliable label for screen readers. An
aria-labelor hidden<label>keeps the console usable.♿ Suggested fix
- <input type="text" id="console-input" placeholder="SELECT * FROM students LIMIT 5;" /> + <input type="text" id="console-input" aria-label="SQL query" placeholder="SELECT * FROM students LIMIT 5;" />🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/index.html` around lines 271 - 273, The SQL input with id "console-input" is placeholder-only and missing an accessible name; add an explicit label (either a visually hidden <label for="console-input">SQL query</label> placed before the input or an aria-label/aria-labelledby attribute on the input, e.g., aria-label="SQL query input") so screen readers can identify the field, and ensure the label text clearly describes purpose (e.g., "SQL query" or "Console SQL input").10-databases/visualizer/LAB-VISUALIZER.md-69-76 (1)
69-76:⚠️ Potential issue | 🟡 MinorUse a versioned/local template source instead of
main.That URL can drift away from the repo snapshot a learner is following, so the EC2 path stops being reproducible over time. Point them at
10-databases/visualizer/cloudformation.yamlfrom the same checkout/tag instead. As per coding guidelines,[0-1][0-9]-*/**: Review lab content for technical accuracy, clear step-by-step instructions, and educational value.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/LAB-VISUALIZER.md` around lines 69 - 76, The link in Step 2 currently points to the raw `main` branch URL for cloudformation.yaml which can drift; update the step to reference the versioned/local template `10-databases/visualizer/cloudformation.yaml` from the same checkout or tag the learner is using (or provide an explicit tag/commit SHA) instead of the `https://raw.githubusercontent.com/.../main/10-databases/visualizer/cloudformation.yaml` URL so the EC2 setup is reproducible.10-databases/cassandra/cloudformation.yaml-30-30 (1)
30-30:⚠️ Potential issue | 🟡 MinorInstance type
t3.largeis not AWS Free Tier eligible.Per coding guidelines, lab content should work in a standard AWS Free Tier environment. The
t3.largeinstance incurs charges (~$0.0832/hour in us-east-1). The description explains the memory requirement for a 3-node Cassandra cluster, but consider adding a cost warning in the lab documentation or offering a reduced-memory alternative.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/cassandra/cloudformation.yaml` at line 30, The template currently hardcodes InstanceType: t3.large which is not AWS Free Tier eligible; update the CloudFormation to either use a Free Tier instance (e.g., t3.micro) or parameterize the instance type so labs default to a free-tier-safe value and allow overriding for higher-memory needs—adjust the resource referencing InstanceType (the InstanceType property in the EC2/NodeGroup resource) and add a short lab doc note/warning about cost and a suggested larger-instance alternative for realistic 3-node Cassandra testing.10-databases/visualizer/cloudformation.yaml-35-35 (1)
35-35:⚠️ Potential issue | 🟡 MinorInstance type
t3.mediumis not AWS Free Tier eligible.Per coding guidelines, lab content should work in AWS Free Tier. The
t3.mediuminstance incurs charges. Consider documenting the expected cost or testing ift3.micro(free tier) can handle the MySQL primary-replica plus visualizer workload.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/cloudformation.yaml` at line 35, The CloudFormation template currently sets the EC2 InstanceType to t3.medium which is not Free Tier eligible; update the InstanceType value used in the template (the InstanceType property for the visualizer/DB instance) to a Free Tier option such as t3.micro (or parameterize it and default to t3.micro) and then test the MySQL primary-replica plus visualizer workload for performance; if t3.micro cannot sustain the workload, add a short comment or README entry documenting expected costs and recommending an appropriate instance type (e.g., t3.small/t3.medium) for non-free-tier usage.10-databases/visualizer/app.js-236-237 (1)
236-237:⚠️ Potential issue | 🟡 MinorUse strict equality
!==instead of!=.Static analysis flagged this. The loose equality
!=can produce unexpected results with type coercion.Proposed fix
- const lag = state.replica?.lag; - $('#stat-lag').textContent = lag != null ? lag : '--'; + const lag = state.replica?.lag; + $('#stat-lag').textContent = lag !== null && lag !== undefined ? lag : '--';Or more idiomatically:
- $('#stat-lag').textContent = lag != null ? lag : '--'; + $('#stat-lag').textContent = lag ?? '--';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/app.js` around lines 236 - 237, Replace the loose null check in the DOM update for $('#stat-lag').textContent by using strict inequality: change the comparison in the block that updates $('#stat-lag').textContent to use !== null (i.e., use "lag !== null" instead of "lag != null") so the null/undefined check is type-safe; keep the existing strict check for className (lag === 0) as-is.
🧹 Nitpick comments (6)
10-databases/visualizer/style.css (1)
30-32: Avoid a page-level 900px minimum width.Line 31 hard-locks the visualizer to desktop width, so the lab overflows on smaller laptop/tablet windows and when users keep docs side-by-side. Prefer responsive breakpoints on
.layoutand.sidebarinstead of a globalmin-width.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/style.css` around lines 30 - 32, Remove the global html { min-width: 900px; } rule and implement responsive breakpoints on the layout components instead: replace the page-level constraint with CSS media queries targeting the .layout and .sidebar selectors so they adapt (collapse, stack, or scroll) at smaller widths; update the styles for .layout and .sidebar to set appropriate min-width / flex-basis / max-width behavior per breakpoint to prevent overflow while preserving the desktop layout.10-databases/visualizer/LAB-VISUALIZER.md (1)
429-452: Replace bold pseudo-headings with real markdown structure.These bold lead-ins are functioning as section headers, which makes the document harder to scan in generated outlines and breaks the markdown style rule. As per coding guidelines,
No bold text should be used as headings in markdown files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/LAB-VISUALIZER.md` around lines 429 - 452, Replace the bold pseudo-headings (e.g., "**Check which students are enrolled in which courses:**", "**Compare primary vs replica data:**", "**Test isolation by running concurrent queries:**", "**Reset the database:**") with proper Markdown headings (e.g., "## Check which students are enrolled in which courses", "## Compare primary vs replica data", etc.), remove trailing colons from the heading lines, and ensure the code block and subsequent instructions remain under those headings so the document outline and generated TOC reflect real sections.10-databases/visualizer/app.js (1)
87-99: API helpers lack error handling for network failures.If
fetchthrows (network error, CORS issue), the promise rejects and propagates uncaught. Callers likedoReplicationdon't catch this, which could leave the UI in a broken state withanimating = true.Proposed improvement
async function apiPost(url, body) { - const res = await fetch(url, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify(body), - }); - return res.json(); + try { + const res = await fetch(url, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(body), + }); + return res.json(); + } catch (err) { + return { error: `Network error: ${err.message}` }; + } } async function apiGet(url) { - const res = await fetch(url); - return res.json(); + try { + const res = await fetch(url); + return res.json(); + } catch (err) { + return { error: `Network error: ${err.message}` }; + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/app.js` around lines 87 - 99, apiPost and apiGet currently call fetch without try/catch or checking response.ok, so network errors or non-2xx responses can reject or return invalid JSON and callers like doReplication (which sets animating = true) never get a controlled error; wrap fetch in try/catch inside apiPost and apiGet, check res.ok and throw or return a standardized error object when not ok, and on network exceptions return/throw a clear error (include the original error message) so callers can catch it and reset animating; update references to apiPost and apiGet callers (e.g., doReplication) to handle the thrown error or error object accordingly.10-databases/visualizer/server.py (2)
372-380: Multi-statement detection can be bypassed with semicolons inside string literals.The check
";" in query.rstrip(";")would reject a legitimate query likeSELECT * FROM students WHERE name = 'O;Brien'. Conversely, a malicious user familiar with MySQL quirks might find edge cases. For an educational lab with a limited audience this is acceptable, but worth noting.A more robust approach would be to use a SQL parser or limit the console to read-only operations on the replica. However, for a lab environment the current protection is pragmatic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/server.py` around lines 372 - 380, The semicolon check in the multi-statement rejection (the block that inspects query and ALLOWED_SQL) can be bypassed by semicolons inside string literals; replace the simple `";" in query.rstrip(";")` check with a scan that ignores semicolons inside single- or double-quoted strings (or strip/replace string literal contents first) so only semicolons outside string literals count as statement separators, then continue to rstrip a trailing semicolon and perform the ALLOWED_SQL check using first_word = query.split()[0].upper().
358-362: Consider whetherSETshould be in the allowlist.
SETallows changing session variables (e.g.,SET autocommit=0,SET sql_mode=...), which could affect subsequent queries in unexpected ways. For a transient lab this is low-risk, but removing it would tighten the surface area.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/server.py` around lines 358 - 362, ALLOWED_SQL currently includes "SET", which permits changing session variables; to tighten security remove "SET" from the ALLOWED_SQL frozenset in server.py (symbol: ALLOWED_SQL) or instead replace it with a stricter allowance rule that parses the first token and only permits specific safe SET variants (e.g., "SET NAMES" or "SET SESSION TRANSACTION ISOLATION LEVEL") by handling SET specially before the general ALLOWED_SQL check. Update any tests or comments referencing ALLOWED_SQL accordingly.10-databases/cassandra/cloudformation.yaml (1)
20-25: Security group opens SSH to 0.0.0.0/0 — acceptable for short-lived lab but worth documenting.Static analysis flagged unrestricted SSH ingress (AWS-0107). For a transient educational lab this is pragmatic, but consider adding a comment in the template or lab instructions noting that students should delete the stack promptly after completing the lab.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/cassandra/cloudformation.yaml` around lines 20 - 25, The SecurityGroupIngress rule allowing tcp port 22 from 0.0.0.0/0 (the SecurityGroupIngress block, FromPort: 22, ToPort: 22, CidrIp: 0.0.0.0/0) should be documented as a deliberate, short-lived allowance for the lab; add a clear comment in the CloudFormation template near that SecurityGroupIngress block stating this is for transient educational use and instructing students/operators to delete the stack immediately after the lab (or replace the CIDR with a restricted range/authorized IP if used long-term).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.pre-commit-config.yaml:
- Around line 13-14: The check-yaml exclusion currently skips all Module 10
CloudFormation templates via the pattern "10-databases/.*/cloudformation\.yaml";
remove that pattern (or replace it with a narrowly scoped exception for the
specific parser-incompatible file) from the exclusions in
.pre-commit-config.yaml so that all Module 10 (*.yml/*.yaml) CloudFormation
templates are validated by check-yaml; if you must keep an exclusion, document
the exact filename and reason and use that exact path instead of the broad
"10-databases/.*/cloudformation\.yaml" pattern.
In `@10-databases/cassandra/docker-compose.yml`:
- Around line 7-12: The docker-compose Cassandra services currently set
CASSANDRA_CLUSTER_NAME/CASSANDRA_DC/CASSANDRA_RACK/CASSANDRA_SEEDS/CASSANDRA_ENDPOINT_SNITCH
but do not constrain JVM memory, which causes OOM on Free Tier; either add
explicit low-memory JVM env vars MAX_HEAP_SIZE=128M and HEAP_NEWSIZE=32M to each
Cassandra service (e.g., the cass1, cass2, cass3 service blocks that define
CASSANDRA_* env vars) so each container uses small heaps, or alternatively add a
short note in the compose file documenting the minimum host requirement (e.g.,
t2.small or separate instances) if you choose not to lower the heap; update the
environment sections for the three node services to include those MAX_HEAP_SIZE
and HEAP_NEWSIZE values.
In `@10-databases/cassandra/init/schema.cql`:
- Around line 51-71: The seed INSERTs into the students table use uuid() so
every run creates new rows; make seeding idempotent by replacing uuid() calls
with fixed UUID literals and add IF NOT EXISTS to each INSERT (or alternatively
TRUNCATE the students table before seeding in setup.sh) so rerunning setup.sh
does not duplicate rows; update the INSERT statements referencing the students
table and the setup.sh invocation that replays schema.cql accordingly.
In `@10-databases/cassandra/setup.sh`:
- Around line 33-61: The current readiness loop only tests CQL connectivity
(docker exec ... cqlsh) which can pass before nodes have joined the ring; change
the readiness check to verify ring convergence via nodetool status and wait
until each node reports "UN" in the cluster (e.g., use docker exec cassX
nodetool status and grep for its hostname/state or query cass1's nodetool status
until all three nodes are UN), increase the timeout to a safer value (e.g., 300
seconds) and use a slightly larger sleep interval (e.g., 2–5s) to avoid tight
polling; update the loop around cass1 cass2 cass3 (and any messages) so it
breaks only when the node is UN and on timeout prints the nodetool and container
logs for debugging.
In `@10-databases/LAB.md`:
- Around line 54-63: The "Environment Options" section and its table (headers
"Environment Options", rows "Local" and "EC2", and artifacts like `./setup.sh`
and `cloudformation.yaml`) needs EC2 sizing guidance: update this section to
call out that multi-node labs (MongoDB, Cassandra) require larger instances
(recommend minimum t3.medium/t3.large or m5.large with 2+ vCPUs and >=4–8GB RAM
and adequate EBS) or advise users on alternatives (use Local/Docker or run
single-node MySQL first) and add a short note directing low-resource users to
the Local setup or to follow smaller single-node lab options; keep the guidance
brief and place it directly below the existing table.
In `@10-databases/mongodb/cloudformation.yaml`:
- Around line 15-38: Parameterize and harden the EC2 resources: add
CloudFormation Parameters for KeyName, InstanceType and SSHCidrIp (use sensible
placeholders/defaults and descriptions) and replace the hardcoded KeyName value
"vockey" and the SecurityGroupIngress CidrIp 0.0.0.0/0 with references to those
parameters (SSHCidrIp should default to a restricted CIDR or instruct users to
supply their IP/32). In the LabSecurityGroup/SecurityGroupIngress block use the
SSHCidrIp parameter; in LabInstance use the InstanceType and KeyName parameters.
Require IMDSv2 by adding MetadataOptions with HttpTokens: required to
LabInstance, and enforce encrypted root storage by adding BlockDeviceMappings on
LabInstance with Ebs.Encrypted: true (and KmsKeyId if you want CMK support).
Finally, change any user-facing key filename in outputs or user-data SSH
instructions to a generic placeholder like "<your-key-pair.pem>" instead of
"labsuser.pem".
In `@10-databases/mongodb/setup.sh`:
- Around line 61-72: The replica-set wait loop (the while loop using i, docker
exec mongo1 mongosh --eval "rs.status()...") currently falls through on timeout;
update it to detect failure after the loop and abort: after the loop, test
whether a PRIMARY was found (re-run the same rs.status() check or set a flag
inside the loop), and if not, print a clear error like "ERROR: replica set
primary not elected after timeout" to stderr and exit non-zero (exit 1) so the
script does not print "Environment ready" on failure.
In `@10-databases/mysql/cloudformation.yaml`:
- Around line 15-38: The template currently hardcodes sensitive/academy-specific
values and insecure defaults: replace the SSH output reference to "labsuser.pem"
with a placeholder like "YOUR_LAB_KEY_FILE"; make KeyName in the LabInstance
properties parameterized (replace the literal vockey with a Parameters
reference) and document it as Academy-only if necessary; change InstanceType in
LabInstance from t3.medium to a Free Tier eligible type such as t2.micro or
t3.micro (parameterize if you want flexibility); remove the hardcoded
SecurityGroupIngress CidrIp of 0.0.0.0/0 and instead reference a new parameter
(e.g., AllowedSSHRange or StudentIpCidr) to restrict SSH; and add
MetadataOptions to LabInstance to require IMDSv2 plus BlockDeviceMappings to set
the root volume to encrypted. Update any Outputs that mention labsuser.pem to
use the YOUR_LAB_KEY_FILE placeholder or the KeyName parameter.
In `@10-databases/README.md`:
- Around line 175-180: The README currently presents "BASE" as the definitive
NoSQL trade-off and contrasts it with "ACID"; update the wording around the
"BASE" and "ACID" bullets to qualify consistency/transaction guarantees as a
spectrum rather than a binary (e.g., keep the "ACID" bullet but remove or
rephrase "NoSQL = BASE, not ACID"), explicitly note that some NoSQL systems like
MongoDB can support ACID transactions, and adjust corresponding text in the
nearby section referenced (lines around "BASE" and also the content at 212-218)
so the materials accurately state that different databases provide differing
levels of atomicity/consistency/isolation/durability rather than implying all
NoSQL systems lack ACID.
In `@10-databases/visualizer/Dockerfile`:
- Around line 1-11: The container runs the HTTP server as root (server.py binds
0.0.0.0:8080 and Dockerfile uses root at runtime); create a non-root user and
drop privileges before starting the app: add steps to create a dedicated
user/group (e.g., "app"), chown the application files and any writable dirs,
switch to that user with the Dockerfile USER directive before CMD, and ensure
the server process runs under that user rather than root so the exposed service
is not running with container root privileges.
In `@10-databases/visualizer/setup.sh`:
- Around line 48-56: The script currently runs START REPLICA and prints
"Replication configured." without verifying replication state; modify the
setup.sh sequence after START REPLICA to poll the replica status (using mysql -e
"SHOW REPLICA STATUS\G" or SHOW SLAVE STATUS\G) and check fields
Replica_IO_Running and Replica_SQL_Running are "Yes" and Last_SQL_Error is empty
(or Seconds_Behind_Master is not NULL) with a short retry loop and timeout; only
print "Replication configured." when the checks pass, and print/log a
descriptive error and exit nonzero if the timeout is reached or errors are
reported by the replica.
---
Minor comments:
In `@10-databases/cassandra/cloudformation.yaml`:
- Line 30: The template currently hardcodes InstanceType: t3.large which is not
AWS Free Tier eligible; update the CloudFormation to either use a Free Tier
instance (e.g., t3.micro) or parameterize the instance type so labs default to a
free-tier-safe value and allow overriding for higher-memory needs—adjust the
resource referencing InstanceType (the InstanceType property in the
EC2/NodeGroup resource) and add a short lab doc note/warning about cost and a
suggested larger-instance alternative for realistic 3-node Cassandra testing.
In `@10-databases/cassandra/setup.sh`:
- Around line 65-67: The schema load hides cqlsh errors by redirecting stderr in
the command 'docker exec cass1 cqlsh -f /tmp/schema.cql 2>/dev/null'; remove the
'2>/dev/null' redirection (or alternatively redirect only stdout) so cqlsh
stderr is printed during 'docker exec cass1 cqlsh -f /tmp/schema.cql', ensuring
any failure messages are visible when the script stops under set -e.
In `@10-databases/mongodb/init/rs-init.js`:
- Around line 31-32: The current line declares a new const db that shadows the
global db ("const db = db.getSiblingDB('university')"); change this to assign to
the existing global db (use db = db.getSiblingDB("university")) or, if you need
both databases, rename the variable to a distinct identifier (e.g., universityDb
= db.getSiblingDB("university")) and update subsequent references; update the
line using the getSiblingDB call accordingly to avoid shadowing.
In `@10-databases/mongodb/LAB-MONGODB.md`:
- Line 299: The troubleshooting row that tells users to "Connect to mongo1 (the
initial primary)" is unsafe because after an election mongo1 may not be primary;
update the guidance for the "MongoServerError: not primary" entry to instruct
users to identify and connect to the current primary (e.g., using
rs.status()/rs.isMaster()/isWritablePrimary checks or the mongo shell's primary
discovery) instead of hardcoding "mongo1", and replace the literal reference to
"mongo1" with a note to connect to the node reported as PRIMARY by the replica
set status.
- Around line 115-120: The secondary-read example can crash if the replica
hasn't applied the insert because zoe will be null; update the mongosh snippet
that calls db.getMongo().setReadPref('secondary') and db.students.findOne({
name: 'Zoe Adams' }) to guard the result: check the zoe variable for
null/undefined and only access zoe.name and zoe.major when present, otherwise
print a clear message indicating the document wasn't found on the secondary
(replication lag) so the step demonstrates lag instead of throwing.
In `@10-databases/mysql/cleanup.sh`:
- Around line 10-14: The cleanup script currently swallows errors for the
commands "docker compose down -v --remove-orphans" and "docker volume rm
mysql_primary-data mysql_replica-data" by redirecting stderr and using "||
true", causing a misleading "Cleanup complete" message; change each command to
preserve idempotency but detect failures (remove "2>/dev/null || true"), capture
the exit status, and if non‑zero emit a clear warning (e.g., "Warning: docker
compose down failed: <error>" / "Warning: docker volume rm failed: <error>")
while still allowing the script to continue, so students see when resources
weren’t removed; reference the two commands above when making the changes.
In `@10-databases/README.md`:
- Around line 21-30: Update the repo-relative paths so the runnable steps work
from the repository root: replace the `cd visualizer` step with `cd
10-databases/visualizer` (and similarly update the `./setup.sh` invocation to
reference the setup script under 10-databases/visualizer), change the
`visualizer/cloudformation.yaml` reference to
`10-databases/visualizer/cloudformation.yaml`, and update the `LAB.md` link to
point to the 10-databases copy; also make the same fixes for the other
occurrences noted (the other references to `visualizer/...` in the file) so all
path references are repo-root-relative.
In `@10-databases/visualizer/app.js`:
- Around line 236-237: Replace the loose null check in the DOM update for
$('#stat-lag').textContent by using strict inequality: change the comparison in
the block that updates $('#stat-lag').textContent to use !== null (i.e., use
"lag !== null" instead of "lag != null") so the null/undefined check is
type-safe; keep the existing strict check for className (lag === 0) as-is.
In `@10-databases/visualizer/cloudformation.yaml`:
- Line 35: The CloudFormation template currently sets the EC2 InstanceType to
t3.medium which is not Free Tier eligible; update the InstanceType value used in
the template (the InstanceType property for the visualizer/DB instance) to a
Free Tier option such as t3.micro (or parameterize it and default to t3.micro)
and then test the MySQL primary-replica plus visualizer workload for
performance; if t3.micro cannot sustain the workload, add a short comment or
README entry documenting expected costs and recommending an appropriate instance
type (e.g., t3.small/t3.medium) for non-free-tier usage.
In `@10-databases/visualizer/index.html`:
- Around line 271-273: The SQL input with id "console-input" is placeholder-only
and missing an accessible name; add an explicit label (either a visually hidden
<label for="console-input">SQL query</label> placed before the input or an
aria-label/aria-labelledby attribute on the input, e.g., aria-label="SQL query
input") so screen readers can identify the field, and ensure the label text
clearly describes purpose (e.g., "SQL query" or "Console SQL input").
In `@10-databases/visualizer/LAB-VISUALIZER.md`:
- Around line 69-76: The link in Step 2 currently points to the raw `main`
branch URL for cloudformation.yaml which can drift; update the step to reference
the versioned/local template `10-databases/visualizer/cloudformation.yaml` from
the same checkout or tag the learner is using (or provide an explicit tag/commit
SHA) instead of the
`https://raw.githubusercontent.com/.../main/10-databases/visualizer/cloudformation.yaml`
URL so the EC2 setup is reproducible.
---
Nitpick comments:
In `@10-databases/cassandra/cloudformation.yaml`:
- Around line 20-25: The SecurityGroupIngress rule allowing tcp port 22 from
0.0.0.0/0 (the SecurityGroupIngress block, FromPort: 22, ToPort: 22, CidrIp:
0.0.0.0/0) should be documented as a deliberate, short-lived allowance for the
lab; add a clear comment in the CloudFormation template near that
SecurityGroupIngress block stating this is for transient educational use and
instructing students/operators to delete the stack immediately after the lab (or
replace the CIDR with a restricted range/authorized IP if used long-term).
In `@10-databases/visualizer/app.js`:
- Around line 87-99: apiPost and apiGet currently call fetch without try/catch
or checking response.ok, so network errors or non-2xx responses can reject or
return invalid JSON and callers like doReplication (which sets animating = true)
never get a controlled error; wrap fetch in try/catch inside apiPost and apiGet,
check res.ok and throw or return a standardized error object when not ok, and on
network exceptions return/throw a clear error (include the original error
message) so callers can catch it and reset animating; update references to
apiPost and apiGet callers (e.g., doReplication) to handle the thrown error or
error object accordingly.
In `@10-databases/visualizer/LAB-VISUALIZER.md`:
- Around line 429-452: Replace the bold pseudo-headings (e.g., "**Check which
students are enrolled in which courses:**", "**Compare primary vs replica
data:**", "**Test isolation by running concurrent queries:**", "**Reset the
database:**") with proper Markdown headings (e.g., "## Check which students are
enrolled in which courses", "## Compare primary vs replica data", etc.), remove
trailing colons from the heading lines, and ensure the code block and subsequent
instructions remain under those headings so the document outline and generated
TOC reflect real sections.
In `@10-databases/visualizer/server.py`:
- Around line 372-380: The semicolon check in the multi-statement rejection (the
block that inspects query and ALLOWED_SQL) can be bypassed by semicolons inside
string literals; replace the simple `";" in query.rstrip(";")` check with a scan
that ignores semicolons inside single- or double-quoted strings (or
strip/replace string literal contents first) so only semicolons outside string
literals count as statement separators, then continue to rstrip a trailing
semicolon and perform the ALLOWED_SQL check using first_word =
query.split()[0].upper().
- Around line 358-362: ALLOWED_SQL currently includes "SET", which permits
changing session variables; to tighten security remove "SET" from the
ALLOWED_SQL frozenset in server.py (symbol: ALLOWED_SQL) or instead replace it
with a stricter allowance rule that parses the first token and only permits
specific safe SET variants (e.g., "SET NAMES" or "SET SESSION TRANSACTION
ISOLATION LEVEL") by handling SET specially before the general ALLOWED_SQL
check. Update any tests or comments referencing ALLOWED_SQL accordingly.
In `@10-databases/visualizer/style.css`:
- Around line 30-32: Remove the global html { min-width: 900px; } rule and
implement responsive breakpoints on the layout components instead: replace the
page-level constraint with CSS media queries targeting the .layout and .sidebar
selectors so they adapt (collapse, stack, or scroll) at smaller widths; update
the styles for .layout and .sidebar to set appropriate min-width / flex-basis /
max-width behavior per breakpoint to prevent overflow while preserving the
desktop layout.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 55d5d1d5-067f-4801-b04c-94240910a10f
📒 Files selected for processing (32)
.pre-commit-config.yaml.secrets.baseline10-databases/LAB.md10-databases/README.md10-databases/cassandra/LAB-CASSANDRA.md10-databases/cassandra/cleanup.sh10-databases/cassandra/cloudformation.yaml10-databases/cassandra/docker-compose.yml10-databases/cassandra/init/schema.cql10-databases/cassandra/setup.sh10-databases/mongodb/LAB-MONGODB.md10-databases/mongodb/cleanup.sh10-databases/mongodb/cloudformation.yaml10-databases/mongodb/docker-compose.yml10-databases/mongodb/init/rs-init.js10-databases/mongodb/setup.sh10-databases/mysql/LAB-MYSQL.md10-databases/mysql/cleanup.sh10-databases/mysql/cloudformation.yaml10-databases/mysql/docker-compose.yml10-databases/mysql/init/primary-init.sql10-databases/mysql/setup.sh10-databases/visualizer/Dockerfile10-databases/visualizer/LAB-VISUALIZER.md10-databases/visualizer/app.js10-databases/visualizer/cleanup.sh10-databases/visualizer/cloudformation.yaml10-databases/visualizer/docker-compose.yml10-databases/visualizer/index.html10-databases/visualizer/server.py10-databases/visualizer/setup.sh10-databases/visualizer/style.css
| 09-distributed-file-systems/cloudformation\.yaml| | ||
| 10-databases/.*/cloudformation\.yaml |
There was a problem hiding this comment.
Do not exclude Module 10 CloudFormation templates from YAML validation.
Line 13–Line 14 broadens check-yaml exclusions to skip all 10-databases/*/cloudformation.yaml, which removes syntax validation for newly added infra templates. Keep these files validated and only carve out exclusions for parser-incompatible edge cases with clear justification.
Suggested fix
- id: check-yaml
exclude: |
(?x)^(
01-introduction-horizontal-scalability/ecsdemo-.*/kubernetes/.*|
01-introduction-horizontal-scalability/ecsdemo-.*/copilot/.*|
08-cloud-networking-vpc/cloudformation/.*|
- 09-distributed-file-systems/cloudformation\.yaml|
- 10-databases/.*/cloudformation\.yaml
+ 09-distributed-file-systems/cloudformation\.yaml
)$Based on learnings: Applies to .yml,.yaml : All YAML files must pass check-yaml validation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.pre-commit-config.yaml around lines 13 - 14, The check-yaml exclusion
currently skips all Module 10 CloudFormation templates via the pattern
"10-databases/.*/cloudformation\.yaml"; remove that pattern (or replace it with
a narrowly scoped exception for the specific parser-incompatible file) from the
exclusions in .pre-commit-config.yaml so that all Module 10 (*.yml/*.yaml)
CloudFormation templates are validated by check-yaml; if you must keep an
exclusion, document the exact filename and reason and use that exact path
instead of the broad "10-databases/.*/cloudformation\.yaml" pattern.
| environment: | ||
| CASSANDRA_CLUSTER_NAME: university-cluster | ||
| CASSANDRA_DC: dc1 | ||
| CASSANDRA_RACK: rack1 | ||
| CASSANDRA_SEEDS: cass1 | ||
| CASSANDRA_ENDPOINT_SNITCH: GossipingPropertyFileSnitch |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For the official Docker image cassandra:4.1, what JVM heap settings are used by default, and is a 3-node cluster expected to fit on an AWS Free Tier EC2 instance with 1 GiB RAM?
💡 Result:
The official Docker image cassandra:4.1 uses dynamic JVM heap settings calculated by cassandra-env.sh based on available container RAM. Heap sizes are set to max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)) for MAX_HEAP_SIZE, with HEAP_NEWSIZE (young gen) as min(250MB per core capped by cores available, 1/4 heap). These are passed as -Xms/-Xmx/-Xmn if not statically defined in jvm-server.options (which has no static heap settings by default). In Docker without memory limits, it detects host RAM and sets large heaps (e.g., 1GB+). With limits like --memory=1g, it uses container-visible RAM (~1GB), yielding ~512MB heap. A 3-node Cassandra cluster on a single AWS Free Tier t2.micro EC2 (1 GiB RAM, ~0.9GB usable) will not fit. Even with custom low settings (e.g., MAX_HEAP_SIZE=128M HEAP_NEWSIZE=32M per node via env vars), total JVM heap alone is ~1GB+, exceeding RAM after OS/Docker overhead (~200-300MB). Sources confirm Cassandra struggles below ~300-500MB per instance; multi-node on 1GB leads to OOM (issue #277). Use separate instances or lighter DB.
Citations:
- 1: https://hub.docker.com/_/cassandra
- 2: https://cassandra.apache.org/doc/4.1/cassandra/configuration/cass_env_sh_file.html
- 3: Cassandra doesn't run with a low memory on docker docker-library/cassandra#277
- 4: https://github.com/docker-library/cassandra/blob/f304ac2a02e00150b7e6629a40fcbd2891a0ba62/4.1/Dockerfile
- 5: https://github.com/zalando-stups/planb-cassandra/blob/master/Dockerfiles/Cassandra-4/jvm-server.options
- 6: https://apache.googlesource.com/cassandra/+/refs/tags/cassandra-4.1.0/conf/cassandra-env.sh
- 7: https://github.com/apache/cassandra/blob/trunk/conf/cassandra-env.sh
Free Tier instance configuration is insufficient for this 3-node cluster.
The official cassandra:4.1 image uses dynamic JVM heap sizing based on available container memory. On an AWS Free Tier t2.micro instance (1 GiB RAM, ~0.9 GB usable after OS overhead), a 3-node Cassandra cluster will exhaust available memory during bootstrapping. Even with explicit low-memory settings (e.g., MAX_HEAP_SIZE=128M HEAP_NEWSIZE=32M per node), the combined JVM heap footprint (~1 GB+) exceeds available RAM, causing OOM failures before students can run the exercises.
Either document the minimum instance requirement (at least t2.small or separate instances) or configure explicit low-memory mode (e.g., MAX_HEAP_SIZE=128M, HEAP_NEWSIZE=32M) for lines 8, 26, and 47 to make this viable for Free Tier learners.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/cassandra/docker-compose.yml` around lines 7 - 12, The
docker-compose Cassandra services currently set
CASSANDRA_CLUSTER_NAME/CASSANDRA_DC/CASSANDRA_RACK/CASSANDRA_SEEDS/CASSANDRA_ENDPOINT_SNITCH
but do not constrain JVM memory, which causes OOM on Free Tier; either add
explicit low-memory JVM env vars MAX_HEAP_SIZE=128M and HEAP_NEWSIZE=32M to each
Cassandra service (e.g., the cass1, cass2, cass3 service blocks that define
CASSANDRA_* env vars) so each container uses small heaps, or alternatively add a
short note in the compose file documenting the minimum host requirement (e.g.,
t2.small or separate instances) if you choose not to lower the heap; update the
environment sections for the three node services to include those MAX_HEAP_SIZE
and HEAP_NEWSIZE values.
| -- Seed students | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Computer Science', uuid(), 'Alice Johnson', 'alice@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Mathematics', uuid(), 'Bob Smith', 'bob@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Computer Science', uuid(), 'Carol Davis', 'carol@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Physics', uuid(), 'David Lee', 'david@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Computer Science', uuid(), 'Eva Martinez', 'eva@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Mathematics', uuid(), 'Frank Wilson', 'frank@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Biology', uuid(), 'Grace Kim', 'grace@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Computer Science', uuid(), 'Henry Brown', 'henry@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Physics', uuid(), 'Iris Chen', 'iris@university.edu'); | ||
| INSERT INTO students (major, student_id, name, email) | ||
| VALUES ('Mathematics', uuid(), 'Jack Taylor', 'jack@university.edu'); |
There was a problem hiding this comment.
Make the student seed idempotent.
setup.sh replays this file on every run, and uuid() changes the primary key each time. Re-running setup adds another 10 student rows, so the lab no longer matches its documented counts. Use fixed UUID literals with IF NOT EXISTS, or clear the table before reseeding.
Based on learnings, "Lab instructions must work from a clean environment - flag hardcoded paths, missing prerequisites, or steps that assume prior state."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/cassandra/init/schema.cql` around lines 51 - 71, The seed
INSERTs into the students table use uuid() so every run creates new rows; make
seeding idempotent by replacing uuid() calls with fixed UUID literals and add IF
NOT EXISTS to each INSERT (or alternatively TRUNCATE the students table before
seeding in setup.sh) so rerunning setup.sh does not duplicate rows; update the
INSERT statements referencing the students table and the setup.sh invocation
that replays schema.cql accordingly.
| # Start Cassandra cluster (nodes start sequentially due to depends_on) | ||
| echo "Starting 3-node Cassandra cluster..." | ||
| echo " (This takes 2-4 minutes -- each node must join the ring)" | ||
| docker compose up -d | ||
|
|
||
| # Wait for all nodes to be healthy | ||
| for node in cass1 cass2 cass3; do | ||
| echo "" | ||
| echo "Waiting for $node to join the cluster..." | ||
| i=0 | ||
| while [ "$i" -lt 180 ]; do | ||
| i=$((i + 1)) | ||
| if docker exec "$node" cqlsh -e "SELECT now() FROM system.local" > /dev/null 2>&1; then | ||
| echo " $node: ready" | ||
| break | ||
| fi | ||
| if [ "$i" -eq 180 ]; then | ||
| echo "ERROR: $node did not start within 3 minutes." | ||
| docker compose logs "$node" | tail -20 | ||
| exit 1 | ||
| fi | ||
| sleep 1 | ||
| done | ||
| done | ||
|
|
||
| # Check cluster status | ||
| echo "" | ||
| echo "Cluster status:" | ||
| docker exec cass1 nodetool status 2>/dev/null | grep -E "^(UN|DN|Datacenter|=)" || true |
There was a problem hiding this comment.
Wait for ring convergence, not just cqlsh readiness.
This loop only proves each node is accepting CQL. Cassandra can do that before the node is fully UN in the ring, so the lab can seed against a partially formed cluster, and the 180-second cap is tight on slower or low-memory hosts.
🛠️ Suggested hardening
+# Give slower hosts a little more room.
+startup_timeout_seconds="${CASSANDRA_STARTUP_TIMEOUT_SECONDS:-420}"
+
# Wait for all nodes to be healthy
for node in cass1 cass2 cass3; do
echo ""
echo "Waiting for $node to join the cluster..."
i=0
- while [ "$i" -lt 180 ]; do
+ while [ "$i" -lt "$startup_timeout_seconds" ]; do
i=$((i + 1))
if docker exec "$node" cqlsh -e "SELECT now() FROM system.local" > /dev/null 2>&1; then
echo " $node: ready"
break
fi
- if [ "$i" -eq 180 ]; then
- echo "ERROR: $node did not start within 3 minutes."
+ if [ "$i" -eq "$startup_timeout_seconds" ]; then
+ echo "ERROR: $node did not start within ${startup_timeout_seconds} seconds."
docker compose logs "$node" | tail -20
exit 1
fi
sleep 1
done
done
+
+echo ""
+echo "Waiting for the ring to report 3 UN nodes..."
+i=0
+while [ "$i" -lt "$startup_timeout_seconds" ]; do
+ i=$((i + 1))
+ if [ "$(docker exec cass1 nodetool status 2>/dev/null | grep -c '^UN')" -eq 3 ]; then
+ break
+ fi
+ if [ "$i" -eq "$startup_timeout_seconds" ]; then
+ echo "ERROR: Cassandra ring did not stabilize in time."
+ docker exec cass1 nodetool status || true
+ exit 1
+ fi
+ sleep 1
+doneAs per coding guidelines, "Review lab content for technical accuracy, clear step-by-step instructions, and educational value. Flag any steps that may not work in a standard AWS Free Tier environment."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/cassandra/setup.sh` around lines 33 - 61, The current readiness
loop only tests CQL connectivity (docker exec ... cqlsh) which can pass before
nodes have joined the ring; change the readiness check to verify ring
convergence via nodetool status and wait until each node reports "UN" in the
cluster (e.g., use docker exec cassX nodetool status and grep for its
hostname/state or query cass1's nodetool status until all three nodes are UN),
increase the timeout to a safer value (e.g., 300 seconds) and use a slightly
larger sleep interval (e.g., 2–5s) to avoid tight polling; update the loop
around cass1 cass2 cass3 (and any messages) so it breaks only when the node is
UN and on timeout prints the nodetool and container logs for debugging.
| ## Environment Options | ||
|
|
||
| Each lab supports two environments: | ||
|
|
||
| | Environment | What You Need | Setup | | ||
| | --- | --- | --- | | ||
| | **Local** | Docker Desktop | `./setup.sh` in the lab directory | | ||
| | **EC2** | Browser + SSH | Upload `cloudformation.yaml` via AWS Console | | ||
|
|
||
| Run one lab at a time to avoid port conflicts and resource contention. |
There was a problem hiding this comment.
Add EC2 sizing guidance for the multi-node labs.
This table implies every lab is fine on a generic EC2 host, but the MongoDB and Cassandra labs run multiple database nodes and are much heavier than a standard free-tier-style instance. Call out minimum sizing here, or steer low-resource users to local setup / MySQL first.
📝 Suggested fix
| **Local** | Docker Desktop | `./setup.sh` in the lab directory |
| **EC2** | Browser + SSH | Upload `cloudformation.yaml` via AWS Console |
+> Note: the MongoDB and Cassandra labs run multiple database nodes and typically need more memory than a standard AWS Free Tier-style instance.
+> If you're on a low-resource host, prefer local Docker or start with the MySQL lab first.
+
Run one lab at a time to avoid port conflicts and resource contention.As per coding guidelines, "Review lab content for technical accuracy, clear step-by-step instructions, and educational value. Flag any steps that may not work in a standard AWS Free Tier environment."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/LAB.md` around lines 54 - 63, The "Environment Options" section
and its table (headers "Environment Options", rows "Local" and "EC2", and
artifacts like `./setup.sh` and `cloudformation.yaml`) needs EC2 sizing
guidance: update this section to call out that multi-node labs (MongoDB,
Cassandra) require larger instances (recommend minimum t3.medium/t3.large or
m5.large with 2+ vCPUs and >=4–8GB RAM and adequate EBS) or advise users on
alternatives (use Local/Docker or run single-node MySQL first) and add a short
note directing low-resource users to the Local setup or to follow smaller
single-node lab options; keep the guidance brief and place it directly below the
existing table.
| docker exec mongo1 mongosh --quiet --file /tmp/rs-init.js 2>/dev/null || true | ||
|
|
||
| # Wait for replica set to stabilize | ||
| echo " Waiting for primary election..." | ||
| i=0 | ||
| while [ "$i" -lt 30 ]; do | ||
| i=$((i + 1)) | ||
| if docker exec mongo1 mongosh --quiet --eval "rs.status().members.find(m => m.stateStr === 'PRIMARY') ? 'ok' : ''" 2>/dev/null | grep -q "ok"; then | ||
| break | ||
| fi | ||
| sleep 1 | ||
| done |
There was a problem hiding this comment.
Abort if the replica set never elects a PRIMARY.
If replica-set init never succeeds, this loop just falls through and the script still prints Environment ready. That leaves students in a broken lab with no clear failure.
🛠️ Suggested timeout handling
docker exec mongo1 mongosh --quiet --file /tmp/rs-init.js 2>/dev/null || true
# Wait for replica set to stabilize
echo " Waiting for primary election..."
i=0
+primary_ready=0
while [ "$i" -lt 30 ]; do
i=$((i + 1))
if docker exec mongo1 mongosh --quiet --eval "rs.status().members.find(m => m.stateStr === 'PRIMARY') ? 'ok' : ''" 2>/dev/null | grep -q "ok"; then
+ primary_ready=1
break
fi
sleep 1
done
+
+if [ "$primary_ready" -ne 1 ]; then
+ echo "ERROR: Timed out waiting for primary election."
+ docker compose logs mongo1 mongo2 mongo3
+ exit 1
+fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| docker exec mongo1 mongosh --quiet --file /tmp/rs-init.js 2>/dev/null || true | |
| # Wait for replica set to stabilize | |
| echo " Waiting for primary election..." | |
| i=0 | |
| while [ "$i" -lt 30 ]; do | |
| i=$((i + 1)) | |
| if docker exec mongo1 mongosh --quiet --eval "rs.status().members.find(m => m.stateStr === 'PRIMARY') ? 'ok' : ''" 2>/dev/null | grep -q "ok"; then | |
| break | |
| fi | |
| sleep 1 | |
| done | |
| docker exec mongo1 mongosh --quiet --file /tmp/rs-init.js 2>/dev/null || true | |
| # Wait for replica set to stabilize | |
| echo " Waiting for primary election..." | |
| i=0 | |
| primary_ready=0 | |
| while [ "$i" -lt 30 ]; do | |
| i=$((i + 1)) | |
| if docker exec mongo1 mongosh --quiet --eval "rs.status().members.find(m => m.stateStr === 'PRIMARY') ? 'ok' : ''" 2>/dev/null | grep -q "ok"; then | |
| primary_ready=1 | |
| break | |
| fi | |
| sleep 1 | |
| done | |
| if [ "$primary_ready" -ne 1 ]; then | |
| echo "ERROR: Timed out waiting for primary election." | |
| docker compose logs mongo1 mongo2 mongo3 | |
| exit 1 | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/mongodb/setup.sh` around lines 61 - 72, The replica-set wait
loop (the while loop using i, docker exec mongo1 mongosh --eval
"rs.status()...") currently falls through on timeout; update it to detect
failure after the loop and abort: after the loop, test whether a PRIMARY was
found (re-run the same rs.status() check or set a flag inside the loop), and if
not, print a clear error like "ERROR: replica set primary not elected after
timeout" to stderr and exit non-zero (exit 1) so the script does not print
"Environment ready" on failure.
| LabSecurityGroup: | ||
| Type: AWS::EC2::SecurityGroup | ||
| Properties: | ||
| GroupDescription: Lab 10A MySQL - SSH access | ||
| SecurityGroupIngress: | ||
| - IpProtocol: tcp | ||
| FromPort: 22 | ||
| ToPort: 22 | ||
| CidrIp: 0.0.0.0/0 | ||
| Description: SSH access | ||
|
|
||
| LabInstance: | ||
| Type: AWS::EC2::Instance | ||
| Properties: | ||
| InstanceType: t3.medium | ||
| ImageId: !Ref LatestAmiId | ||
| KeyName: vockey | ||
| SecurityGroupIds: | ||
| - !GetAtt LabSecurityGroup.GroupId | ||
| Tags: | ||
| - Key: Name | ||
| Value: lab10a-mysql | ||
| UserData: | ||
| Fn::Base64: | |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n 10-databases/mysql/cloudformation.yamlRepository: gamaware/system-design-course
Length of output: 3128
🌐 Web query:
AWS Free Tier t3.medium EC2 instance eligible
💡 Result:
No, the AWS Free Tier t3.medium EC2 instance is not eligible. Official AWS documentation and Free Tier pages list eligible instances as t3.micro, t3.small, t4g.micro, t4g.small, c7i-flex.large, and m7i-flex.large for accounts created on or after July 15, 2025 (6-month duration), or t2.micro/t3.micro for older accounts (12-month). t3.medium is not included in any list. You can verify using the AWS CLI: aws ec2 describe-instance-types --filters Name=free-tier-eligible,Values=true --query "InstanceTypes[*].[InstanceType]".
Citations:
- 1: https://aws.amazon.com/free/compute/
- 2: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-free-tier-usage.html
- 3: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstanceTypes.html
Remove hardcoded credential references and parameterize Academy-specific values.
This template hardcodes the credential filename labsuser.pem in the SSH output (lines 79–81), which violates the requirement to use placeholders for sensitive values. Replace with YOUR_LAB_KEY_FILE or similar. Additionally:
- Line 31:
KeyName: vockeyis hardcoded and Academy-specific; parameterize or document that this is Academy-only. - Line 29:
t3.mediumis not AWS Free Tier eligible; uset2.microort3.microto align with lab requirements. - Line 23: SSH access to
0.0.0.0/0should be restricted to a parameter for the student's IP or documented safe range. - Add explicit
MetadataOptionsto enforce IMDSv2 andBlockDeviceMappingsfor encrypted root volume.
As per coding guidelines [0-1][0-9]-*/**: Review lab content for technical accuracy and ensure steps work in a standard AWS Free Tier environment.
🧰 Tools
🪛 Trivy (0.69.3)
[error] 26-38: aws_instance should activate session tokens for Instance Metadata Service.
Instance does not require IMDS access to require a token.
Rule: AWS-0028
Resource: 10-databases/mysql/cloudformation.yaml:26-38
(IaC/AWS)
[error] 23-23: Security groups should not allow unrestricted ingress to SSH or RDP from any IP address.
Security group rule allows unrestricted ingress from any IP address.
Rule: AWS-0107
Resource: LabSecurityGroup
(IaC/AWS)
[error] 26-38: Instance with unencrypted block device.
Root block device is not encrypted.
Rule: AWS-0131
Resource: 10-databases/mysql/cloudformation.yaml:26-38
(IaC/AWS)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/mysql/cloudformation.yaml` around lines 15 - 38, The template
currently hardcodes sensitive/academy-specific values and insecure defaults:
replace the SSH output reference to "labsuser.pem" with a placeholder like
"YOUR_LAB_KEY_FILE"; make KeyName in the LabInstance properties parameterized
(replace the literal vockey with a Parameters reference) and document it as
Academy-only if necessary; change InstanceType in LabInstance from t3.medium to
a Free Tier eligible type such as t2.micro or t3.micro (parameterize if you want
flexibility); remove the hardcoded SecurityGroupIngress CidrIp of 0.0.0.0/0 and
instead reference a new parameter (e.g., AllowedSSHRange or StudentIpCidr) to
restrict SSH; and add MetadataOptions to LabInstance to require IMDSv2 plus
BlockDeviceMappings to set the root volume to encrypted. Update any Outputs that
mention labsuser.pem to use the YOUR_LAB_KEY_FILE placeholder or the KeyName
parameter.
| | **Replication** | Copying data from a primary to replicas via binary log | | ||
| | **Read replica** | A read-only copy that offloads SELECT queries from the primary | | ||
| | **Replication lag** | Delay between a write on primary and its appearance on replica | | ||
| | **ACID** | Atomicity, Consistency, Isolation, Durability -- transaction guarantees | | ||
| | **Atomicity** | A transaction either fully succeeds (COMMIT) or fully fails (ROLLBACK) | | ||
| | **BASE** | Basically Available, Soft state, Eventually consistent -- NoSQL trade-off | |
There was a problem hiding this comment.
Please avoid teaching “NoSQL = BASE, not ACID.”
This module includes MongoDB, which supports ACID transactions, so framing BASE as the NoSQL trade-off is misleading. Qualify it as a spectrum of consistency/transaction guarantees instead. As per coding guidelines, [0-1][0-9]-*/**: Review lab content for technical accuracy, clear step-by-step instructions, and educational value.
Also applies to: 212-218
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/README.md` around lines 175 - 180, The README currently presents
"BASE" as the definitive NoSQL trade-off and contrasts it with "ACID"; update
the wording around the "BASE" and "ACID" bullets to qualify
consistency/transaction guarantees as a spectrum rather than a binary (e.g.,
keep the "ACID" bullet but remove or rephrase "NoSQL = BASE, not ACID"),
explicitly note that some NoSQL systems like MongoDB can support ACID
transactions, and adjust corresponding text in the nearby section referenced
(lines around "BASE" and also the content at 212-218) so the materials
accurately state that different databases provide differing levels of
atomicity/consistency/isolation/durability rather than implying all NoSQL
systems lack ACID.
| FROM python:3.13-slim | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| RUN pip install --no-cache-dir pymysql==1.1.1 cryptography==44.0.3 | ||
|
|
||
| COPY . . | ||
|
|
||
| EXPOSE 8080 | ||
|
|
||
| CMD ["python", "server.py"] |
There was a problem hiding this comment.
Drop root before starting the HTTP server.
This image currently runs a publicly exposed process as root. Given 10-databases/visualizer/server.py:495-506 binds 0.0.0.0:8080 and 10-databases/visualizer/docker-compose.yml:49-59 publishes that port, any bug in the request handler lands with full container privileges.
🔒 Proposed hardening
FROM python:3.13-slim
WORKDIR /app
RUN pip install --no-cache-dir pymysql==1.1.1 cryptography==44.0.3
+
+RUN useradd --system --create-home appuser
COPY . .
+
+RUN chown -R appuser:appuser /app
+USER appuser
EXPOSE 8080
CMD ["python", "server.py"]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| FROM python:3.13-slim | |
| WORKDIR /app | |
| RUN pip install --no-cache-dir pymysql==1.1.1 cryptography==44.0.3 | |
| COPY . . | |
| EXPOSE 8080 | |
| CMD ["python", "server.py"] | |
| FROM python:3.13-slim | |
| WORKDIR /app | |
| RUN pip install --no-cache-dir pymysql==1.1.1 cryptography==44.0.3 | |
| RUN useradd --system --create-home appuser | |
| COPY . . | |
| RUN chown -R appuser:appuser /app | |
| USER appuser | |
| EXPOSE 8080 | |
| CMD ["python", "server.py"] |
🧰 Tools
🪛 Trivy (0.69.3)
[error] 1-1: Image user should not be 'root'
Specify at least 1 USER command in Dockerfile with non-root user as argument
Rule: DS-0002
(IaC/Dockerfile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/Dockerfile` around lines 1 - 11, The container runs
the HTTP server as root (server.py binds 0.0.0.0:8080 and Dockerfile uses root
at runtime); create a non-root user and drop privileges before starting the app:
add steps to create a dedicated user/group (e.g., "app"), chown the application
files and any writable dirs, switch to that user with the Dockerfile USER
directive before CMD, and ensure the server process runs under that user rather
than root so the exposed service is not running with container root privileges.
| docker exec mysql-replica mysql -u root -prootpass \ | ||
| -e "STOP REPLICA; CHANGE REPLICATION SOURCE TO SOURCE_HOST='mysql-primary', SOURCE_USER='repl', SOURCE_PASSWORD='replpass', SOURCE_AUTO_POSITION=1, GET_SOURCE_PUBLIC_KEY=1; START REPLICA;" \ | ||
| 2>/dev/null | ||
|
|
||
| docker exec mysql-replica mysql -u root -prootpass \ | ||
| -e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;" \ | ||
| 2>/dev/null | ||
|
|
||
| echo " Replication configured." |
There was a problem hiding this comment.
Verify replication before declaring the lab ready.
START REPLICA only launches async workers. If the source config is wrong or the replica is still coming up, this script still says “Visualizer ready” even though the replication tab is broken.
🔁 Suggested health check before the success banner
docker exec mysql-replica mysql -u root -prootpass \
-e "STOP REPLICA; CHANGE REPLICATION SOURCE TO SOURCE_HOST='mysql-primary', SOURCE_USER='repl', SOURCE_PASSWORD='replpass', SOURCE_AUTO_POSITION=1, GET_SOURCE_PUBLIC_KEY=1; START REPLICA;" \
2>/dev/null
+echo " Waiting for replica threads..."
+i=0
+replication_ready=0
+while [ "$i" -lt 30 ]; do
+ i=$((i + 1))
+ replica_status=$(docker exec mysql-replica mysql -u root -prootpass -e "SHOW REPLICA STATUS\\G" 2>/dev/null || true)
+ if grep -q "Replica_IO_Running: Yes" <<<"$replica_status" \
+ && grep -q "Replica_SQL_Running: Yes" <<<"$replica_status"; then
+ replication_ready=1
+ break
+ fi
+ sleep 1
+done
+
+if [ "$replication_ready" -ne 1 ]; then
+ echo "ERROR: Replication did not become healthy."
+ exit 1
+fi
+
docker exec mysql-replica mysql -u root -prootpass \
-e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;" \
2>/dev/null📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| docker exec mysql-replica mysql -u root -prootpass \ | |
| -e "STOP REPLICA; CHANGE REPLICATION SOURCE TO SOURCE_HOST='mysql-primary', SOURCE_USER='repl', SOURCE_PASSWORD='replpass', SOURCE_AUTO_POSITION=1, GET_SOURCE_PUBLIC_KEY=1; START REPLICA;" \ | |
| 2>/dev/null | |
| docker exec mysql-replica mysql -u root -prootpass \ | |
| -e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;" \ | |
| 2>/dev/null | |
| echo " Replication configured." | |
| docker exec mysql-replica mysql -u root -prootpass \ | |
| -e "STOP REPLICA; CHANGE REPLICATION SOURCE TO SOURCE_HOST='mysql-primary', SOURCE_USER='repl', SOURCE_PASSWORD='replpass', SOURCE_AUTO_POSITION=1, GET_SOURCE_PUBLIC_KEY=1; START REPLICA;" \ | |
| 2>/dev/null | |
| echo " Waiting for replica threads..." | |
| i=0 | |
| replication_ready=0 | |
| while [ "$i" -lt 30 ]; do | |
| i=$((i + 1)) | |
| replica_status=$(docker exec mysql-replica mysql -u root -prootpass -e "SHOW REPLICA STATUS\\G" 2>/dev/null || true) | |
| if grep -q "Replica_IO_Running: Yes" <<<"$replica_status" \ | |
| && grep -q "Replica_SQL_Running: Yes" <<<"$replica_status"; then | |
| replication_ready=1 | |
| break | |
| fi | |
| sleep 1 | |
| done | |
| if [ "$replication_ready" -ne 1 ]; then | |
| echo "ERROR: Replication did not become healthy." | |
| exit 1 | |
| fi | |
| docker exec mysql-replica mysql -u root -prootpass \ | |
| -e "SET GLOBAL read_only = ON; SET GLOBAL super_read_only = ON;" \ | |
| 2>/dev/null | |
| echo " Replication configured." |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/setup.sh` around lines 48 - 56, The script currently
runs START REPLICA and prints "Replication configured." without verifying
replication state; modify the setup.sh sequence after START REPLICA to poll the
replica status (using mysql -e "SHOW REPLICA STATUS\G" or SHOW SLAVE STATUS\G)
and check fields Replica_IO_Running and Replica_SQL_Running are "Yes" and
Last_SQL_Error is empty (or Seconds_Behind_Master is not NULL) with a short
retry loop and timeout; only print "Replication configured." when the checks
pass, and print/log a descriptive error and exit nonzero if the timeout is
reached or errors are reported by the replica.
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (2)
.secrets.baseline (1)
221-255: Triaged demo credentials should be allowlisted before baseline refresh.The new findings on Line 221, Line 230, Line 239, and Line 248 appear to be intentional lab/demo credentials. Keeping them as recurring baseline entries increases scanner noise and can hide real leaks. Please add explicit allowlist annotations (or scoped excludes for these lab fixtures), then regenerate the baseline.
Based on learnings: Update
detect-secretsbaseline for false positives viadetect-secrets scan --update .secrets.baseline.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.secrets.baseline around lines 221 - 255, The .secrets.baseline contains intentional demo credentials for the entries referencing "10-databases/mysql/docker-compose.yml", "10-databases/mysql/setup.sh", "10-databases/visualizer/docker-compose.yml", and "10-databases/visualizer/setup.sh" (see the hashed_secret values e63a0ab2f8e7ecde486b42ebfec16d4434840af4 and 7667ad761bd61125e6eb84aafc63fe3bb914ab5b); update the baseline to explicitly allowlist/scoped-exclude these lab/demo fixtures (or add allowlist annotations for those specific hashed_secret entries) and then regenerate the baseline by running detect-secrets scan --update .secrets.baseline so future scans treat them as known false positives.10-databases/visualizer/app.js (1)
81-81: Clear the remaining ESLint warnings.
pollTimeris never read, andconsoleHistorynever needs reassignment. Dropping the unused interval handle or actually clearing it on teardown, plus changingconsoleHistorytoconst, will keep this file warning-free.Also applies to: 468-468
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@10-databases/visualizer/app.js` at line 81, pollTimer is declared but never read and consoleHistory is reassigned unnecessarily; change consoleHistory to a const and either remove the unused pollTimer declaration or actually store the interval ID into pollTimer when calling setInterval and clear it in the component/page teardown (e.g., in the cleanup or unload handler) so the interval gets cleared; update references to the interval creation/cleanup around pollTimer and the consoleHistory declaration (use const consoleHistory) to eliminate the ESLint warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@10-databases/visualizer/app.js`:
- Around line 87-99: The async helpers apiPost and apiGet and the caller
doExplain fail to unwind UI state on errors and dereference result.steps without
guarding result.error; wrap fetch + res.json calls in try/catch (or add .catch)
and ensure a finally block resets animating and re-enables buttons (the same
cleanup currently only on the happy path), and in doExplain check result.error
before accessing result.steps (handle/log error and return early) so backend
errors or rejected fetches don’t leave the UI stuck; update apiPost/apiGet to
propagate errors cleanly and callers (doExplain and the other flows mentioned)
to handle those errors and run the shared cleanup.
- Around line 156-171: The logEvent function and the sidebar/course-list
renderer currently assign user-derived strings directly into element.innerHTML
(e.g., entry.innerHTML using step.action, step.target, step.result and sidebar
course entries); replace these sinks by creating each span with
document.createElement, set className (e.g.,
'log-time','log-action','log-target','log-result','log-latency'), assign
textContent with the untrusted values (e.g., step.action, step.target,
step.result, `${step.latency_ms.toFixed(1)}ms`), and append them to the entry;
apply the same change to the course list rendering code so all dynamic content
is inserted via textContent/node creation instead of innerHTML.
In `@10-databases/visualizer/server.py`:
- Around line 152-169: The DELETE branch can be a no-op but the code still runs
the UPDATE that decrements courses.enrolled, causing counts to diverge; modify
the logic around the DELETE (the cur.execute("DELETE FROM enrollments ...") and
subsequent affected = cur.rowcount) to check affected == 0 and if so skip the
downstream UPDATE and any insertion into the target course (or perform a
transaction rollback/return early). Concretely: after the DELETE and reading
affected, if affected == 0 append a step_entry indicating "NO ROWS - no-op" and
return/abort the transfer before executing cur.execute("UPDATE courses SET
enrolled = enrolled - 1 ...") or any insert into the target, ensuring the
operations involving from_id, student_id, seq, steps and step_entry are not
performed when no enrollment was removed.
- Around line 336-352: db_reset currently runs on an autocommit connection and
only trims extra rows, leaving the DB in a partial state on errors and unable to
fully restore the original seed rows or access_log index; change db_reset to
open a transactional connection (use get_conn(PRIMARY_HOST) with autocommit
disabled or begin a transaction), perform all DELETE/UPDATE/INSERT/DROP/CREATE
operations inside a try block, and on success commit once at the end and on any
exception rollback and re-raise or return an error; also ensure the handler
restores the exact seed baseline (recreate missing seed rows and recreate
access_log/index if needed rather than only suppressing OperationalError) so
db_reset is a true atomic revert (refer to db_reset, get_conn, the cursor block,
and the contextlib.suppress(pymysql.err.OperationalError) usage to locate where
to implement the transactional commit/rollback and stronger restoration logic).
- Around line 479-484: The current _serve_static method uses startswith to check
the resolved file path against the current working directory which is unsafe;
change the check to compute the realpath of the base directory (e.g., base =
os.path.realpath(os.getcwd())) and use os.path.commonpath([base, resolved]) ==
base to ensure the resolved path is within the base jail before proceeding to
open or serve the file; update the error branch to return a 403 if the
commonpath check fails.
- Around line 19-25: The code currently opens a fresh autocommit connection per
request in get_conn() and sql_exec(), so transaction verbs (START, COMMIT,
ROLLBACK) in ALLOWED_SQL can't work; either remove those verbs from ALLOWED_SQL
or implement per-client session persistence: modify get_conn()/sql_exec() to
reuse a persistent connection per client (e.g., key by session cookie or client
id) stored in an in-memory dict, create connections with autocommit=False, route
SQL requests for a given client to that stored connection, expose a cleanup to
close the connection on session end, and ensure sql_exec() applies
commit/rollback on the stored connection when receiving COMMIT/ROLLBACK
statements; reference functions/variables: get_conn, sql_exec, ALLOWED_SQL.
---
Nitpick comments:
In @.secrets.baseline:
- Around line 221-255: The .secrets.baseline contains intentional demo
credentials for the entries referencing "10-databases/mysql/docker-compose.yml",
"10-databases/mysql/setup.sh", "10-databases/visualizer/docker-compose.yml", and
"10-databases/visualizer/setup.sh" (see the hashed_secret values
e63a0ab2f8e7ecde486b42ebfec16d4434840af4 and
7667ad761bd61125e6eb84aafc63fe3bb914ab5b); update the baseline to explicitly
allowlist/scoped-exclude these lab/demo fixtures (or add allowlist annotations
for those specific hashed_secret entries) and then regenerate the baseline by
running detect-secrets scan --update .secrets.baseline so future scans treat
them as known false positives.
In `@10-databases/visualizer/app.js`:
- Line 81: pollTimer is declared but never read and consoleHistory is reassigned
unnecessarily; change consoleHistory to a const and either remove the unused
pollTimer declaration or actually store the interval ID into pollTimer when
calling setInterval and clear it in the component/page teardown (e.g., in the
cleanup or unload handler) so the interval gets cleared; update references to
the interval creation/cleanup around pollTimer and the consoleHistory
declaration (use const consoleHistory) to eliminate the ESLint warnings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3fb13917-2313-4fb9-8cce-36f05e922c4c
📒 Files selected for processing (7)
.secrets.baseline10-databases/mongodb/init/rs-init.js10-databases/mongodb/setup.sh10-databases/visualizer/app.js10-databases/visualizer/server.py10-databases/visualizer/setup.sheslint.config.js
✅ Files skipped from review due to trivial changes (2)
- eslint.config.js
- 10-databases/mongodb/init/rs-init.js
🚧 Files skipped from review as they are similar to previous changes (2)
- 10-databases/mongodb/setup.sh
- 10-databases/visualizer/setup.sh
| async function apiPost(url, body) { | ||
| const res = await fetch(url, { | ||
| method: 'POST', | ||
| headers: { 'Content-Type': 'application/json' }, | ||
| body: JSON.stringify(body), | ||
| }); | ||
| return res.json(); | ||
| } | ||
|
|
||
| async function apiGet(url) { | ||
| const res = await fetch(url); | ||
| return res.json(); | ||
| } |
There was a problem hiding this comment.
Always unwind UI state in the async flows.
A rejected fetch()/res.json() leaves animating = true and all buttons disabled because cleanup only runs on the happy path. doExplain() also skips the result.error guard and dereferences result.steps unconditionally, so a normal backend error crashes the visualizer before it can recover.
As per coding guidelines, review lab content for technical accuracy, clear step-by-step instructions, and educational value.
Also applies to: 263-311, 315-376, 380-416
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/app.js` around lines 87 - 99, The async helpers
apiPost and apiGet and the caller doExplain fail to unwind UI state on errors
and dereference result.steps without guarding result.error; wrap fetch +
res.json calls in try/catch (or add .catch) and ensure a finally block resets
animating and re-enables buttons (the same cleanup currently only on the happy
path), and in doExplain check result.error before accessing result.steps
(handle/log error and return early) so backend errors or rejected fetches don’t
leave the UI stuck; update apiPost/apiGet to propagate errors cleanly and
callers (doExplain and the other flows mentioned) to handle those errors and run
the shared cleanup.
| function logEvent(step) { | ||
| const log = $('#event-log'); | ||
| const entry = document.createElement('div'); | ||
| entry.className = 'log-entry'; | ||
|
|
||
| const time = new Date().toLocaleTimeString('en-US', { hour12: false }); | ||
| const targetCls = step.target === 'primary' ? 'primary' : 'replica'; | ||
| const resultCls = step.result === 'OK' || step.result === 'FOUND' || | ||
| step.result === 'COMMITTED' ? 'ok' : 'error'; | ||
|
|
||
| entry.innerHTML = | ||
| `<span class="log-time">${time}</span>` + | ||
| `<span class="log-action">${step.action}</span>` + | ||
| `<span class="log-target ${targetCls}">${step.target}</span>` + | ||
| `<span class="log-result ${resultCls}">${step.result}</span>` + | ||
| `<span class="log-latency">${step.latency_ms.toFixed(1)}ms</span>`; |
There was a problem hiding this comment.
Replace the dynamic innerHTML sinks.
logEvent() and the course list both render dynamic strings through innerHTML. Here, Line 540 derives step.action from raw SQL input, and the sidebar renders DB-backed values, so a learner can inject markup back into the page through the built-in console. Build those nodes with textContent instead.
Also applies to: 245-250, 539-543
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/app.js` around lines 156 - 171, The logEvent function
and the sidebar/course-list renderer currently assign user-derived strings
directly into element.innerHTML (e.g., entry.innerHTML using step.action,
step.target, step.result and sidebar course entries); replace these sinks by
creating each span with document.createElement, set className (e.g.,
'log-time','log-action','log-target','log-result','log-latency'), assign
textContent with the untrusted values (e.g., step.action, step.target,
step.result, `${step.latency_ms.toFixed(1)}ms`), and append them to the entry;
apply the same change to the course list rendering code so all dynamic content
is inserted via textContent/node creation instead of innerHTML.
| def get_conn(host): | ||
| """Return a MySQL connection to the specified host.""" | ||
| return pymysql.connect( | ||
| host=host, user=MYSQL_USER, password=MYSQL_PASS, | ||
| database=MYSQL_DB, cursorclass=pymysql.cursors.DictCursor, | ||
| autocommit=True, | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd server.pyRepository: gamaware/system-design-course
Length of output: 106
🏁 Script executed:
cat -n 10-databases/visualizer/server.py | head -100Repository: gamaware/system-design-course
Length of output: 3991
🏁 Script executed:
wc -l 10-databases/visualizer/server.pyRepository: gamaware/system-design-course
Length of output: 110
🏁 Script executed:
cat -n 10-databases/visualizer/server.py | sed -n '350,410p'Repository: gamaware/system-design-course
Length of output: 2618
🏁 Script executed:
cat -n 10-databases/visualizer/server.py | sed -n '200,360p'Repository: gamaware/system-design-course
Length of output: 7065
Remove transaction verbs from ALLOWED_SQL or implement per-client session persistence.
sql_exec() creates a fresh connection per request and get_conn() enables autocommit=True, which means START, COMMIT, and ROLLBACK cannot function as learners would expect. Each statement commits immediately, and statements cannot span requests. Learners will be confused when these commands don't behave like transactions. Either remove the transaction verbs (lines 360) or persist a session per client.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/server.py` around lines 19 - 25, The code currently
opens a fresh autocommit connection per request in get_conn() and sql_exec(), so
transaction verbs (START, COMMIT, ROLLBACK) in ALLOWED_SQL can't work; either
remove those verbs from ALLOWED_SQL or implement per-client session persistence:
modify get_conn()/sql_exec() to reuse a persistent connection per client (e.g.,
key by session cookie or client id) stored in an in-memory dict, create
connections with autocommit=False, route SQL requests for a given client to that
stored connection, expose a cleanup to close the connection on session end, and
ensure sql_exec() applies commit/rollback on the stored connection when
receiving COMMIT/ROLLBACK statements; reference functions/variables: get_conn,
sql_exec, ALLOWED_SQL.
| # Delete old enrollment | ||
| t0 = time.perf_counter() | ||
| cur.execute("DELETE FROM enrollments WHERE student_id = %s AND course_id = %s", | ||
| (student_id, from_id)) | ||
| affected = cur.rowcount | ||
| t1 = time.perf_counter() | ||
| steps.append(step_entry(seq, f"DELETE enrollment ({from_course})", "primary", | ||
| "OK" if affected > 0 else "NO ROWS", | ||
| (t1 - t0) * 1000, {"rows_affected": affected})) | ||
| seq += 1 | ||
|
|
||
| # Update from-course count | ||
| t0 = time.perf_counter() | ||
| cur.execute("UPDATE courses SET enrolled = enrolled - 1 WHERE course_id = %s AND enrolled > 0", | ||
| (from_id,)) | ||
| t1 = time.perf_counter() | ||
| steps.append(step_entry(seq, f"UPDATE {from_course} enrolled-1", "primary", "OK", | ||
| (t1 - t0) * 1000)) |
There was a problem hiding this comment.
Rollback when the source enrollment is missing.
If the DELETE touches 0 rows, the code still decrements courses.enrolled and may then insert into the target course, so courses.enrolled can diverge from the actual enrollments table. Treat that branch as a no-op or rollback before updating either course count.
As per coding guidelines, review lab content for technical accuracy, clear step-by-step instructions, and educational value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/server.py` around lines 152 - 169, The DELETE branch
can be a no-op but the code still runs the UPDATE that decrements
courses.enrolled, causing counts to diverge; modify the logic around the DELETE
(the cur.execute("DELETE FROM enrollments ...") and subsequent affected =
cur.rowcount) to check affected == 0 and if so skip the downstream UPDATE and
any insertion into the target course (or perform a transaction rollback/return
early). Concretely: after the DELETE and reading affected, if affected == 0
append a step_entry indicating "NO ROWS - no-op" and return/abort the transfer
before executing cur.execute("UPDATE courses SET enrolled = enrolled - 1 ...")
or any insert into the target, ensuring the operations involving from_id,
student_id, seq, steps and step_entry are not performed when no enrollment was
removed.
| def db_reset(_body): | ||
| """Reset database to initial state.""" | ||
| conn = get_conn(PRIMARY_HOST) | ||
| try: | ||
| with conn.cursor() as cur: | ||
| cur.execute("DELETE FROM enrollments") | ||
| cur.execute("DELETE FROM students WHERE student_id > 10") | ||
| cur.execute("DELETE FROM courses WHERE code NOT IN ('CS101','CS201','MATH101','PHYS101')") | ||
| cur.execute("""UPDATE courses SET enrolled = CASE code | ||
| WHEN 'CS101' THEN 4 WHEN 'CS201' THEN 2 | ||
| WHEN 'MATH101' THEN 2 WHEN 'PHYS101' THEN 2 END""") | ||
| # Re-insert original enrollments | ||
| cur.execute("""INSERT INTO enrollments (student_id, course_id) VALUES | ||
| (1,1),(3,1),(5,1),(8,1),(1,2),(3,2),(2,3),(6,3),(4,4),(9,4)""") | ||
| with contextlib.suppress(pymysql.err.OperationalError): | ||
| cur.execute("DROP INDEX idx_student_resource ON access_log") | ||
| return {"result": "OK", "message": "Database reset to initial state"} |
There was a problem hiding this comment.
db_reset() needs to be a real, atomic reset.
This runs on an autocommit connection, so any mid-reset failure leaves the lab partially mutated. It also only trims extra rows; if the SQL console removed one of the seed rows from 10-databases/mysql/init/primary-init.sql:47-60 or changed access_log, this handler cannot restore the baseline state that the exercises depend on.
As per coding guidelines, review lab content for technical accuracy, clear step-by-step instructions, and educational value.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/server.py` around lines 336 - 352, db_reset currently
runs on an autocommit connection and only trims extra rows, leaving the DB in a
partial state on errors and unable to fully restore the original seed rows or
access_log index; change db_reset to open a transactional connection (use
get_conn(PRIMARY_HOST) with autocommit disabled or begin a transaction), perform
all DELETE/UPDATE/INSERT/DROP/CREATE operations inside a try block, and on
success commit once at the end and on any exception rollback and re-raise or
return an error; also ensure the handler restores the exact seed baseline
(recreate missing seed rows and recreate access_log/index if needed rather than
only suppressing OperationalError) so db_reset is a true atomic revert (refer to
db_reset, get_conn, the cursor block, and the
contextlib.suppress(pymysql.err.OperationalError) usage to locate where to
implement the transactional commit/rollback and stronger restoration logic).
| def _serve_static(self, path): | ||
| filename = path.lstrip("/") | ||
| resolved = os.path.realpath(filename) | ||
| if not resolved.startswith(os.path.realpath(os.getcwd())): | ||
| self._send_json({"error": "Forbidden"}, 403) | ||
| return |
There was a problem hiding this comment.
Use commonpath() for the static-file jail.
startswith() on real paths is only a prefix check, so a sibling path such as /app2/... still passes when the base directory is /app. Compare the resolved file path with the base directory via os.path.commonpath() before opening the file.
🔒 Proposed fix
def _serve_static(self, path):
- filename = path.lstrip("/")
- resolved = os.path.realpath(filename)
- if not resolved.startswith(os.path.realpath(os.getcwd())):
+ base = os.path.realpath(os.getcwd())
+ filename = path.lstrip("/")
+ resolved = os.path.realpath(os.path.join(base, filename))
+ if os.path.commonpath([base, resolved]) != base:
self._send_json({"error": "Forbidden"}, 403)
return🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@10-databases/visualizer/server.py` around lines 479 - 484, The current
_serve_static method uses startswith to check the resolved file path against the
current working directory which is unsafe; change the check to compute the
realpath of the base directory (e.g., base = os.path.realpath(os.getcwd())) and
use os.path.commonpath([base, resolved]) == base to ensure the resolved path is
within the base jail before proceeding to open or serve the file; update the
error branch to return a 403 if the commonpath check fails.
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. Reviewed 32 files across 5 dimensions: CLAUDE.md adherence, shallow bug scan, git history context, previous PR comment patterns, and code comment compliance. Six candidate issues were identified and scored; all fell below the confidence threshold (highest was 50/100 for the ESLint MongoDB shell exclusion, which is technically justified by structurally incompatible syntax). |
Summary
with animated SVG diagrams, real-time latency, built-in SQL console, and EC2 support
read/write concerns, denormalization), Cassandra (multi-node ring, tunable consistency,
partition keys)
and step-by-step instructions following the 13-section lab template
Test plan
cd 10-databases/visualizer && ./setup.shand open http://localhost:8081cd 10-databases/mysql && ./setup.shand verify replication workscd 10-databases/mongodb && ./setup.shand verify replica set formscd 10-databases/cassandra && ./setup.shand verify 3-node ringvisualizer/cloudformation.yamlon EC2 and access via public IPSummary by CodeRabbit
New Features
Documentation
Chores