Skip to content

Fix config persistence: missing commands, delete handling, multipart upload#237

Merged
logicog merged 2 commits into
logicog:mainfrom
Erdnusschokolade:fix/config-persistence
Jun 1, 2026
Merged

Fix config persistence: missing commands, delete handling, multipart upload#237
logicog merged 2 commits into
logicog:mainfrom
Erdnusschokolade:fix/config-persistence

Conversation

@Erdnusschokolade
Copy link
Copy Markdown
Contributor

Base / Compare

  • Base: logicog/RTLPlayground:main
  • Compare: Erdnusschokolade/RTLPlayground:fix/config-persistence

Description

This addresses multiple issues in the "Save to Flash" path that
together cause configuration to be lost or silently mis-saved across
reboots. Closes #218, closes #134.

Root causes

Multipart upload missing filename argument. The previous
form.append("configuration", new Blob(...)) call did not provide a
filename, which the backend multipart parser requires. This was
likely the primary reason saves were unreliable to begin with — the
parser would fail before any data reached the flash.

UI polling not paused during flash write. Flash writes block the
SoC CPU. Polling requests landing during the write caused intermittent
crashes. The fix adds an isSaving lock and clears the poll interval
in sendConfig, restoring polling in the finally block.

Incomplete conf_cmds whitelist. Many persistable CLI commands
were silently dropped during parseConf: syslog, passwd, pvid,
ingress, port name, lag, laghash, isolate, stp, igmp,
mtu, bw, vlan N mgmt, vlan N d. The new whitelist covers all
of these with anchored, precise regex patterns.

VLAN regex blocked named VLANs. The old pattern required digit
port tokens immediately after the VLAN ID, so any name (e.g. vlan 10 iot 1u 2t) failed to match. The new pattern allows an optional name
that must start with a letter, matching the CLI parser's isletter()
behavior.

vlan N d was not persisted. Delete commands were filtered out
entirely, so deleted VLANs reappeared after reboot. parseConf now
intercepts vlan N d before the whitelist test, removes the matching
vlan N ... entry from configuration[], and does not store the
delete command itself. The saved config describes the end state, not
the command sequence.

configuration[] not cleared between saves. Stale entries from
previous interactions in the same session could leak into the saved
config. Cleared at the start of flashSave.

conf_overwrite boundary bug. pvid 1 matched against
pvid 10, pvid 11, etc. Boundary fix using \b in patterns and
adding the trailing space in the startsWith check.

conf_overwrite over-broad vlan pattern. The pattern
/^vlan\s+\d{1,4}\b/ matched both membership and management
entries, so vlan 44 mgmt would silently remove
vlan 44 management 2t 5u and vice versa — locking out the web UI
on next boot. Split into two patterns using negative lookahead so
membership and management entries dedup independently.

Patterns now anchored. All conf_cmds patterns use ^...$ for
full-line match. parseConf normalizes whitespace before testing.

Port range widened. Port tokens now accept two digits, so ports
11 and above are persisted correctly (side fix while reworking the
regex).

Relationship to other PRs

This fix is technically independent of #233 (VLAN overview UI) — it
builds cleanly on current main without any of the overview code.

That said, #233 makes this bug significantly more visible. With the
overview UI, users actively name and manage VLANs through the
interface, so losing those settings on reboot becomes apparent
immediately. While I understand a UI rewrite is in progress (per the
discussion in #219), the underlying persistence bugs affect every
non-trivial configuration today and seem worth fixing in the current
UI rather than waiting for the rewrite.

Credit

A substantial part of the structural rework in config.js and
system.js is ported from @mcaptur's closed PR #219, which @logicog
had already reviewed and positively tested. Specifically taken from
that PR:

  • Multipart upload filename argument ("config.txt")
  • isSaving lock and clearInterval/restore around sendConfig
  • configuration = [] at the start of flashSave
  • await-based merge in flashSave (replacing nested .then())
  • Whitespace normalization in parseConf
  • ^-anchoring strategy for conf_cmds
  • Boundary fix in conf_overwrite (trailing-space startsWith)
  • break after first conf_overwrite match
  • cmd_log_clear after successful save

New in this PR:

  • Expanded conf_cmds whitelist (precise patterns instead of .*
    wildcards) covering syslog, passwd, pvid, ingress, port name, lag,
    laghash, isolate, stp, igmp, mtu, bw, vlan mgmt, vlan delete
  • Named VLAN regex matching the CLI's isletter() semantics
  • vlan N d end-state handling (saved config describes end state
    rather than command sequence)
  • Negative-lookahead split in conf_overwrite to keep VLAN
    membership and VLAN management entries independent
  • \b-based prefix matching in conf_overwrite (replaces the
    earlier trailing \s+ approach which broke the boundary check)
  • Two-digit port range across all patterns

Testing

Tested on KP-9000-6XH-X. Verified positive cases (command set,
"Save settings to flash", reboot, confirmed active and present in
saved config):

  • Named VLANs (vlan 10 iot 1u 2t)
  • vlan N d (delete) — saved config describes end state, no
    delete command stored, no leftover entry
  • vlan N mgmt together with vlan N <members> — both persist
    and remain active after reboot (regression test for the original
    membership/management overwrite scenario)
  • pvid, ingress, port name, passwd, lag, laghash,
    isolate, stp off, igmp on/off, mtu, bw
  • Repeated "Save settings to flash" without reload — no stale
    entries, results consistent

Note: stp on was not exercised in detail. During a quick test,
enabling STP made the switch's management interface unreachable
(traffic between normal ports continued to forward, but the CPU
port appeared not to recover). This looks like a separate STP
implementation issue and is not in scope here — the whitelist entry
stp on/off persists correctly either way.

Verified negative cases (invalid or incomplete commands should not
end up in saved config):

  • vlan alone, pvid alone, xyz 123, stp maybe — all
    correctly dropped by the whitelist

(Side observation during testing: stp maybe is silently
interpreted as "disable" by the CLI parser. Whitelist filtering
prevents this from being persisted, so reboot does not propagate
the unintended disable. Separate issue, not in scope here.)

Multiple related bugs in the Save-to-Flash path:

1. Multipart upload was missing the required filename argument,
   causing the backend parser to fail. Added 'config.txt' to the
   form.append call. This was likely the primary reason Save-to-Flash
   was unreliable.

2. Web UI was not pausing its polling interval during flash write,
   causing CPU contention and intermittent crashes. Added isSaving
   lock and clearInterval before sendConfig.

3. conf_cmds whitelist was incomplete. Added: syslog, passwd, pvid,
   ingress, port name, lag, laghash, isolate, stp, igmp, mtu, bw,
   vlan N mgmt, vlan N d.

4. VLAN regex blocked named VLANs. New pattern allows optional name
   (starts with letter, matching CLI parser semantics).

5. vlan N d (delete) was not persisted. parseConf now removes the
   matching vlan N ... entry from configuration[] when seeing a
   delete command, without storing the delete itself. Result: the
   saved config describes the end state.

6. configuration[] was not cleared between flashSave invocations,
   leading to stale entries from prior interactions.

7. conf_overwrite boundary fix: 'pvid 1' no longer matches 'pvid 10'
   etc. Added trailing space in startsWith check.

8. All conf_cmds patterns now anchored with ^...$ for full-line
   match. parseConf normalizes whitespace before testing.

9. Port range widened to \d{1,2} so ports 11+ are accepted.

Structural fixes (1, 2, 6, 7, 8) ported from mcaptur's closed PR logicog#219;
remaining fixes (3, 4, 5, 9) and overall regex strategy are new.
The pattern /^vlan\s+\d{1,4}\b/ matched both VLAN membership entries
(vlan N <ports>) and management entries (vlan N mgmt), causing them
to overwrite each other in configuration[]. When saving, whichever
form came last in the cmd_log would dedupe the other out of the
final config — resulting in either lost membership or a stale mgmt
setting.

Split into two patterns using negative lookahead:
- /^vlan\s+\d{1,4}\s+mgmt$/      matches only mgmt entries
- /^vlan\s+\d{1,4}(?!\s+mgmt\b)/  matches everything else

Both dedup independently. Discovered via hardware test on 6XH-X
where 'vlan 44 mgmt' silently dropped 'vlan 44 management 2t 5u'
from saved config, locking out web UI on next boot.
@logicog logicog requested a review from vDorst May 24, 2026 12:27
@logicog logicog self-assigned this May 26, 2026
@logicog
Copy link
Copy Markdown
Owner

logicog commented May 26, 2026

I started testing and so far did not find issues, but I want to do more tests later this evening. Also did not really go through the code, yet.

@logicog
Copy link
Copy Markdown
Owner

logicog commented May 26, 2026

UI polling not paused during flash write. Flash writes block the
SoC CPU. Polling requests landing during the write caused intermittent
crashes. The fix adds an isSaving lock and clears the poll interval
in sendConfig, restoring polling in the finally block.

This correctly describes the symptoms, but I believe the reason is different: The issue is that the tiny web-server in the switch can only handle one request at a time. So while the flashing request is ongoing, the web-server cannot answer to other requests, such as the polling. It is of course still true that flashing busy waits for the flash to be complete, but it does not block the CPU.

@logicog
Copy link
Copy Markdown
Owner

logicog commented Jun 1, 2026

Sorry for the delay, finally found time for testing (I hate testing, I love trying to figure out how some piece of hardware works). I am not able to find an issue, and code looks good. So will merge.

@logicog logicog merged commit 5a8a8ed into logicog:main Jun 1, 2026
@logicog
Copy link
Copy Markdown
Owner

logicog commented Jun 1, 2026

Thanks so much for taking care of the issues!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Config not always saved System seetings / Update seetings : Not recording correctly

2 participants