Issue Description
In sigflow/parsers/binary.py, BinaryFrameParser computes payload_end from the length field in the frame header. If checksum validation fails, the parser only emits a warning and still advances the stream pointer (offset) to payload_end. If the length field itself is what was corrupted, that advance is arbitrary and can land far from the next real frame boundary.
Root Cause Analysis
The parser unconditionally trusts the length field when updating the stream state:
offset = payload_end
When verify_checksum fails, the integrity of the whole header, including length, is already in doubt. By accepting the unverified length anyway, the parser can jump to an incorrect offset. If the corrupted length is large enough, the condition payload_end > len(data) becomes true inside the loop and the parser breaks out, ending parsing early. The result is not one skipped frame but the loss of every frame after the corrupted one.
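One possible way to avoid trusting an unverified length is to resynchronize on a frame marker instead of jumping to payload_end. The sketch below is not the sigflow implementation. The frame layout (a 2-byte sync marker, a 16-bit big-endian length, the payload, then a CRC32 over header and payload) and the names MAGIC, build_frame, and parse_frames are all assumptions made for illustration.

```python
import struct
import zlib

MAGIC = b"\xa5\x5a"              # hypothetical sync marker, not sigflow's
HEADER = struct.Struct(">2sH")   # assumed header: marker + payload length
CRC = struct.Struct(">I")        # assumed trailer: CRC32 of header + payload


def build_frame(payload: bytes) -> bytes:
    """Build one frame in the assumed layout (helper for the demo)."""
    header = HEADER.pack(MAGIC, len(payload))
    return header + payload + CRC.pack(zlib.crc32(header + payload))


def parse_frames(data: bytes):
    """Return (payloads, resync_count) without trusting unverified lengths."""
    frames, resyncs = [], 0
    offset = 0
    while offset + HEADER.size + CRC.size <= len(data):
        magic, length = HEADER.unpack_from(data, offset)
        payload_end = offset + HEADER.size + length
        frame_end = payload_end + CRC.size

        # The length is only believed once the whole frame checks out.
        ok = magic == MAGIC and frame_end <= len(data)
        if ok:
            (expected,) = CRC.unpack_from(data, payload_end)
            ok = zlib.crc32(data[offset:payload_end]) == expected

        if ok:
            frames.append(data[offset + HEADER.size:payload_end])
            offset = frame_end
            continue

        # Bad frame: ignore its length and scan for the next marker.
        resyncs += 1
        nxt = data.find(MAGIC, offset + 1)
        if nxt == -1:
            break
        offset = nxt
    return frames, resyncs


stream = bytearray(build_frame(b"one") + build_frame(b"two") + build_frame(b"three"))
struct.pack_into(">H", stream, len(MAGIC), 5000)  # corrupt first length only
recovered, resyncs = parse_frames(bytes(stream))
```

With this approach a corrupted header costs only the frame it belongs to. The trade-off is that a marker byte sequence inside a damaged payload can trigger extra resync attempts, which the checksum then rejects.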
Reproduction Steps
- Create a binary stream containing multiple valid telemetry frames.
- Take the first frame and change its length field to a large integer (for example, 5000) without updating the checksum, simulating bit-flip corruption.
- Run the CLI command: python -m sigflow parse corrupted_stream.sgf
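The first two steps can be scripted along the following lines. This is a sketch only: the real sigflow frame layout is not described in this report, so the layout used here (2-byte marker, 16-bit big-endian length, payload, CRC32) and the helpers build_frame and make_corrupted_stream are hypothetical and would need to be adapted to the actual format.

```python
import struct
import zlib

MAGIC = b"\xa5\x5a"              # hypothetical sync marker
HEADER = struct.Struct(">2sH")   # assumed header: marker + payload length


def build_frame(payload: bytes) -> bytes:
    """Build one valid frame in the assumed layout."""
    header = HEADER.pack(MAGIC, len(payload))
    return header + payload + struct.pack(">I", zlib.crc32(header + payload))


def make_corrupted_stream(payloads, bad_length: int = 5000) -> bytes:
    """Concatenate valid frames, then corrupt the first frame's length."""
    stream = bytearray(b"".join(build_frame(p) for p in payloads))
    # Overwrite only the length field; the checksum is left stale on purpose.
    struct.pack_into(">H", stream, len(MAGIC), bad_length)
    return bytes(stream)


corrupted = make_corrupted_stream([b"alpha", b"beta", b"gamma"])
```

Writing the result to disk, for example with pathlib.Path("corrupted_stream.sgf").write_bytes(corrupted), gives an input file for the CLI command above.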
Observation
The parser reports the checksum mismatch but then jumps forward 5000 bytes. All subsequent valid frames are skipped and do not appear in the output.
Impact Explanation
This weakness causes severe, silent data loss during ingestion. In operational or forensic pipelines where malformed data is expected, even one corrupted frame header can hide every later valid log entry in the stream, or it can crash the ingestion job outright.