fix: decode APP0 segments using declared segment length#27
Open
byrnehollander wants to merge 1 commit into
Open
fix: decode APP0 segments using declared segment length#27byrnehollander wants to merge 1 commit into
byrnehollander wants to merge 1 commit into
Conversation
21a64b7 to
5f9a11c
Compare
APP0 was decoded as a fixed-size JFIF struct, leaving any remaining segment bytes (thumbnails, JFXX payloads, trailing data such as Apple MPF "AMPF") in the stream and desynchronizing the parser on the next marker. Decode the payload as length-framed instead, matching how the EXIF marker is handled, and re-enable ExifTool.jpg in the snapshot suite.
5f9a11c to
755430d
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix APP0/JFIF decoding so jay-peg consumes the full APP0 segment payload described by the JPEG segment length, instead of assuming the canonical fixed-size JFIF header is the entire segment.
Previously, any APP0 bytes beyond the fixed-size JFIF struct (thumbnails, JFXX payloads, trailing data) were left in the stream, so the parser desynchronized and misread the next marker:
AMPF) in the APP0/JFIF segment before APP1/Exif. The leftover0x414d(AM) was read as the next marker type and decode failed withUnknown version 16717.tests/images/ExifTool.jpgfixture (which carries a JFXX segment) failed the same way withUnknown version 1542, which is why it was excluded from the snapshot suite.Fixes #21. Fixes #9 — the first regression test below embeds the exact segment bytes reported there. Also implements the JFXX parsing requested in #17: the extension code and raw payload are exposed (
extensionCode,data), though thumbnail contents aren't further decoded.This failure mode also surfaces downstream when react-pdf embeds JPEGs: diegomura/react-pdf#2734 reports
Unknown versionerrors thrown from this same decode path for images that render fine elsewhere, which is likely this desync.What changed
exif.jsalready uses for APP1.JFIF\0payloads; expose thumbnail bytes asthumbnailand any extra trailing bytes asdata.JFXX\0payloads (name: "JFXX",extensionCode,data) without desynchronizing the stream.APP0with theiridentifierand rawdata.Invalid APP0 length Nwhen the declared length is smaller than the 2 length bytes themselves, andTruncated APP0 segment: declared length N exceeds remaining bufferwhen it points past the end of the input.Behavior notes
ExifTool.jpgentries.typebehavior is intentionally preserved to avoid output churn: JFIF markers still expose the JFIF version intypeafterdecode()maps the internalversionfield. This quirk is pre-existing onmain(the old struct'sversionfield had the same collision) — happy to address it in a follow-up if you'd like.Tests
ExifTool.jpgis re-enabled in the full snapshot suite (thenonPassingImagesexclusion list is now empty and removed); it decodes through JFXX/APP0 all the way to EOI.BufferandUint8Arrayinputs:JFIF ... AMPFfollowed by APP1/Exif, matching the reported failure mode (segment bytes from Parsing fails with non-standard APP0 #9)Validation
Disclosure
I used GPT-5.5 and Claude (Fable) to help write and test this PR.