Contributor

@gorcha gorcha commented Jan 7, 2026

The SAV writer functions that emit value labels for long string variables assume that the variable's storage width is wider than the value label keys. This is not checked during file writing, so ReadStat can silently produce invalid files without raising an error on write (see tidyverse/haven#537).
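
For illustration, a write-time check of this kind might look roughly like the sketch below. It is not ReadStat's code: `check_label_keys` and the `label_key_status_t` codes are hypothetical names, and it assumes keys are NUL-terminated byte strings that are valid as long as they fit within the given width.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not ReadStat's error enum. */
typedef enum {
    LABEL_KEY_OK = 0,
    LABEL_KEY_TOO_LONG = 1
} label_key_status_t;

/* Return LABEL_KEY_TOO_LONG if any key needs more than `width` bytes.
 * Assumption: a key that exactly fills the width is still acceptable. */
static label_key_status_t check_label_keys(const char *const *keys,
                                           size_t n_keys, size_t width) {
    for (size_t i = 0; i < n_keys; i++) {
        if (strlen(keys[i]) > width) {
            return LABEL_KEY_TOO_LONG;
        }
    }
    return LABEL_KEY_OK;
}
```

Running a check like this before any bytes are emitted would let the writer fail with an error instead of producing a file that other readers reject.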

This PR initially added a check for this during writing, including a minor update to the test suite to support tests with a manually specified storage width. While testing, I noticed that PSPP doesn't accept the storage width in these records when it is wider than the user width, and emits a warning like:

warning: `test-550.sav' near offset 0x18c: Ignoring long string value label record for variable a because the record's width (16) does not match the variable's width (10).

I've updated the long string value label writer to use the user width instead of the storage width, and confirmed that files now load successfully in PSPP (and still round-trip successfully in ReadStat).
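
As a rough sketch of the idea (not the actual patch), each label key can be padded to exactly the user width before it is written, so the width recorded in the label record matches the variable's declared width. `pad_label_key` is a hypothetical helper, and space padding is assumed here as the usual convention for SAV string values.

```c
#include <stddef.h>
#include <string.h>

/* Copy `key` into `out`, space-padded to exactly `user_width` bytes.
 * Returns 0 on success, or -1 if the key is longer than `user_width`
 * or `out` cannot hold `user_width` bytes. `out` is not NUL-terminated.
 * Hypothetical helper for illustration; not part of ReadStat. */
static int pad_label_key(const char *key, size_t user_width,
                         char *out, size_t out_size) {
    size_t key_len = strlen(key);
    if (key_len > user_width || user_width > out_size) {
        return -1;
    }
    memcpy(out, key, key_len);
    memset(out + key_len, ' ', user_width - key_len);
    return 0;
}
```

With a user width of 10, as in the warning above, the key `abc` would be written as `abc` followed by seven spaces, and the record's width field would be 10 rather than 16.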

I think this fixes #323, tidyverse/haven#550 and Roche/pyreadstat#264.

@gorcha gorcha changed the title from "Validate long string value label key length against storage width in SAV writer" to "Use user width when writing long string value label records in SAV writer" Jan 7, 2026
@evanmiller evanmiller merged commit 22f28ed into WizardMac:dev Jan 7, 2026
11 of 12 checks passed
@evanmiller
Contributor

Thanks!

@gorcha gorcha deleted the sav-long-val-labels branch January 7, 2026 14:01
Development

Successfully merging this pull request may close these issues.

readstat strips value_labels from str columns
