
Config::Tiny does not work with UTF8 encoded INI with Byte Order Marks#1

Open
Outworldz wants to merge 98 commits into rsrchboy:master from ronsavage:master

Conversation

Outworldz commented Nov 7, 2021

Nice module, thanks!
I spotted a glitch on Windows 11 using Strawberry Perl.

Edit: This appears to be a common problem in core Perl. I was able to work around it by using File::BOM in my Perl code.

UTF-8 characters in an otherwise plain ASCII file work in Config::Tiny when I read the file as UTF-8 and it has no BOM. If there is a BOM, the read aborts:

Wide character in die at lib/Util.pm line 24.
Syntax error at line 1: '[Data]' at lib/Util.pm line 24.

I have UTF-8 in my INIs, and the files on my servers already start with a BOM. For UTF-8, the BOM is the byte sequence 0xEF, 0xBB, 0xBF. The files were written with the standard .NET UTF-8 encoding (Encoding.UTF8), which adds the BOM to the beginning of the file:

Dim contents = "[Data]" + vbCrLf
Using outputFile = New StreamWriter(New FileStream(_myINI, FileMode.Create, FileAccess.ReadWrite), Encoding.UTF8)
    outputFile.WriteLine(contents)
End Using

I am reading it like this:

use Config::Tiny;

# Create an empty config
my $Config = Config::Tiny->new;
# The next statement is line 24 of lib/Util.pm, where the error is raised
$Config = Config::Tiny->read( 'Settings.ini', 'utf8' ) || die Config::Tiny->errstr;

Error is:

Wide character in die at lib/Util.pm line 24.
Syntax error at line 1: '[Data]' at lib/Util.pm line 24.
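
One plausible reading of this error, offered as a sketch rather than a confirmed diagnosis: once the file is decoded as UTF-8, the BOM becomes the character U+FEFF in front of `[Data]`, so the first line no longer looks like a section header. The same character being echoed in the error message would also account for the "Wide character in die" warning. The pattern `$section_re` below is an assumed stand-in for a section-header check and is not copied from the module's source.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed stand-in for an INI section-header pattern (hypothetical, not
# taken from Config::Tiny itself).
my $section_re = qr/^\s*\[\s*(.+?)\s*\]\s*$/;

# First line of the file as .NET wrote it: the three BOM bytes, then "[Data]".
my $first_line = decode( 'UTF-8', "\xEF\xBB\xBF[Data]" );

# U+FEFF is not whitespace, so the leading \s* cannot absorb it.
my $matches_with_bom = $first_line =~ $section_re ? 1 : 0;

# Remove a leading U+FEFF and try again.
( my $clean_line = $first_line ) =~ s/\A\x{FEFF}//;
my $matches_without_bom = $clean_line =~ $section_re ? 1 : 0;

print "with BOM: $matches_with_bom, without BOM: $matches_without_bom\n";
```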

When I edit and save the file without the BOM, it will work, even with UTF-8 in the file later.

I can cure this by not writing the Byte Order Mark at the beginning of the file:

Dim utf8WithoutBom = New UTF8Encoding(False)
Using outputFile = New StreamWriter(New FileStream(_myINI, FileMode.Create, FileAccess.ReadWrite), utf8WithoutBom)

It's not possible for me to work around this by cleaning the BOMs out of the INI files, because I use other .NET INI NuGet packages to manipulate them and always end up back with the BOM issue.
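
Since the files cannot be kept BOM-free on disk, the BOM can instead be dropped at read time on the Perl side. The following is a minimal sketch using only core modules; `decode_without_bom` and `slurp_ini_without_bom` are hypothetical helper names for illustration and are not part of Config::Tiny.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UTF-8 bytes and drop a leading byte order
# mark (U+FEFF after decoding), if one is present.
sub decode_without_bom {
    my ($bytes) = @_;
    my $text = decode( 'UTF-8', $bytes );
    $text =~ s/\A\x{FEFF}//;
    return $text;
}

# Hypothetical helper: read a whole file as raw bytes and return its
# contents as decoded, BOM-free text.
sub slurp_ini_without_bom {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;    # slurp mode: read the entire file in one go
    my $bytes = <$fh>;
    close $fh;
    return decode_without_bom($bytes);
}
```

The resulting string can then be handed to the parser with `Config::Tiny->read_string( slurp_ini_without_bom('Settings.ini') )` in place of `Config::Tiny->read`, checking `Config::Tiny->errstr` on failure as before.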

tl;dr
Writing the INI as UTF-8 with a BOM fails.
Writing the INI as UTF-8 without a BOM works.
Encoding the INI as ASCII (Encoding.ASCII) and reading it as UTF-8 works.
Deleting the BOM from a UTF-8 file works, even with UTF-8 characters in it.
UTF-8 characters in an otherwise plain ASCII file work in Config::Tiny when I read the file as UTF-8 without a BOM.

ronsavage and others added 30 commits February 1, 2021 15:23
Users may now say Config::Tiny->new($hashref) to bless this hashref into
the package and turn it into a Config::Tiny object.
Add ability to pass a hashref to constructor
Do not use bareword file handles when opening files for read/write.
Remove unnecessary Test::Pod dependency again
Add arrays, defined with the characters "[]" after a key name
Fix LGPL name in LICENSE file