This repository was archived by the owner on Aug 12, 2018. It is now read-only.
Bring back the ability to edit files using the system default code page #193
Open
jberezanski wants to merge 1 commit into XhmikosR:master
Many non-Unicode text files on non-English systems are encoded in the system-default code page, and users expect to be able to edit such files. However, in version 3.6.7, the Scintilla developers broke this scenario by equating the default (= unspecified) code page with code page 1252 (Western European). As a result, Scintilla mistreats international characters typed by the user: they show up either as unaccented Latin letters or as question marks. The only way to avoid this behavior in Notepad2-mod is to set the file encoding manually.

Internally, Notepad2-mod attempts to do the right thing. The encoding described in the UI as "ANSI" is mapped to CPI_DEFAULT, and Notepad2-mod treats it as the system default code page. This is evidenced by the code that appends the output of the GetACP() Win32 function to the description of this encoding (Edit.c, function Encoding_InitDefaults()). For example, on a Polish system the ANSI encoding option in the encoding selection dialogs is shown as "ANSI (1250)". Because of the change in Scintilla, this label is no longer accurate: Scintilla does not use code page 1250 (the default code page on that system), but the hardcoded 1252.

The Scintilla change history describes the change in 3.6.7 as "[preventing] unexpected behavior and crashes on East Asian systems". It is the opinion of this developer that using the system default code page by default is, in fact, the *expected* behavior from the user's point of view, and that Notepad2-mod is perfectly capable of handling multi-byte encodings correctly. The reasoning for the change is therefore invalid and the change should be reverted, which is what this commit does.

(For comparison, the other popular Scintilla-based editor, Notepad++, currently uses an older Scintilla version (3.5.6), so it has not encountered this issue yet.)

Fixes XhmikosR#173.
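The mismatch described above can be sketched in portable C. This is only an illustration, not code from Notepad2-mod or Scintilla: `CP_UNSPECIFIED`, `effective_cp_system`, `effective_cp_hardcoded`, `format_ansi_label`, and `encode_char` are hypothetical names, the system code page is passed in as a plain number instead of being read from GetACP(), and the encoder tabulates only two Polish letters and falls back to a question mark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an unspecified ("default") code page. */
enum { CP_UNSPECIFIED = 0 };

/* Fallback this PR argues for: unspecified means the system default
   (the value GetACP() would report on Windows). */
static unsigned effective_cp_system(unsigned requested, unsigned system_acp)
{
    return requested == CP_UNSPECIFIED ? system_acp : requested;
}

/* Fallback described for Scintilla 3.6.7: unspecified means 1252. */
static unsigned effective_cp_hardcoded(unsigned requested)
{
    return requested == CP_UNSPECIFIED ? 1252u : requested;
}

/* Builds a label in the style of "ANSI (1250)"; returns snprintf's result. */
static int format_ansi_label(char *buf, size_t size, unsigned system_acp)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "ANSI (%u)", system_acp);
}

/* Encodes one Unicode code point to a single byte. Only two Polish
   letters are tabulated for code page 1250 (illustrative, not a full
   table); any other non-ASCII character becomes '?'. */
static unsigned char encode_char(unsigned code_page, unsigned code_point)
{
    if (code_point < 0x80u)
        return (unsigned char)code_point;
    if (code_page == 1250u) {
        if (code_point == 0x0105u) return 0xB9u; /* a with ogonek */
        if (code_point == 0x0142u) return 0xB3u; /* l with stroke */
    }
    return (unsigned char)'?';
}
```

With a system code page of 1250, the label reads "ANSI (1250)" and the first fallback encodes U+0105 as byte 0xB9. The hardcoded fallback resolves to 1252, where that letter has no byte, so this sketch yields a question mark, matching one of the symptoms reported above.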
RaiKoHoff added a commit to RaiKoHoff/Notepad3 that referenced this pull request on Aug 5, 2017
- consistent encoding <> code-page handling (including Scintilla's code-page settings)
- Scintilla issue regarding notepad2-mod issue rizonesoft#173 (see XhmikosR/notepad2-mod#193)
- allow arbitrary conversion between encodings (even if it does not make sense in every case) (instead of silently doing nothing but changing the encoding info on the status bar)