This bug has existed for a long time.
Test code:
#include <stdio.h>

int main(void)
{
    printf("Hello World! 测试");
    return 0;
}
I suggest posting a link to the file or attaching the file.
Also state the correct encoding and the wrong encoding value detected.
NOTE: If this is a program run-time issue, search for the solution, because it is NOT a CB issue.
It is posted somewhere on this board.
Tim S.
Quote from: stahta01 on October 15, 2014, 05:36:42 AM
I suggest posting a link to the file or attaching the file.
Also state the correct encoding and the wrong encoding value detected.
NOTE: If this is a program run-time issue, search for the solution, because it is NOT a CB issue.
It is posted somewhere on this board.
Tim S.
?
I have uploaded a screenshot showing the same file open in Notepad++ and in CB. Notepad++ displays it correctly.
Bypassing the encoding detection is not a good solution.
Quote from: edison on October 15, 2014, 04:58:56 AM
This bug has existed for a long time.
Sorry, but I can't reproduce this. I created a new file "main.c" and copied/pasted your code snippet into it, and it looks exactly like it does in the forums and in notepad...?!
My Settings are:
- Encoding: Windows 1252
- Use this encoding "as fallback"
- Try to detect...: OFF
- If conversion fails... : ON
However, are you sure you've saved your file in a proper file format like UTF-8?
I have created a video demonstrating this issue:
https://vimeo.com/108988215
CB was run with default settings.
You can reproduce this problem by adding a language in the Windows Control Panel; here it is Simplified Chinese (the code page should be Windows-936, also known as GBK or cp936).
Quote from: edison on October 15, 2014, 11:13:02 AM
I have created a video demonstrating this issue:
I've seen this video. I am asking again:
Quote from: MortenMacFly on October 15, 2014, 08:52:39 AM
However, are you sure you've saved your file in a proper file format like UTF-8?
From your video it seems not. It is also strange that you are not being warned about that issue. Usually C::B does so.
Quote from: MortenMacFly on October 16, 2014, 08:33:49 AM
From your video it seems not. It is also strange that you are not being warned about that issue. Usually C::B does so.
I have uploaded another video which shows that CB cannot correctly detect a UTF-8 file that it saved itself:
https://vimeo.com/109202854
Quote from: edison on October 17, 2014, 06:27:23 AM
I have uploaded another video which shows that CB cannot correctly detect a UTF-8 file that it saved itself:
https://vimeo.com/109202854
Well, what happens is perfectly OK. As you create a UTF-8 file w/o BOM and have set up windows-936 as the default encoding, it will be used when opening the file. There is no way you can distinguish exactly between UTF-8 and windows-936 in case you've only ANSI characters in the file.
So either you use UTF-8 with BOM or just start coding your Korean (whatsoever) stuff into the file. :)
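To illustrate the point about files that contain only plain 7-bit characters, here is a minimal sketch in C. The function name is_pure_ascii is made up for this example and is not part of C::B. Bytes in the range 0x00..0x7F are encoded identically in UTF-8 and in windows-936, so a file for which this check returns true gives a detector nothing to tell the two apart.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the C::B sources.
 * Returns true if every byte in buf is 7-bit ASCII (0x00..0x7F).
 * Such bytes look the same in UTF-8 and in windows-936 (GBK),
 * so a file made only of them is valid in both encodings. */
bool is_pure_ascii(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] > 0x7F) {
            return false;   /* first byte that could carry an encoding hint */
        }
    }
    return true;
}
```

Calling it on the bytes of "Hello World!" returns true; as soon as the buffer contains a multi-byte character, it returns false.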
Quote from: MortenMacFly on October 17, 2014, 07:45:26 AM
Quote from: edison on October 17, 2014, 06:27:23 AM
I have uploaded another video which shows that CB cannot correctly detect a UTF-8 file that it saved itself:
https://vimeo.com/109202854
Well, what happens is perfectly OK. As you create a UTF-8 file w/o BOM and have set up windows-936 as the default encoding, it will be used when opening the file. There is no way you can distinguish exactly between UTF-8 and windows-936 in case you've only ANSI characters in the file.
So either you use UTF-8 with BOM or just start coding your Korean (whatsoever) stuff into the file. :)
But why, if I use the default encoding (windows-936) to save the file, does CB detect it as a different encoding? Is that normal? Why do other editors (for example Notepad++) not have this problem?
Because with the content you have in the file you have multiple options for a valid encoding. There is no single solution. That's handled differently by editors. That's why I said enter some characters that make it easier for the detection engine to identify your language. We are using the same mechanism Mozilla uses, btw...
...not to forget that another perfect solution is to use a file with BOM if the target compiler supports this.
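As a rough illustration of why extra non-ASCII characters help, here is a simplified UTF-8 well-formedness check in C. This is only a sketch and not the detection mechanism mentioned above; the name looks_like_utf8 is invented for the example, and it does not catch every overlong or surrogate form. The word "测试" from the test code is E6 B5 8B E8 AF 95 in UTF-8 and B2 E2 CA D4 in windows-936; the first sequence passes the check and the second does not, so those bytes carry a hint that pure ASCII lacks. Note that some windows-936 byte sequences can still happen to be well-formed UTF-8, which is one reason detection stays a heuristic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from C::B or Mozilla.
 * Returns true if buf[0..len) is structurally well-formed UTF-8.
 * Simplified: rejects the lead bytes 0xC0, 0xC1 and anything above 0xF4,
 * but does not check every overlong or surrogate encoding. */
bool looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;   /* number of continuation bytes expected */

        if (b <= 0x7F)                   extra = 0;
        else if (b >= 0xC2 && b <= 0xDF) extra = 1;
        else if (b >= 0xE0 && b <= 0xEF) extra = 2;
        else if (b >= 0xF0 && b <= 0xF4) extra = 3;
        else return false;   /* stray continuation byte or invalid lead byte */

        if (extra > len - i - 1) {
            return false;    /* sequence is cut off at the end of the buffer */
        }
        for (size_t k = 1; k <= extra; ++k) {
            if ((buf[i + k] & 0xC0) != 0x80) {
                return false;   /* continuation bytes must be 10xxxxxx */
            }
        }
        i += extra + 1;
    }
    return true;
}
```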
Quote from: MortenMacFly on October 21, 2014, 10:50:15 PM
...not to forget that another perfect solution is to use a file with BOM if the target compiler supports this.
But I encountered a problem when using UTF-8 w/BOM:
There are some unreadable character(s) in the first line (for example, the first line should be #include xxxx, but with UTF-8 w/BOM it was changed to ("??")#include xxxx in the CB editor).
I don't know what exactly you are doing wrong, but it works perfectly here:
Steps:
- Create a new file
- enable to use BOM
- save as UTF-8
- close file
- re-open file
-> Result: UTF-8, no matter whether I had added ANSI or Unicode characters from your example.
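For reference, the UTF-8 BOM is the three bytes EF BB BF at the very start of the file. If some tool does not recognise them, they can show up as stray characters in front of the first line, which may or may not be what was reported earlier in this thread. A minimal sketch in C for checking a buffer read from the start of a file (both function names are made up for this example):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the buffer starts with
 * the UTF-8 byte order mark EF BB BF. */
bool has_utf8_bom(const unsigned char *buf, size_t len)
{
    return len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF;
}

/* Hypothetical helper: returns the offset of the first content byte,
 * which is 3 if a BOM is present and 0 otherwise. */
size_t utf8_content_offset(const unsigned char *buf, size_t len)
{
    return has_utf8_bom(buf, len) ? 3 : 0;
}
```

A reader that skips utf8_content_offset(...) bytes before handing the text on would see "#include" as the first characters regardless of whether the file was saved with or without BOM.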