When will the editor support UTF-8?

Started by dbtsai, December 12, 2005, 12:45:52 PM


dbtsai

In both the official release and the SVN version, I find that Code::Blocks is compiled in ANSI mode rather than with UTF-8 support.

Because I am from Taiwan, some Chinese characters cannot be displayed in Code::Blocks, even in a simple comment.

So I really, really hope that it can support reading UTF-8 source code.

By the way, will the next official release ship with the wx library? Or do we need to compile the wx library ourselves? That is not easy for a beginner.


Thanks.

Michael

Hello,

AFAIK C::B RC2 supports UNICODE (http://forums.next.codeblocks.org/index.php?topic=1162.0).

You should compile wxWidgets with UNICODE and then C::B sources.

You can also download Therion's wxWindows 2.6.2 build (see http://paginas.terra.com.br/informatica/mauricio/codeblocks/). This package includes dll and static libraries for GCC 3.4.4 (both Unicode and NonUnicode).

Michael
[url="http://img207.imageshack.us/img207/9728/411948picture4em.png"]http://img207.imageshack.us/img207/9728/411948picture4em.png[/url]

takeshimiya

Are there any disadvantages to having C::B compiled in Unicode mode for the official releases (e.g. RC3)?

dbtsai

Hi, Michael

With Therion's wxWindows 2.6.2 build,

Help -> About still says wx2.6.2 (Windows, ANSI),

and I cannot use Code::Blocks to open source code that is encoded in UTF-8.

I know that the libraries he provides include a Unicode version, but what I mean is that

the Code::Blocks editor still cannot open UTF-8 source.


Thanks~~~  ^_^

Quote from: Michael on December 12, 2005, 01:04:26 PM
Hello,

AFAIK C::B RC2 supports UNICODE (http://forums.next.codeblocks.org/index.php?topic=1162.0).

You should compile wxWidgets with UNICODE and then C::B sources.

You can also download Therion's wxWindows 2.6.2 build (see http://paginas.terra.com.br/informatica/mauricio/codeblocks/). This package includes dll and static libraries for GCC 3.4.4 (both Unicode and NonUnicode).

Michael


takeshimiya

I'm afraid no one is making Unicode builds of Code::Blocks.

Michael

Quote from: Takeshi Miya on December 12, 2005, 01:21:34 PM
I'm afraid no one is making Unicode builds of Code::Blocks.
But you can make a UNICODE build of C::B, can't you? From what I understood from the post "Version 1.0rc2 released!", C::B supports UNICODE.

Michael

takeshimiya

Yes, anyone can, but no one is distributing Unicode builds of C::B for Win32.

"C::B supports Unicode" means that it can be compiled in Unicode mode, not that it is compiled in Unicode mode.

Michael

Quote from: Takeshi Miya on December 12, 2005, 01:43:30 PM
"C::B supports Unicode" means that it can be compiled in Unicode mode, not that it is compiled in Unicode mode.
OK, so I understood correctly. Thank you.

I think, dbtsai, that you will have to make a UNICODE build of C::B with the UNICODE wxWidgets from Therion (or with a UNICODE wxWidgets that you compile yourself, if you prefer).

Michael

thomas

Quote from: Takeshi Miya on December 12, 2005, 01:06:31 PM
Are there any disadvantages to having C::B compiled in Unicode mode for the official releases (e.g. RC3)?
Yes, there are disadvantages. Unicode support is not 100% finished and tested. Also, at least one third-party library used in Code::Blocks does not support wide character strings (even though it apparently still works, somehow).
ANSI, on the other hand, is 100% certain to work and is officially supported.

No doubt, some day Code::Blocks will switch to Unicode altogether (as that will work universally), but I dare not say when that will be.
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

dbtsai

hi,

OK, I will try to compile it myself. If there is any good news, I will post it. ^_^

In my case, a Chinese character takes two bytes in ANSI mode,
but in C::B, when I press the Delete key, it deletes only one byte, that is, half of a Chinese character.
That is not correct. Most Chinese or Japanese programs need to take this problem into consideration, and
programmers need to solve it by themselves. That is why I very, very much hope C::B will support UTF-8.

Thanks
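The half-character deletion described above can be reproduced outside the editor. The following is only a minimal sketch in Python, assuming the "ANSI" encoding in question is Big5 (the thread does not name the code page); the function names are hypothetical and are not part of Code::Blocks or wxWidgets.

```python
# Assumption: the source file is stored in Big5, a double-byte encoding
# commonly used for Traditional Chinese. The thread only says "ANSI mode".
ENCODING = "big5"


def delete_last_byte(raw: bytes) -> bytes:
    """Byte-oriented delete: removes one byte, even if it is half a character."""
    return raw[:-1]


def delete_last_char(raw: bytes) -> bytes:
    """Character-oriented delete: decode, drop one whole character, re-encode."""
    return raw.decode(ENCODING)[:-1].encode(ENCODING)


def is_valid(raw: bytes) -> bool:
    """Return True if the bytes still form complete characters in ENCODING."""
    try:
        raw.decode(ENCODING)
        return True
    except UnicodeDecodeError:
        return False


# "\u4e2d\u6587" is the two-character word for "Chinese".
comment = "// \u4e2d\u6587".encode(ENCODING)

broken = delete_last_byte(comment)   # leaves a dangling lead byte
intact = delete_last_char(comment)   # removes both bytes of the last character

print(is_valid(broken))
print(is_valid(intact))
```

An editor that treats the buffer as a sequence of characters rather than bytes behaves like `delete_last_char`, which is what a Unicode build is expected to provide.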

takeshimiya

Quote from: thomas on December 12, 2005, 02:53:08 PM
Also, at least one third-party library used in Code::Blocks does not support wide character strings (even though it apparently still works, somehow).
ANSI, on the other hand, is 100% certain to work and is officially supported.

Which specific libraries don't support wide characters, and what can we do to make them support them, apart from submitting a feature request?

thomas

This is one I know about, and the most important at the same time:
Quote from: http://www.grinninglizard.com/tinyxmldocs/index.html
TinyXml supports UTF-8 allowing to manipulate XML files in any language.
[...]
TinyXml does not use or directly support wchar, TCHAR, or Microsofts _UNICODE at this time.

Apparently, it still works ... somehow. Although I do not understand how it works, it actually seems to do o.k. in Unicode builds. But it still does not feel good.
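One possible reason a char-based, UTF-8-only library can coexist with a wide-character build is that UTF-8 text fits in ordinary byte strings and can be converted at the boundary. The sketch below, in Python, only illustrates that idea. It is an assumption, not a description of how Code::Blocks or TinyXml actually handle it, and the helper names are hypothetical.

```python
def to_library(text: str) -> bytes:
    """Hypothetical boundary helper: application text -> UTF-8 bytes
    for a library that only accepts narrow (char-based) strings."""
    return text.encode("utf-8")


def from_library(data: bytes) -> str:
    """Hypothetical boundary helper: UTF-8 bytes from the library -> text."""
    return data.decode("utf-8")


def as_wide(text: str) -> bytes:
    """The same text as UTF-16-LE, the layout of wchar_t strings on Windows."""
    return text.encode("utf-16-le")


# A two-character Chinese word standing in for a value stored in an XML file.
name = "\u5c08\u6848"

narrow = to_library(name)
wide = as_wide(name)

# UTF-8 never produces a zero byte for a non-NUL character, so the data
# survives in NUL-terminated char strings. UTF-16 does not have that property.
print(b"\x00" in narrow)
print(b"\x00" in as_wide("A"))
print(from_library(narrow) == name)
```

If the conversion is done consistently on the way in and out, the library never needs to know about wide characters at all.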


280Z28

I tried but didn't have time to fight with it. It's running stable in ANSI so I left it there.  :?

Once I take care of my "level 1 problems (most important bugs to fix IMO)," I might work on this again.

kagerato

Quote from: http://www.grinninglizard.com/tinyxmldocs/index.html
TinyXml supports UTF-8 allowing to manipulate XML files in any language.
[...]
TinyXml does not use or directly support wchar, TCHAR, or Microsofts _UNICODE at this time.

This makes little sense to me.  UTF-8 is a particular encoding of Unicode text that uses one to four 8-bit bytes per character, and it is widely used because it is backward compatible with ASCII.  Supporting UTF-8 should be enough for Unicode interoperability in any language.
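As a quick check of the two UTF-8 properties mentioned here (variable length and ASCII compatibility), a minimal sketch in Python; the sample characters are arbitrary choices for illustration.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))


# Arbitrary sample characters: Latin letter, accented letter, CJK, emoji.
samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "CJK": "\u4e2d",
    "emoji": "\U0001F600",
}

lengths = {label: utf8_length(ch) for label, ch in samples.items()}

# Pure ASCII text has exactly the same bytes in ASCII and in UTF-8.
ascii_compatible = "int main()".encode("utf-8") == "int main()".encode("ascii")

print(lengths)
print(ascii_compatible)
```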

WCHAR and TCHAR are just Windows-specific typedefs, as far as I know.  (Reference: MSDN)

_UNICODE is a preprocessor definition used by Microsoft's compiler. (Reference: Microsoft)

What, then, do WCHAR, TCHAR, and _UNICODE have to do with a proper and complete implementation of Unicode support?