UNICODE Text to String


Re: UNICODE Text to String [Re: fogman] #417113 02/08/13 15:00
Joined: Apr 2007 Posts: 3,751 Canada WretchedSid Expert
It's not a bug per se; it works as intended. The first character in a Unicode (UTF-16) text file is the so-called BOM (byte order mark). It is there because Unicode text can be encoded as big endian or little endian, and the BOM signals the endianness of the encoded text. Now, you can argue that the read function should just ignore the BOM, but it's actually part of the text, just like every other control character, so you can also argue that it should be there.

As a solution: open the file, read the first character (16 bit) and check whether it's the BOM (some editors don't write one, for whatever reason). If it is, just continue reading; if it isn't, seek back one character so you don't lose the first real character.

The BOM has the code point U+FEFF, but you should read it as two separate bytes and compare them individually (FF FE in a little-endian file, FE FF in a big-endian one), unless you know how to write a function that reverses the byte order of a 16-bit value.

Shitlord by trade and passion. Graphics programmer at Laminar Research. I write blog posts at feresignum.com
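
For illustration, the check described in the post could look roughly like the sketch below. It is plain standard C using `stdio`, not lite-C file functions, and the names `bom_kind`, `detect_utf16_bom` and `skip_utf16_bom` are made up for this example. It assumes the file was opened in binary mode and compares the two bytes one at a time, so no byte swapping is needed.

```c
#include <stdio.h>
#include <stddef.h>

/* Possible results of the BOM check (hypothetical names). */
typedef enum { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Classify the first two bytes of a buffer. Comparing byte by byte
   keeps the result independent of the host CPU's byte order. */
bom_kind detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len < 2) return BOM_NONE;
    if (buf[0] == 0xFF && buf[1] == 0xFE) return BOM_UTF16_LE; /* U+FEFF, little endian */
    if (buf[0] == 0xFE && buf[1] == 0xFF) return BOM_UTF16_BE; /* U+FEFF, big endian */
    return BOM_NONE;
}

/* Read two bytes from a stream opened in binary mode. If they are a BOM,
   leave the position just after it; otherwise seek back to where we
   started so the caller still sees the first real character. */
bom_kind skip_utf16_bom(FILE *fp)
{
    unsigned char head[2];
    long start = ftell(fp);
    size_t got = fread(head, 1, 2, fp);
    bom_kind kind = detect_utf16_bom(head, got);
    if (kind == BOM_NONE)
        fseek(fp, start, SEEK_SET);
    return kind;
}
```

A caller would open the file with `fopen(path, "rb")`, call `skip_utf16_bom` once, and then read the remaining 16-bit characters, using the returned value to decide whether each pair of bytes needs swapping.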