Re: UNICODE Text to String
[Re: Benni003]
#417109
02/08/13 13:57
|
Joined: Apr 2005
Posts: 4,506 Germany
fogman
Expert
|
You're right! This seems to be a bug, because the following works. Here, I read str_pointer twice:
Content of "null.txt": NULL NULL
Content of "file.txt": Hello NULL
#include <acknex.h>
#include <default.c>
STRING* str1 = "#1";
STRING* str2 = "#1";
STRING* str_pointer = "#1";
function main()
{
	var file;

	file = file_open_read("null.txt");
	file_str_readtow(file,str_pointer,NULL,5000); // NULL
	file_str_readtow(file,str_pointer,NULL,5000); // NULL
	file_close(file);

	file = file_open_read("file.txt");
	file_str_readtow(file,str1,NULL,5000); // Hello
	file_str_readtow(file,str2,NULL,5000); // NULL
	file_close(file);

	if(str_cmpi(str2,str_pointer)!=0){error("It works!");} // str_pointer includes "NULL" from other file
}
I mostly use txt_loadw for in-game text, not file_str_readtow, so I didn't come across this bug. You should file a bug report with jcl.
no science involved
|
|
|
Re: UNICODE Text to String
[Re: fogman]
#417113
02/08/13 15:00
|
Joined: Apr 2007
Posts: 3,751 Canada
WretchedSid
Expert
|
It's not a bug per se; it works as intended! The first character in a Unicode text file is the so-called BOM (byte order mark). It is there because Unicode text can be encoded as big endian or little endian, and the BOM signals the endianness of the encoded text. Now, you can argue that the read function should just ignore the BOM, but it's actually part of the text, just like every other control character, so you can also argue that it should be there.
As a solution: open the file, read the first character (16 bit) and check whether it's the BOM (some editors don't write a BOM, for whatever reason), and then either seek back one character or just continue. The BOM has the code point U+FEFF, but you should read it as two 8-bit characters and compare them individually, unless you know how to write a function that reverses the byte order of something.
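To make the steps above concrete, here is a rough sketch in plain standard C rather than Lite-C, so it uses fread and fseek instead of the engine's file functions. The names bom_kind, classify_bom, swap16 and skip_bom are made up for this example and are not part of any engine API. It assumes the file has just been opened in binary mode and is positioned at the very start.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical result type: what the first two bytes of the file look like. */
typedef enum { BOM_NONE, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Compare the two bytes individually, so the host's byte order never matters.
   U+FEFF is stored as FF FE in little endian and as FE FF in big endian. */
bom_kind classify_bom(unsigned char b0, unsigned char b1)
{
    if (b0 == 0xFF && b1 == 0xFE) return BOM_UTF16_LE;
    if (b0 == 0xFE && b1 == 0xFF) return BOM_UTF16_BE;
    return BOM_NONE;
}

/* Reverse the byte order of one 16-bit code unit. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Read the first two bytes. If they are a BOM, leave the file positioned
   right after it; otherwise seek back so that no text is lost.
   Assumes f is positioned at the start of the file. */
bom_kind skip_bom(FILE *f)
{
    unsigned char b[2];
    size_t n = fread(b, 1, 2, f);
    bom_kind kind = (n == 2) ? classify_bom(b[0], b[1]) : BOM_NONE;

    if (kind == BOM_NONE)
        fseek(f, 0L, SEEK_SET);
    return kind;
}
```

In Lite-C you would have to do the same thing with whatever byte-wise read and seek functions the engine offers; which ones fit best is something I haven't checked, so treat this only as an outline of the logic.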
Shitlord by trade and passion. Graphics programmer at Laminar Research. I write blog posts at feresignum.com
|
|
|
Re: UNICODE Text to String
[Re: WretchedSid]
#417115
02/08/13 15:19
|
Joined: Nov 2012
Posts: 62 Istanbul
Talemon
Junior Member
|
Hah! I knew this day would come: http://www.opserver.de/ubb7/ubbthreads.php?ubb=showflat&Number=413022#Post413022