UNICODE Text to String


Page 2 of 2

Re: UNICODE Text to String [Re: Benni003] #417109 02/08/13 13:57
fogman, Expert. Joined: Apr 2005, Posts: 4,506, Germany

You're right!
This seems to be a bug, because the following works.
Here, I read str_pointer twice:

Content of "null.txt":
NULL
NULL

Content of "file.txt":
Hello
NULL

Code:

#include <acknex.h>
#include <default.c>

STRING* str1 = "#1";
STRING* str2 = "#1";
STRING* str_pointer = "#1";


function main()
{
	var file;

	file = file_open_read("null.txt");
	file_str_readtow(file,str_pointer,NULL,5000); // NULL
	file_str_readtow(file,str_pointer,NULL,5000); // NULL
	file_close(file);

	file = file_open_read("file.txt");
	file_str_readtow(file,str1,NULL,5000); // Hello
	file_str_readtow(file,str2,NULL,5000); // NULL
	file_close(file);

	if(str_cmpi(str2,str_pointer)!=0){error("It works!");} // str_pointer includes "NULL" from other file
}

I mostly use txt_loadw for ingame text, not file_str_readtow, so I didn't come across this bug.
You should report it to jcl.


Re: UNICODE Text to String [Re: fogman] #417113 02/08/13 15:00
WretchedSid, Expert. Joined: Apr 2007, Posts: 3,751, Canada

It's not a bug per se but works as intended! The first character in a Unicode text file is usually the so-called BOM (byte order mark). It is put there because Unicode text can be encoded as big endian or little endian, and the BOM signals the endianness of the encoded text.

Now, you can argue that the read function should just ignore the BOM, but it's actually part of the text, just like every other control character, so you can also argue that it should be there.

As a solution: open the file, read the first character (16 bit) and check whether it's the BOM (some editors don't write a BOM, for whatever reason). Then either seek back one character if it isn't, or just continue if it is.

The BOM has the code point U+FEFF, but you should read it as two separate bytes and compare them (0xFF 0xFE for little endian, 0xFE 0xFF for big endian), unless you know how to write a function that reverses the byte order of something.
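
The check described above (read the first 16 bits, compare them byte by byte against the BOM, seek back if there is none) could look roughly like the following. This is only a minimal sketch in plain standard C using stdio, not the engine's own file functions; the names bom_kind, detect_utf16_bom and skip_utf16_bom are hypothetical and not part of the Gamestudio API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper types and names, for illustration only. */
typedef enum { BOM_NONE = 0, BOM_UTF16_LE, BOM_UTF16_BE } bom_kind;

/* Look at the first two bytes of a buffer and report which UTF-16
   byte order mark (U+FEFF), if any, they form. */
bom_kind detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len < 2)
        return BOM_NONE;
    if (buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;   /* U+FEFF stored little endian */
    if (buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;   /* U+FEFF stored big endian */
    return BOM_NONE;
}

/* Read two bytes at the current position of f. If they are a BOM,
   leave the file positioned just after it; otherwise seek back so
   that no text is lost. Returns the kind of BOM that was found. */
bom_kind skip_utf16_bom(FILE *f)
{
    unsigned char head[2];
    long start = ftell(f);
    size_t got = fread(head, 1, sizeof head, f);
    bom_kind kind = detect_utf16_bom(head, got);

    if (kind == BOM_NONE && start >= 0)
        fseek(f, start, SEEK_SET);
    return kind;
}
```

Comparing the two bytes individually avoids having to byte-swap a 16-bit value, which is the point made in the post. A caller would open the file in binary mode, call skip_utf16_bom once, and then read the text as usual.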

Re: UNICODE Text to String [Re: WretchedSid] #417115 02/08/13 15:19
Talemon, Junior Member. Joined: Nov 2012, Posts: 62, Istanbul

Originally Posted By: JustSid
It's not a bug per se but works as intended! The first character in a unicode text is the so called BOM (byte order mark) [...]

Hah! I knew this day would come: http://www.opserver.de/ubb7/ubbthreads.php?ubb=showflat&Number=413022#Post413022
