Bug: file_str_readtow #413022
12/06/12 16:00
Joined: Nov 2012
Posts: 62
Istanbul
Talemon Offline OP
Junior Member
The engine function file_str_readtow does not take the BOM of a UCS-2 LE encoded text file into account and leaves a non-visible character (0xFEFF) at the beginning of the first string. If files are not supposed to contain a BOM, could you please document that explicitly? (I would prefer that it handled the BOM, though, as Notepad++ does not let me create UCS-2 files without one.)

Re: Bug: file_str_readtow [Re: Talemon] #413023
12/06/12 16:09
Joined: Jul 2000
Posts: 27,986
Frankfurt
jcl Offline

Chief Engineer
file_str_readtow adds no non-visible character 0xFEFF.

Re: Bug: file_str_readtow [Re: jcl] #413053
12/07/12 08:05
Talemon Offline OP
I have isolated the code and reproduced the situation. The code and the file are here: http://talemon.com/share/bugtest.zip

I wrote "test" on two separate lines in a text file and saved it in Notepad++ as UCS-2 LE. I read the file line by line with file_str_readtow, then used diag_var to output the results of str_len calls on the two strings I read into. The first call returns 6, whereas the second returns 4.

I've also looked at the file with a hex editor, and it starts with 0xFEFF, which is the BOM.

If you could look at it and tell me why I'm getting different results I would really appreciate it.
Thanks.

Re: Bug: file_str_readtow [Re: Talemon] #413059
12/07/12 10:25
jcl Offline

Thanks for your demo. No function adds an invisible character; the character was already in your text file. String functions just manipulate character strings. They do not care about the coding or what a certain character means.

BTW, why did you start a level for reading strings?

Re: Bug: file_str_readtow [Re: jcl] #413063
12/07/12 11:28
Talemon Offline OP
Originally Posted By: jcl
String functions just work with character strings. They do not care about the coding or what a certain character means.

But file_str_readtow deals with files, so it must care about the coding of a file, and it certainly does. (Otherwise my sample program would not break when I change the encoding.)

Originally Posted By: jcl
No function adds some invisible character, the character was already in your text file.

That 'character' is the byte-order mark and it is essential in recognizing UCS-2 text: (http://en.wikipedia.org/wiki/UCS-2#Byte_order_encoding_schemes)

Originally Posted By: jcl
BTW, why did you start a level for reading text?

I just used the code template SED comes with and removed some lines.

Re: Bug: file_str_readtow [Re: Talemon] #413075
12/07/12 15:16
jcl Offline

Yes, I understand that 0xFEFF serves as a byte-order mark in the coding that you use, but for string functions it's just 2 bytes. A string has no coding. The coding is up to the programmer. The only special character in a string is the 0 that marks the end.

For finding characters with a special meaning in strings you can use the str_chr function. This way you can remove those characters for displaying the content of the string.

Re: Bug: file_str_readtow [Re: jcl] #413080
12/07/12 17:02
Talemon Offline OP
Maybe I should re-write my original problem:

The function "file_str_readtow", which reads text files, doesn't properly process those files and adds the file's metadata to the content. I think you should fix file_str_readtow so that it discards the BOM. Even better, it could use the BOM so that the engine can process both little-endian and big-endian text files.
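
For illustration, the BOM handling requested here can be sketched in plain C (not lite-C, and not the engine's actual implementation). The function name `ucs2_decode` and its parameters are hypothetical. The sketch assumes the raw file bytes are already in memory, uses a leading BOM to pick the byte order, and falls back to little-endian when no BOM is present:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of any engine API.
   Decodes a UCS-2 byte buffer into host-order 16-bit code units.
   A leading BOM selects the byte order and is not copied to the output;
   without a BOM, little-endian is assumed (an assumption of this sketch).
   Returns the number of code units written, excluding the terminating 0. */
size_t ucs2_decode(const unsigned char *bytes, size_t nbytes,
                   uint16_t *out, size_t out_cap)
{
    int big_endian = 0;
    size_t i = 0; /* read position in bytes */
    size_t n = 0; /* write position in code units */

    if (out_cap == 0)
        return 0;

    if (nbytes >= 2) {
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) {
            i = 2;                /* little-endian BOM: skip it */
        } else if (bytes[0] == 0xFE && bytes[1] == 0xFF) {
            big_endian = 1;       /* big-endian BOM: skip it, swap bytes */
            i = 2;
        }
    }

    /* Copy complete byte pairs, leaving room for the terminator. */
    for (; i + 1 < nbytes && n + 1 < out_cap; i += 2) {
        out[n++] = big_endian
            ? (uint16_t)((bytes[i] << 8) | bytes[i + 1])
            : (uint16_t)(bytes[i] | (bytes[i + 1] << 8));
    }
    out[n] = 0;
    return n;
}
```

With this approach the BOM never reaches the string, so a length check on the first line of a file would match the later lines.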

Re: Bug: file_str_readtow [Re: Talemon] #413087
12/07/12 18:48
jcl Offline

Users of a programming language expect string functions to do only what was specified. They are not supposed to process codes or discard characters. This would cause lots of complaints and hard-to-find errors.

The obvious solution to your problem is writing a function to be applied to strings, which processes your UCS-2 code and deletes its control characters.
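
A minimal sketch of such a function, written in plain C rather than lite-C: the name `strip_leading_bom` is hypothetical, and the sketch assumes the string is available as a zero-terminated array of 16-bit code units in host byte order. In lite-C the same idea would have to be expressed with the engine's own string functions.

```c
#include <stddef.h>
#include <stdint.h>

#define BOM_CODE_UNIT 0xFEFFu

/* Hypothetical helper: removes a leading U+FEFF from a zero-terminated
   string of 16-bit code units, in place. Returns the resulting length. */
size_t strip_leading_bom(uint16_t *s)
{
    size_t len = 0;
    while (s[len] != 0)
        len++;

    if (len > 0 && s[0] == BOM_CODE_UNIT) {
        /* Shift everything one unit to the left, terminator included. */
        for (size_t i = 0; i < len; i++)
            s[i] = s[i + 1];
        len--;
    }
    return len;
}
```

Only the first string read from a file can carry the BOM, so calling the helper once after the first read is enough; strings without a BOM are left unchanged.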

