CSVToHistory always skips the first row from CSV file

Posted By: HamzaAhmed

CSVToHistory always skips the first row from CSV file - 04/28/22 10:28

When Format string is
Code
string Format = "+%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";

or
Code
string Format = "+1%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";

the t6 files are created, split into separate years, but the first row of data is never picked up. Please see the attachments.

When Format string is
Code
string Format = "-1%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";

the data in the t6 file is in ascending order and all rows are picked up. However, Zorro works with data in t6 files in descending order.

I have attached the sample CSV file and associated screengrabs as well.

Using Zorro 2.44. Same behaviour with Zorro 2.47.4 Beta.

How do I load the first row of data from my CSV file into T6?

Attached File
63MOONS_eod.csv
Attached picture Zorro Misses the First Row.png
Attached picture Zorro Misses the First Row - 1.png
Attached picture Zorro Works Fine In Ascending Order.png
Posted By: Petra

Re: CSVToHistory always skips the first row from CSV file - 04/28/22 14:48

https://zorro-project.com/manual/en/data.htm

Set header lines to 0 when you have no header line.
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 04/30/22 07:20

I do have a header line.

datetime,symbol,open,high,low,close,volume

Nonetheless, setting Format either to

Code
string Format = "0%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";
or
Code
string Format = "+0%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


produces

Error 058: Bad date for 'datetime,s' in 63MOONS_eod.csv
0 lines read


If I do not set any + or - in the Format String

Code
string Format = "%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


Then Zorro by default loads in ascending date order and the first row is read.

If I set Format as

Code
string Format = "+2%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


Then 2 rows of data are omitted.

If I delete the header line datetime,symbol,open,high,low,close,volume and set Format as
Code
string Format = "+0%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


Then I do not get Error 058; however, the first row is still missing.

If I delete the header line datetime,symbol,open,high,low,close,volume and set Format as
Code
string Format = "+%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


Then 2 rows of data are missing.

Thanks
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 04/30/22 08:24

I worked around the oddity by duplicating the first row of data in my CSV file. Zorro then picks up from the second row, which is the duplicate, so effectively I get all days of data.

This would mean going back to pandas to duplicate the first row in approximately 1900 files, with even more to come, and then loading those CSV files into Zorro t6.

I do not want to do the above so waiting for any possible resolution.
Posted By: Petra

Re: CSVToHistory always skips the first row from CSV file - 04/30/22 12:47

Waiting won't fix your problem. Fixing will fix your problem.

Your CSV file produces no missing first row here, so the CSV is not the problem. So you must now examine other possible reasons. Maybe some bug in your script?

Code
void main() {
	dataParse(1,"+%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6","History\\63MOONS_eod.csv");
	dataSaveCSV(1,"%Y-%m-%d %H:%M:%S,f3,f1,f2,f4,f6","History\\out.csv");
}
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 05/01/22 09:59

Indeed your script works. Thank you so much for your help. It gives me a fresh perspective on the thought process behind Zorro.

The script I am using for the conversion is the built-in CSVToHistory.c, available from the Script dropdown in the Zorro UI. The only changes I made were to uncomment #define SPLIT_YEARS and to set the Format string for my CSV. Here is the entire script:

Code
////////////////////////////////////////////////
// Convert price history from .csv to .t6
// The Format string determines the CSV format (see examples)
////////////////////////////////////////////////

#define SPLIT_YEARS	// split into separate years
//#define FIX_ZONE	-1 // add a time zone difference, f.i. for converting CST -> EST

/* T6 Target format:
	DATE	time;	
	float fHigh, fLow;	// f1,f2
	float fOpen, fClose;	// f3,f4	
	float fVal, fVol;		// f5,f6
*/

// HISTDATA line format: "20100103 170000;1.430100;1.430400;1.430100;1.430400;0"
//string Format = "+%Y%m%d %H%M%S;f3;f1;f2;f4";

// YAHOO line format "2015-05-29,43.45,43.59,42.81,42.94,10901500,42.94"
//string Format = "%Y-%m-%d,f3,f1,f2,f4,f6,f5"; // unadjusted

// TRADESTATION line format "06/30/2016,17:00:00,2086.50,2086.50,2086.50,2086.50,319,0"
//string Format = "+%m/%d/%Y,%H:%M:%S,f3,f1,f2,f4,f6,f5";

// STK line format "12/23/2016,2300.00,SPY, 225.63, 225.68, 225.72, 225.62,1148991"
//string Format = "+-%m/%d/%Y,%H%M,,f3,f4,f1,f2,f6";

// CHRIS_ICE line format: Date,Open,High,Low,Settle,Change,Wave,Volume,...
// 2020-04-08,10.34,10.46,10.22,10.37,-0.01,10.32,54520.0,268936.0,4008.0,50.0,500.0
//string Format = "%Y-%m-%d,f3,f1,f2,f4,,,f6";

// MKTS Daily line format: Date,Open,High,Low,Close,Volume
// 02/28/2020,108.4,110,107.475,107.575,44239
string Format = "+%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


function main()
{
	string InName = file_select("History","CSV file\0*.csv\0\0");
	if(!InName) return quit("No file"); 
	int Records = dataParse(1,Format,InName);
	printf("\n%d lines read",Records);
#ifdef FIX_ZONE
	int i;
	for(i=0; i<Records; i++)
		dataSet(1,i,0,dataVar(1,i,0)+FIX_ZONE/24.);
#endif
#ifndef SPLIT_YEARS
	string OutName = strx(InName,".csv",".t6");
	if(Records) dataSave(1,OutName);
	printf("\n%s",OutName);		
#else
	int i, Start = 0, Year, LastYear = 0;
	for(i=0; i<Records; i++) {
		Year = atoi(strdate("%Y",dataVar(1,i,0)));
		if(!LastYear) LastYear = Year;
		if(i == Records-1) { // end of file
			LastYear = Year; Year = 0;
		}
		if(Year != LastYear) {
			string OutName = strf("%s_%4i.t6",strxc(InName,'.',0),LastYear);
			printf("\n%s",OutName);		
			dataSave(1,OutName,Start,i-Start);
			Start = i;
			LastYear = Year;
		}
	}
#endif
}


On closer inspection, the dataParse call that uses the Format string is the same in your script and in CSVToHistory.c. The difference lies in the way the data is saved.

Interestingly, if I comment out the #define SPLIT_YEARS to produce a single 63MOONS_eod.t6 file, then the first row, 2006-11-02, is available. With SPLIT_YEARS either defined or commented out, 3815 lines are read; thus dataParse does its job correctly. dataSave also does its job correctly in the case of a single t6 file.

The issue arises when I want the t6 files split into separate years, which is done by
Code
dataSave(1,OutName,Start,i-Start);

On further debugging with printf, I diagnosed the problematic variable to be i, so I modified the CSVToHistory.c script, adding a line to end my woes.

This is the final script that I am using. I do not know whether what I have done is blasphemy according to Zorro standards but it works for my case. Thank Almighty I do not have to panda through 1900 files.

Code
////////////////////////////////////////////////
// Convert price history from .csv to .t6
// The Format string determines the CSV format (see examples)
////////////////////////////////////////////////

#define SPLIT_YEARS	// split into separate years
//#define FIX_ZONE	-1 // add a time zone difference, f.i. for converting CST -> EST

/* T6 Target format:
	DATE	time;	
	float fHigh, fLow;	// f1,f2
	float fOpen, fClose;	// f3,f4	
	float fVal, fVol;		// f5,f6
*/

// HISTDATA line format: "20100103 170000;1.430100;1.430400;1.430100;1.430400;0"
//string Format = "+%Y%m%d %H%M%S;f3;f1;f2;f4";

// YAHOO line format "2015-05-29,43.45,43.59,42.81,42.94,10901500,42.94"
//string Format = "%Y-%m-%d,f3,f1,f2,f4,f6,f5"; // unadjusted

// TRADESTATION line format "06/30/2016,17:00:00,2086.50,2086.50,2086.50,2086.50,319,0"
//string Format = "+%m/%d/%Y,%H:%M:%S,f3,f1,f2,f4,f6,f5";

// STK line format "12/23/2016,2300.00,SPY, 225.63, 225.68, 225.72, 225.62,1148991"
//string Format = "+-%m/%d/%Y,%H%M,,f3,f4,f1,f2,f6";

// CHRIS_ICE line format: Date,Open,High,Low,Settle,Change,Wave,Volume,...
// 2020-04-08,10.34,10.46,10.22,10.37,-0.01,10.32,54520.0,268936.0,4008.0,50.0,500.0
//string Format = "%Y-%m-%d,f3,f1,f2,f4,,,f6";

// MKTS Daily line format: Date,Open,High,Low,Close,Volume
// 02/28/2020,108.4,110,107.475,107.575,44239
string Format = "+%Y-%m-%d %H:%M:%S,,f3,f1,f2,f4,f6";


function main()
{
	string InName = file_select("History","CSV file\0*.csv\0\0");
	if(!InName) return quit("No file"); 
	int Records = dataParse(1,Format,InName);
	printf("\n%d lines read",Records);
#ifdef FIX_ZONE
	int i;
	for(i=0; i<Records; i++)
		dataSet(1,i,0,dataVar(1,i,0)+FIX_ZONE/24.);
#endif
#ifndef SPLIT_YEARS
	string OutName = strx(InName,".csv",".t6");
	if(Records) dataSave(1,OutName);
	printf("\n%s",OutName);		
#else
	int i, Start = 0, Year, LastYear = 0;
	for(i=0; i<Records; i++) {
		Year = atoi(strdate("%Y",dataVar(1,i,0)));
		if(!LastYear) LastYear = Year;
		if(i == Records-1) { // end of file
			LastYear = Year; Year = 0;
		}
		if(Year != LastYear) {
			string OutName = strf("%s_%4i.t6",strxc(InName,'.',0),LastYear);
			printf("\n%s",OutName);	
			printf("\n i %i",i);
			printf("\n Start %i",Start);
			printf("\n i-Start %i",i-Start);
			if(Year == 0) { // end of file: extend the range so the last record is saved too
				i += 1;
			}
			dataSave(1,OutName,Start,i-Start);
			Start = i;
			LastYear = Year;
		}
	}
#endif
}
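
The effect of the added line can be reproduced outside Zorro. The following is only a minimal sketch in plain C, not the Zorro implementation: `save_range` is a hypothetical stand-in for `dataSave(1,OutName,Start,i-Start)`, the `years` array stands in for the year read from each record with `strdate`, and it assumes the dataset is held newest record first, so that the first row of an ascending CSV file ends up at index Records-1.

```c
#include <stdio.h>

/* Hypothetical stand-in for dataSave(1,OutName,Start,Count):
   it only reports the range and returns the number of records written. */
static int save_range(int year, int start, int count)
{
	printf("%d: records %d..%d (%d records)\n", year, start, start + count - 1, count);
	return count;
}

/* Replays the SPLIT_YEARS loop over an array of record years.
   fix == 0: the loop as shipped in CSVToHistory.c.
   fix != 0: the loop with the "i += 1 at end of file" change.
   Returns the total number of records handed to save_range. */
int split_years(const int *years, int records, int fix)
{
	int i, start = 0, year, last_year = 0, saved = 0;
	for (i = 0; i < records; i++) {
		year = years[i];
		if (!last_year) last_year = year;
		if (i == records - 1) { /* end of file */
			last_year = year; year = 0;
		}
		if (year != last_year) {
			/* At end of file i is the index of the last record, so
			   i - start is one short unless i is moved past it. */
			if (fix && year == 0) i += 1;
			saved += save_range(last_year, start, i - start);
			start = i;
			last_year = year;
		}
	}
	return saved;
}
```

Called with the years { 2008, 2008, 2007, 2007, 2006, 2006 } and 6 records, the unmodified loop passes 5 records to `save_range` and the modified loop passes all 6. The record that is dropped is the one at the last index, which under the assumption above is the first row of the CSV file.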


Well, I just cannot wait to get a real debugger in the Zorro editor experience. The Gamestudio editor SED has one; so what about Zorro?

Can I read the source of the inbuilt functions in Zorro to see what they are doing like dataSave, dataSaveCSV, atoi etc?

The ability to set Breakpoints and Single Step watching the variables change values would cure my Sorrow oh dear Zorro.


Posted By: Petra

Re: CSVToHistory always skips the first row from CSV file - 05/01/22 10:26

You can use the VC++ debugger, which is way better than the old SED debugger.

I will give the info to the developers that the #define SPLIT_YEARS in that script was possibly not fully tested...
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 05/01/22 10:58

I do not know C++, which has its own quirks, and I believe a Zorro S subscription is additionally required.

I am comfy with Lite-C. Though I have minimal experience with C, I can poke my fingers around and try to interpret the online manual to the best of my understanding.

What is needed is a decent free IDE like Visual Studio Code, with go to definition (F12), moving around files, etc. (which I have achieved) and the ability to set and hit breakpoints (which is of course a dream).

Thanks Petra.
Posted By: Zheka

Re: CSVToHistory always skips the first row from CSV file - 05/01/22 12:10

watch() with its different options will cover most of your debugging needs.
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 05/01/22 19:26

Thanks Zheka. It is a pleasure to learn new techniques.
Posted By: HamzaAhmed

Re: CSVToHistory always skips the first row from CSV file - 05/15/22 15:50

It is nice to see the results of this thread incorporated into the CSVToHistory script in the Zorro version just released. I'm bumping up my Zorro version too.
© 2024 lite-C Forums