Posted By: richs
Puzzling behavior when converting CSV to Zorro .t8 file - 03/04/20 00:51
I'm trying to convert options contract information into a Zorro .t8 file and it's turned into a brainteaser.
The following code is patterned after the CSVtoOptions.c code that ships with Zorro.
My code uses dataParse() to read the input file and then calls dataSaveCSV() to write the parsed values out to another file. The output file contains the expected values, so I believe my dataParse() format string is reasonable.
The code then goes on to read the values for the first row of the data set and store them into a CONTRACT.
Next, the code prints the values that were read from the data set row. This is followed by a second print statement to print what should be the same values (with the exception of option_type having been converted from a 1 character string to a long). Some of these values are being read from the CONTRACT and others are being read directly from the input row. However, the values from the second print statement do not match the values from the first print statement.
The first three lines of the .csv input file are:
Code
ticker,name,quote_date,option_symbol,expiration,strike,option_type,bid,ask,last,volume,open_interest,underlying_last,implied_volatility,delta,gamma,theta,vega
DIA,SPDR Dow Jones Industrial Average ETF Trust,2002-12-31,DAZ030118C00056000,2003-01-18,56.000000,C,27.4000,27.9000,28.3000,0,1,83.5100000,0.928498,0.983668,0.248371,0,0
DIA,SPDR Dow Jones Industrial Average ETF Trust,2002-12-31,DAZ030118P00056000,2003-01-18,56.000000,P,0.0000,0.0500,0.0000,0,98,83.5100000,0.769044,-0.005513,0.116097,0,0
Here are the first two lines from the file that I wrote back out using dataSaveCSV():
Code
2002-12-31,20030118,56.00000,C,27.40000,27.90000,0.00000,83.51000,0.98367
2002-12-31,20030118,56.00000,P,0.00000,0.05000,0.00000,83.51000,-0.00551
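As a cross-check of which CSV columns the format string selects, here is a minimal plain C sketch (not lite-C). It assumes only what the comment in my script states: each comma-separated format field lines up with one CSV column, and an empty field means the column is skipped. It does not model dataParse() itself, and `split_csv` and `extracted_columns` are hypothetical helper names made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 32
#define MAX_LEN    64

/* Split s on commas, keeping empty fields; returns the number of fields. */
int split_csv(const char *s, char out[][MAX_LEN], int max)
{
    int n = 0;
    while (n < max) {
        size_t full = strcspn(s, ",");          /* length up to next comma */
        size_t len = full < MAX_LEN ? full : MAX_LEN - 1;
        memcpy(out[n], s, len);
        out[n][len] = '\0';
        n++;
        s += full;
        if (*s != ',')                          /* end of string reached */
            break;
        s++;                                    /* step over the comma */
    }
    return n;
}

/* Copy the names of the header columns whose format field is non-empty
   into names[]; returns how many columns would be extracted. */
int extracted_columns(const char *format, const char *header,
                      char names[][MAX_LEN], int max)
{
    char f[MAX_FIELDS][MAX_LEN], h[MAX_FIELDS][MAX_LEN];
    int nf = split_csv(format, f, MAX_FIELDS);
    int nh = split_csv(header, h, MAX_FIELDS);
    int i, n = 0;
    for (i = 0; i < nf && i < nh && n < max; i++)
        if (f[i][0] != '\0')                    /* empty field = skipped column */
            strcpy(names[n++], h[i]);
    return n;
}
```

Fed the format string and the CSV header line from this post, the sketch selects nine columns: quote_date, expiration, strike, option_type, bid, ask, volume, underlying_last, and delta. These are the same nine fields that appear in the dataSaveCSV() output.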
Here is the content that was written to the Zorro log (with Verbose = 7):
Code
Parse History\DIA_options_2002.csv..
DIA,SPDR Dow Jones Industrial Average ETF Trust,2002-12-31,DAZ030118C00056000,2003-01-18,56.000000,C,27.4000,27.9000,28.3000,0,1,83.5100000,0.928498,0.983668,0.248371,0,0
37621.00000000 20030118 56.00000 C 27.40000 27.90000 0.00000 83.51000 0.98367
DIA,SPDR Dow Jones Industrial Average ETF Trust,2002-12-31,DAZ030118P00056000,2003-01-18,56.000000,P,0.0000,0.0500,0.0000,0,98,83.5100000,0.769044,-0.005513,0.116097,0,0
37621.00000000 20030118 56.00000 P 0.00000 0.05000 0.00000 83.51000 -0.00551
43934 records
Save DIA_options_2002.out.. 43934 records
DIA_options_2002.csv: Converting 43934 records
time: 2002-12-31, expiry=20030118, strike: 56.000000, optiontype: C, bid: 27.400000, ask: 27.900000, volume: 0.000000, underlying: 83.510002, delta: 0.983668
time: 2002-12-31, expiry=20030118, strike: 0.000000, optiontype: 1104884531, bid: 27.900000, ask: 0.000000, volume: 83.510002, underlying: 0.983668, delta: 0.000000
time: 2002-12-31, expiry=20030118, strike: 56.000000, optiontype: P, bid: 0.000000, ask: 0.050000, volume: 0.000000, underlying: 83.510002, delta: -0.005513
time: 2002-12-31, expiry=20030118, strike: 0.000000, optiontype: 0, bid: 0.050000, ask: 0.000000, volume: 83.510002, underlying: -0.005513, delta: 0.000000
Finally here is the Zorro script that I am running:
Code
//////////////////////////////////////////////////////////////////////////////
// Convert Historical Options Data contract data from .csv to .t8
// The Format string specifies how to extract the necessary fields
// from the CSV format
//////////////////////////////////////////////////////////////////////////////
#include <default.c>
#include <stdio.h>

// Historical option data line format:
// ticker,name,quote_date,option_symbol,expiration,strike,option_type,bid,ask,last,volume,open_interest,underlying_last,implied_volatility,delta,gamma,theta,vega
// SPY,SPDR S&P 500 ETF Trust,2020-02-28,SPY200228C00250000,2020-02-28,250.000000,C,46.9200,47.8000,38.4100,35,1,296.2400000,10.4975,0,0,0,0
// extract quote_date,expiration,strike,option_type,bid,ask,volume,underlying_last,delta
string Format = ",,%Y-%m-%d,,i,f,s,f,f,,f,,f,,f";

int exists(string fname)
{
  FILE *file;
  if (file = fopen(fname, "r")) {
    fclose(file);
    return 1;
  }
  return 0;
}

void convertOneCsv(string inName, string outName, string format)
{
  // first step: close the data set to make sure there is no data
  // left over from earlier conversions
  dataNew(1, 0, 0);

  // second step: parse the CSV file into a dataset
  if (!exists(strf("History/%s", inName)))
    return;
  int nRecords = dataParse(1, format, inName);
  dataSaveCSV(1, "%Y-%m-%d,i,f,s,f,f,f,f,f", "DIA_options_2002.out");
  if (!nRecords)
    return;
  printf("%s: Converting %d records\n", inName, nRecords);

  // third step: convert the raw data to the final CONTRACT format
  int i;
  for (i=0; i < nRecords; i++) {
    CONTRACT* O = dataAppendRow(2,9);
    O->time = dataVar(1,i,0);
    O->Expiry = dataInt(1,i,1);
    O->fStrike = dataVar(1,i,2);
    string PC = dataStr(1,i,3);
    O->Type = ifelse(*PC == 'P', PUT, CALL);
    O->fBid = dataVar(1,i,4);
    O->fAsk = dataVar(1,i,5);
    O->fVol = dataVar(1,i,6);
    O->fUnl = dataVar(1,i,7);
    O->fVal = dataVar(1,i,8); // delta
    if (!progress(100*i/nRecords, 0))
      break; // show a progress bar

    if (i < 2) {
      printf("time: %s, expiry=%d, strike: %f, optiontype: %s, bid: %f, ask: %f, volume: %f, underlying: %f, delta: %f\n",
        strdate("%Y-%m-%d", dataVar(1,i,0)), // time
        dataInt(1,i,1), // Expiry
        dataVar(1,i,2), // strike
        dataStr(1,i,3), // option type
        dataVar(1,i,4), // bid
        dataVar(1,i,5), // ask
        dataVar(1,i,6), // volume
        dataVar(1,i,7), // underlying last
        dataVar(1,i,8)); // delta
      printf("time: %s, expiry=%d, strike: %f, optiontype: %d, bid: %f, ask: %f, volume: %f, underlying: %f, delta: %f\n",
        strdate("%Y-%m-%d", O->time), // time
        O->Expiry, // Expiry
        O->fStrike, // strike
        O->Type, // option type
        O->fBid, // bid
        dataVar(1,i,5), // ask
        dataVar(1,i,6), // volume
        dataVar(1,i,7), // underlying last
        dataVar(1,i,8)); // delta
    } else {
      return;
    }
  }
}

function main()
{
  string inNameFmt = "%s_options_%i.csv";
  string outNameFmt = "%s.t8";
  int firstYear = 2002;
  int lastYear = 2002;
  string tickers[20];
  tickers[0] = "DIA";
  tickers[1] = "";

  set(LOGFILE);
  Verbose = 7;

  int i, year;
  string inName, outName;
  for (i=0; strcmp(tickers[i], "") != 0; i++) {
    // close the target data set (handle == 2) to make sure there is no data
    // left over from earlier conversions
    dataNew(2, 0, 0);
    for (year=firstYear; year <= lastYear; year++) {
      inName = strf(inNameFmt, tickers[i], year);
      outName = strf(outNameFmt, tickers[i], year);
      convertOneCsv(inName, outName, Format);
    }
    // sort the records in descending order by date
    dataSort(2);
    // save the converted data
    dataSave(2, strf(outNameFmt, tickers[i]));
  }
}
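One observation on the log, offered as a hypothesis rather than a confirmed diagnosis: the value 1104884531 printed for optiontype by the second print statement is exactly the IEEE-754 bit pattern of the float 27.4, which is the bid. That would fit the float members of the CONTRACT (fStrike, fBid) reaching printf() as 4-byte values instead of being widened to 8-byte doubles, so that every later %f and %d reads from the wrong offset. The plain C sketch below (not lite-C) models that idea. The names `float_bits`, `Misread`, and `misread_args` are hypothetical and exist only for this illustration; the 4-byte argument passing is an assumption about lite-C, the type value 1 is a stand-in for CALL, and a little-endian machine is assumed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a 32-bit float. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* What printf would read for "strike: %f, optiontype: %d, bid: %f, ask: %f". */
typedef struct {
    double  strike;
    int32_t type;
    double  bid;
    double  ask;
} Misread;

/* ASSUMPTION: the caller lays the arguments out as
   float(4) long(4) float(4) double(8) double(8), i.e. the floats are NOT
   promoted, while the reader consumes 8 bytes per %f and 4 per %d. */
Misread misread_args(float fStrike, int32_t type, float fBid,
                     double ask, double volume)
{
    unsigned char buf[28];
    size_t n = 0;
    Misread r;

    /* write side: arguments as (hypothetically) pushed by the caller */
    memcpy(buf + n, &fStrike, 4); n += 4;
    memcpy(buf + n, &type,    4); n += 4;
    memcpy(buf + n, &fBid,    4); n += 4;
    memcpy(buf + n, &ask,     8); n += 8;
    memcpy(buf + n, &volume,  8);

    /* read side: how the format string would consume the same bytes */
    n = 0;
    memcpy(&r.strike, buf + n, 8); n += 8;  /* fStrike bits + type */
    memcpy(&r.type,   buf + n, 4); n += 4;  /* actually fBid's bits */
    memcpy(&r.bid,    buf + n, 8); n += 8;  /* actually ask */
    memcpy(&r.ask,    buf + n, 8);          /* actually volume */
    return r;
}
```

With the first row's values, misread_args(56.0f, 1, 27.4f, 27.9, 0.0) reproduces the shifted pattern in the second log line: a strike that prints as 0.000000, an optiontype of 1104884531, a bid of 27.9, and an ask of 0.0. If the assumption holds, casting the float members in the second printf(), for example (var)O->fStrike and (var)O->fBid, would be the thing to try.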
Any suggestions as to what I've done wrong will be much appreciated!
Thanks,
Rich