Bug in plotHeatmap

Posted By: Zheka

Bug in plotHeatmap - 04/09/22 11:42

C memory layout convention is raw-major, but this piece of code in plotHeatmap() assumes column-major:
Code
for(i=0; i<Cols; i++)
	for(j=0; j<Rows; j++) {
		//print(TO_ANY,"\nHeat: Col=%i Raw=%i %.4f",i,j,Data[i*Rows+j]);
		plotGraph("Heat",i+1,-j-1,SQUARE|STATS,color(Data[i*Rows+j]*100./Scale,BLUE,RED,0,0));
	}
As a result, it accesses the wrong elements of Data.
Posted By: Petra

Re: Bug in plotHeatmap - 04/13/22 16:33

Maybe you meant row major, not raw major? Anyway I see neither major in that code. Not even a captain. Just old linear memory.
Posted By: jcl

Re: Bug in plotHeatmap - 04/13/22 16:44

The term "row major" refers not to C memory, but to a 2D matrix definition. If the 'Rows' and 'Cols' confused you, replace them simply with 'x' and 'y'.

In C you're normally using the inner loop for going through rows and the outer loop for going through columns. That's for speed and cache reasons.
Posted By: Zheka

Re: Bug in plotHeatmap - 04/13/22 20:19

The default assumption that a user is 'wrong until proven otherwise' is not constructive and wastes time. Mocking also annoys quite a bit.
Originally Posted by jcl
The term "row major" refers not to C memory, but to a 2D matrix definition.
No, it refers to a memory addressing order for multidimensional arrays, and it is language specific.

https://en.wikipedia.org/wiki/Row-_and_column-major_order

Different programming languages handle this in different ways. In C, multidimensional arrays are stored in row-major order, and the array indexes are written row-first (i.e. for an array with Cols columns, A[r][c] sits at flat offset r*Cols+c).
In Fortran, on the other hand, arrays are stored in memory in column-major order (while the array indexes are still written row-first).
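
To make the two conventions concrete, here is a minimal sketch in plain C. The helper names row_major_offset and col_major_offset are made up for illustration; they are not part of lite-C or Zorro.

```c
#include <assert.h>
#include <stddef.h>

/* Flat offset of element (r, c) in a rows x cols array stored
   row-major, which is how C lays out a[rows][cols]. */
size_t row_major_offset(size_t r, size_t c, size_t rows, size_t cols)
{
    assert(r < rows && c < cols);
    return r * cols + c;
}

/* Flat offset of the same element when the array is stored
   column-major, which is the Fortran layout. */
size_t col_major_offset(size_t r, size_t c, size_t rows, size_t cols)
{
    assert(r < rows && c < cols);
    return c * rows + r;
}
```

For a 3x8 array, element (1,2) sits at offset 1*8+2 = 10 in row-major order and at offset 2*3+1 = 7 in column-major order, so mixing up the two formulas reads a different element.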

Originally Posted by jcl
In C you're normally using the inner loop for going through rows and the outer loop for going through columns. That's for speed and cache reasons.
It's the other way around: accessing the first element of a row (outer loop) loads the next contiguous block of memory into the cache, so the subsequent iteration over the elements in the columns (inner loop) is very fast.

https://stackoverflow.com/questions/997212/fastest-way-to-loop-through-a-2d-array
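
A minimal sketch of that loop order for a row-major array held in one flat buffer. The function name sum_row_major is hypothetical, and the sketch only illustrates the access pattern, not any measured speed difference.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols row-major array stored in a flat buffer.
   The outer loop walks rows and the inner loop walks columns,
   so the flat index r*cols + c increases by exactly 1 per step
   and memory is read sequentially. */
double sum_row_major(const double *data, size_t rows, size_t cols)
{
    double total = 0.0;
    assert(data != NULL);
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += data[r * cols + c];
    return total;
}
```

Swapping the two loops would still produce the same sum, but the index would then jump by cols elements on every step instead of moving to the adjacent one.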

Accordingly, "Data[i*Rows+j]" accesses the wrong elements of an array that was stored, by convention, as Data[i*Cols+j].
Posted By: jcl

Re: Bug in plotHeatmap - 04/13/22 21:00

Well, the articles that you linked are correct, as far as I see. So you now only need to read them.

Probably you're misled by all those rows and columns. In a computer, memory is one-dimensional. You're always addressing it so that subsequent accesses hit subsequent locations, as you see in the heatmap code. That's valid for all languages because of cache efficiency, and is unrelated to "row major" or other military ranks.

If you have any other question about array addressing or caches, just ask.
Posted By: Petra

Re: Bug in plotHeatmap - 04/14/22 07:44

I think you simply confused this:

a) Data[i][j] // Beware of the major!

with this:

b) Data[n*i+j] // no major here...

Am I right?
Posted By: Zheka

Re: Bug in plotHeatmap - 04/14/22 11:26

The code in plotHeatmap() traverses an input array "efficiently", but incorrectly - that was THE point of the OP.

It works for a 'square' NxN matrix, but not in the general case, when Rows!=Cols (and the reason for that is traversing an array in a column-major way, so to speak, by blocks Rows-wide, rather than Cols-wide).

Run this code:
Code
#include <profile.c>

#define Rows 3
#define Cols 8

int main() {
		
	var Rets[Rows][Cols];  // element Rets[i][j] sits at flat offset i*Cols+j
	
	int DoW=0;
	int Hr=0;
	
	var i=0.01;
	
	for (DoW=0; DoW<Rows; DoW++) {
		for (Hr=0; Hr<Cols; Hr++) {
			Rets[DoW][Hr] = i;	// written at flat offset DoW*Cols+Hr, i.e. in blocks of 8 (hourly) values per DoW
			i += 0.01;
		}
	}
	
	plotHeatmap("DoWbyHr",Rets,Rows,Cols); // produces a wrong, scrambled map because plotting reads from flat offset Hr*Rows+DoW, i.e. in blocks of 3 values
	//plotHeatmap("DoWbyHr",Rets,Cols,Rows); // swapping the Rows and Cols arguments gives a coherent heatmap, though transposed
	
	return 1;
}
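
The mismatch can also be checked without plotting anything. The sketch below is plain C and assumes that plotHeatmap reads its data exactly as in the loop quoted in the first post; heatmap_read_index and rets_index are hypothetical helper names.

```c
#include <assert.h>

/* Flat index read for plot cell (i, j) by the loop quoted in the
   first post: i runs over the Cols argument, j over the Rows argument. */
int heatmap_read_index(int i, int j, int rows_arg, int cols_arg)
{
    assert(i >= 0 && i < cols_arg && j >= 0 && j < rows_arg);
    return i * rows_arg + j;
}

/* Flat index of Rets[dow][hr] for an array declared with `cols` columns. */
int rets_index(int dow, int hr, int cols)
{
    assert(dow >= 0 && hr >= 0 && hr < cols);
    return dow * cols + hr;
}
```

With Rows=3 and Cols=8, the cell for Hr=1, DoW=0 reads flat index 1*3+0 = 3, which is Rets[0][3] rather than Rets[0][1]. With the two arguments swapped, cell (i=1, j=2) reads index 1*8+2 = 10, which is Rets[1][2]: a coherent but transposed map.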
Posted By: jcl

Re: Bug in plotHeatmap - 04/14/22 15:50

Yes, I see the problem you're having; in fact, it's two problems. But since Petra's and my explanation attempts were unsuccessful so far, here's my last try and then I'll leave it at that.

The first problem is that since a heatmap looks 2-dimensional, you seem to assume it's based on a 2D array. It's not. You can see in the code above that plotHeatmap takes a 1D array. It's Data[rows*cols], not Data[rows][cols]. So your code will have a problem. The total memory size is correct, but you get the wrong rows and columns.

Second problem is that you assume that a heatmap is written horizontally row by row. It's not. Non-square heatmaps are normally visualizing something over time, and the time axis is normally horizontal, so we want all elements belonging to a certain point in time to be written vertically in adjacent locations in the heatmap.
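
For data that already exists as a row-major 2D array, one way to get that layout is to copy it into a 1D buffer column by column. This is only a sketch in plain C under the assumption that the heatmap reads Data[x*Rows + y], as in the loop quoted in the first post; to_column_blocks is a hypothetical helper, not a Zorro function.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a rows x cols row-major array into a flat buffer in which all
   elements of one column (one point in time) are adjacent:
   dst[x*rows + y] = src[y*cols + x]. */
void to_column_blocks(const double *src, double *dst, size_t rows, size_t cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t y = 0; y < rows; y++)
        for (size_t x = 0; x < cols; x++)
            dst[x * rows + y] = src[y * cols + x];
}
```

The resulting buffer, not the original 2D array, would then be the one handed to the heatmap together with the unchanged row and column counts.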

Check out the MVO script on Financial Hacker. There you can see how to generate a very long, non-square heatmap.

