TwinPulse Fractal Accord [Re: TipmyPip] #489118
5 hours ago
TipmyPip Online OP
Member
Joined: Sep 2017
Posts: 184

TwinPulse Fractal Accord is a two-timeframe learning trader that blends fractal-style pattern sensing with cooperative self-tuning. On each bar it builds one price stream at a fast timeframe and another at a slower timeframe. From each stream it extracts three signals: a fractal roughness estimate, a linear trend slope, and the volatility of log returns. Those six signals feed a built-in classifier that produces separate long and short tendencies. The trade engine compares them, turns the balance into a trade direction, and sizes exposure with adaptive leverage controls. Trades are also managed with a time-based exit so positions do not linger past the learned holding duration.

Instead of fixing the configuration, the strategy treats key settings as a team of bandit agents. Each agent controls one knob and can pick from a small list of discrete settings called arms. Agents cover the fast timeframe, the slow offset, feature window lengths on both timeframes, leverage scaling, maximum leverage, the prediction threshold, and the holding duration. At the start of an episode the agents choose arms, the strategy trades with that joint configuration, and when it returns to flat it measures the episode result. The reward is normalized by holding time and clipped, so one unusually large outcome does not dominate learning.
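
To make that concrete, here is a minimal sketch of the per-episode update. It relies on the Q, Ncnt and TotCnt arrays and the ALPHA / REWARD_CLIP constants declared in the full listing further down, and only restates what the reward handling and updateArm() there already do.

// Sketch only: normalize the closed episode's P/L by holding time, clip it,
// then nudge the chosen arm's running value estimate toward it.
void episodeUpdate(int p, int a, var dBal, int HoldBars)
{
	var r = dBal/(1 + (var)HoldBars);          // reward per bar held
	if(r >  REWARD_CLIP) r =  REWARD_CLIP;     // clip outliers
	if(r < -REWARD_CLIP) r = -REWARD_CLIP;
	Q[p][a] = Q[p][a] + ALPHA*(r - Q[p][a]);   // exponential moving estimate
	Ncnt[p][a] += 1; TotCnt[p] += 1;           // sample counts feed the UCB bonus
}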

Exploration is guided by an upper-confidence rule. Each arm gets a score made from its current value estimate plus an uncertainty bonus that is larger for arms with fewer samples. This encourages careful exploration early and more consistent exploitation later. A small random exploration chance remains to maintain coverage and to cope with regime changes, data quirks, and sudden shifts in market behavior.
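
In code terms, the score of one arm reduces to the small helper below; it is just a compact restatement of ucbBonus() and bestArm_UCB() from the listing.

// Sketch: optimistic score of arm a for agent p = value estimate + uncertainty bonus.
// The bonus shrinks as the arm accumulates samples, shifting from exploration to exploitation.
var armScore(int p, int a)
{
	var bonus = C_UCB*sqrt(log(1 + (var)TotCnt[p]) / (1 + (var)Ncnt[p][a]));
	return Q[p][a] + bonus;
}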

Communication between agents is implemented through coordination tables that store how two choices perform together beyond their individual quality. The most important table links the fast timeframe agent with the slow offset agent. The slow offset agent scores each candidate using its own estimate plus a gated pair term tied to the chosen fast arm. Symmetry is added by letting the fast timeframe agent evaluate its candidates while anticipating the best-responding slow offset. After both are selected, a short negotiation step can reconsider the fast timeframe given the chosen offset, improving coherence between the two lenses.
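
A rough sketch of that conditional scoring, written against the tables and helpers declared in the listing below (the real bestArm_TF2d_given_TF1() is equivalent):

// Sketch: how the slow-offset agent scores candidate b once the fast arm a1 is fixed.
// Own estimate + own UCB bonus + gated pair residual + pair UCB bonus.
var slowOffsetScore(int a1, int b)
{
	int n = Npair_TF[a1][b];
	return Q[P_TF2D][b]
		+ ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][b], C_UCB)
		+ lambdaEff(n)*Qpair_TF[a1][b]
		+ ucbBonus(TotPair_TF, n, CP_UCB);
}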

The same pattern is applied to other dependencies. Feature window agents communicate so fractal windows and slope windows on each timeframe tend to match a compatible scale, improving stability of the feature estimates. Risk control agents communicate so leverage scaling aligns with the prediction threshold, and maximum leverage aligns with holding duration, which shapes exposure time.

Pair influence is gated. Pair terms begin weak and grow only after enough joint trials, preventing noisy early episodes from locking in fragile coordination. Pair learning is residual: each pair table updates only the extra benefit or harm not explained by the two single arm estimates, reducing double counting and sharpening credit assignment.
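
A sketch of the residual update for the timeframe pair, matching updatePairRes_TF() in the listing below:

// Sketch: the pair table learns only what the two single-arm estimates fail to explain.
void residualUpdate(int a1, int a2, var reward)
{
	var pred = Q[P_TF1][a1] + Q[P_TF2D][a2] + Qpair_TF[a1][a2];        // singles + current residual
	Qpair_TF[a1][a2] = Qpair_TF[a1][a2] + ALPHA*(reward - pred);       // move toward the unexplained part
	Npair_TF[a1][a2] += 1; TotPair_TF += 1;                            // joint counts drive the gating ramp
}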

Finally, a safety constraint blocks extreme combinations that would demand excessive warmup history or unstable feature computation, reducing wasted episodes and keeping training productive. Over many episodes the strategy becomes a small community of specialists that explore, share compatibility knowledge, and gradually converge on coordinated multi-timeframe and risk settings that support steadier signals and cleaner trade timing, even when conditions drift.
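
The constraint itself is just a product test on the slow timeframe and its largest window, as in safeOK() below:

// Sketch: a configuration is acceptable only if the slow timeframe times its largest
// feature window stays within the warmup budget; otherwise the arms are resampled.
int comboIsSafe(int TF2, int maxWindow2)
{
	if(TF2*maxWindow2 > SAFE_LIMIT) return 0;  // would demand too much warmup history
	return 1;
}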

Code
// ============================================================================
// Fractal Learner (Strategy 3) - RL Bandits for Parameters + 2TF ML (RETURNS)
// File: Fractal_EURUSD_2TF_RL_Bandits_v2_COMM_UCB_SYM_RES_SAFE.c (Zorro / lite-C)
//
// ADDED FEATURES (as requested):
// 1) Symmetric TF communication (TF1 "listens" too) via best-response scoring.
// 2) UCB exploration bonus (single + pair) instead of plain epsilon-greedy.
// 3) Reward normalization (per HoldBars) + clipping.
// 4) Pair-gating: pair influence ramps up only after enough samples.
// 5) More dependencies communicated:
//      - LevScale <-> PredThr
//      - MaxLev   <-> HoldBars
// 6) Pair-table affects both directions (1-step coordinate descent negotiation).
// 7) Residual pair tables (prevents double counting of single Q):
//      QpairRes(a,b) updated with reward - (Q_A(a)+Q_B(b)+QpairRes(a,b))
// 8) Safe constraints to avoid nonsense combos (reduces wasted episodes)
//
// lite-C safe:
// - No ternary operator (plain if/else instead)
// - Header uses multiple file_append calls
// - strf format string is ONE literal
// ============================================================================

#define NPAR 12
#define MAXARMS 64

// Exploration / learning
#define EPSILON 0.10
#define ALPHA   0.10

// Communication strength (base)
#define LAMBDA_PAIR 0.30

// UCB constants
#define C_UCB   0.40
#define CP_UCB  0.40

// Pair gating (minimum samples before full lambda)
#define NMIN_PAIR 25

// Reward normalization/clipping
#define REWARD_CLIP 1000.0

// Safe constraint limit (keeps TF2 + windows from exploding warmup)
#define SAFE_LIMIT 2500

#define P_INT 0
#define P_VAR 1

// Parameter indices (RL-controlled)
#define P_TF1        0
#define P_TF2D       1
#define P_FDLEN1     2
#define P_SLOPELEN1  3
#define P_VOLLEN1    4
#define P_FDLEN2     5
#define P_SLOPELEN2  6
#define P_VOLLEN2    7
#define P_LEVSCALE   8
#define P_MAXLEV     9
#define P_PREDTHR    10
#define P_HOLDBARS   11

// -----------------------------
// RL storage
// -----------------------------
string ParName[NPAR];

int ParType[NPAR] =
{
	P_INT,  // TF1
	P_INT,  // TF2d
	P_INT,  // FDLen1
	P_INT,  // SlopeLen1
	P_INT,  // VolLen1
	P_INT,  // FDLen2
	P_INT,  // SlopeLen2
	P_INT,  // VolLen2
	P_VAR,  // LevScale
	P_VAR,  // MaxLev
	P_VAR,  // PredThr
	P_INT   // HoldBars
};

var ParMin[NPAR];
var ParMax[NPAR];
var ParStep[NPAR];

var Q[NPAR][MAXARMS];
int Ncnt[NPAR][MAXARMS];
int ArmsCount[NPAR];
int CurArm[NPAR];

// Totals for UCB (avoid summing every time)
int TotCnt[NPAR];

// -----------------------------
// Pairwise "communication" RESIDUAL tables
// (all are QpairRes; keep Npair and TotPair for gating+UCB)
// -----------------------------

// TF1 arm × TF2d arm
var Qpair_TF[MAXARMS][MAXARMS];
int Npair_TF[MAXARMS][MAXARMS];
int TotPair_TF;

// FDLen1 arm × SlopeLen1 arm
var Qpair_FD1SL1[MAXARMS][MAXARMS];
int Npair_FD1SL1[MAXARMS][MAXARMS];
int TotPair_FD1SL1;

// FDLen2 arm × SlopeLen2 arm
var Qpair_FD2SL2[MAXARMS][MAXARMS];
int Npair_FD2SL2[MAXARMS][MAXARMS];
int TotPair_FD2SL2;

// LevScale arm × PredThr arm
var Qpair_LS_PT[MAXARMS][MAXARMS];
int Npair_LS_PT[MAXARMS][MAXARMS];
int TotPair_LS_PT;

// MaxLev arm × HoldBars arm
var Qpair_ML_HB[MAXARMS][MAXARMS];
int Npair_ML_HB[MAXARMS][MAXARMS];
int TotPair_ML_HB;

// -----------------------------
// Utility
// -----------------------------
int calcArms(var mn, var mx, var stp)
{
	if(stp <= 0) return 1;
	int n = (int)floor((mx - mn)/stp + 1.000001);
	if(n < 1) n = 1;
	if(n > MAXARMS) n = MAXARMS;
	return n;
}

var armValue(int p, int a)
{
	var v = ParMin[p] + (var)a * ParStep[p];
	if(v < ParMin[p]) v = ParMin[p];
	if(v > ParMax[p]) v = ParMax[p];
	if(ParType[p] == P_INT) v = (var)(int)(v + 0.5);
	return v;
}

void initParNames()
{
	ParName[P_TF1]       = "TF1";
	ParName[P_TF2D]      = "TF2d";
	ParName[P_FDLEN1]    = "FDLen1";
	ParName[P_SLOPELEN1] = "SlopeLen1";
	ParName[P_VOLLEN1]   = "VolLen1";
	ParName[P_FDLEN2]    = "FDLen2";
	ParName[P_SLOPELEN2] = "SlopeLen2";
	ParName[P_VOLLEN2]   = "VolLen2";
	ParName[P_LEVSCALE]  = "LevScale";
	ParName[P_MAXLEV]    = "MaxLev";
	ParName[P_PREDTHR]   = "PredThr";
	ParName[P_HOLDBARS]  = "HoldBars";
}

// UCB bonus: c * sqrt( ln(1+Tot) / (1+n) )
var ucbBonus(int tot, int n, var c)
{
	var lnT = log(1 + (var)tot);
	return c * sqrt(lnT / (1 + (var)n));
}

// Pair gating: lambda_eff = lambda * min(1, nPair/nMin)
var lambdaEff(int nPair)
{
	var frac = (var)nPair / (var)NMIN_PAIR;
	if(frac > 1) frac = 1;
	if(frac < 0) frac = 0;
	return LAMBDA_PAIR * frac;
}

// -----------------------------
// Single-arm selection with UCB + epsilon explore
// -----------------------------
int bestArm_UCB(int p)
{
	int a, best = 0;
	var bestScore = Q[p][0] + ucbBonus(TotCnt[p], Ncnt[p][0], C_UCB);

	for(a=1; a<ArmsCount[p]; a++)
	{
		var score = Q[p][a] + ucbBonus(TotCnt[p], Ncnt[p][a], C_UCB);
		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_UCB(int p)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[p]);
	return bestArm_UCB(p);
}

// Update single Q
void updateArm(int p, int a, var reward)
{
	Q[p][a] = Q[p][a] + ALPHA*(reward - Q[p][a]);
	Ncnt[p][a] += 1;
	TotCnt[p] += 1;
}

// -----------------------------
// Residual pair update: QpairRes += alpha * (r - (QA + QB + QpairRes))
// -----------------------------
void updatePairRes_TF(int a1, int a2, var reward)
{
	var pred = Q[P_TF1][a1] + Q[P_TF2D][a2] + Qpair_TF[a1][a2];
	Qpair_TF[a1][a2] = Qpair_TF[a1][a2] + ALPHA*(reward - pred);
	Npair_TF[a1][a2] += 1;
	TotPair_TF += 1;
}

void updatePairRes_FD1SL1(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN1][a1] + Q[P_SLOPELEN1][a2] + Qpair_FD1SL1[a1][a2];
	Qpair_FD1SL1[a1][a2] = Qpair_FD1SL1[a1][a2] + ALPHA*(reward - pred);
	Npair_FD1SL1[a1][a2] += 1;
	TotPair_FD1SL1 += 1;
}

void updatePairRes_FD2SL2(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN2][a1] + Q[P_SLOPELEN2][a2] + Qpair_FD2SL2[a1][a2];
	Qpair_FD2SL2[a1][a2] = Qpair_FD2SL2[a1][a2] + ALPHA*(reward - pred);
	Npair_FD2SL2[a1][a2] += 1;
	TotPair_FD2SL2 += 1;
}

void updatePairRes_LS_PT(int a1, int a2, var reward)
{
	var pred = Q[P_LEVSCALE][a1] + Q[P_PREDTHR][a2] + Qpair_LS_PT[a1][a2];
	Qpair_LS_PT[a1][a2] = Qpair_LS_PT[a1][a2] + ALPHA*(reward - pred);
	Npair_LS_PT[a1][a2] += 1;
	TotPair_LS_PT += 1;
}

void updatePairRes_ML_HB(int a1, int a2, var reward)
{
	var pred = Q[P_MAXLEV][a1] + Q[P_HOLDBARS][a2] + Qpair_ML_HB[a1][a2];
	Qpair_ML_HB[a1][a2] = Qpair_ML_HB[a1][a2] + ALPHA*(reward - pred);
	Npair_ML_HB[a1][a2] += 1;
	TotPair_ML_HB += 1;
}

// -----------------------------
// Pair-aware selection with UCB + gating + residual pair
// score(b) = Q_B(b)+UCB + lambda_eff*QpairRes(a,b) + pairUCB
// -----------------------------
int bestArm_TF2d_given_TF1(int armTF1)
{
	int a, best = 0;

	int n0 = Npair_TF[armTF1][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_TF[armTF1][0] + ucbBonus(TotPair_TF, n0, CP_UCB);

	var bestScore =
		Q[P_TF2D][0] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_TF2D]; a++)
	{
		int n = Npair_TF[armTF1][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF[armTF1][a] + ucbBonus(TotPair_TF, n, CP_UCB);

		var score =
			Q[P_TF2D][a] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_TF2d_given_TF1(int armTF1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF2D]);
	return bestArm_TF2d_given_TF1(armTF1);
}

// TF1 symmetric best-response score:
// scoreTF1(a1) = Q_TF1(a1)+UCB + max_a2( lambda_eff*QpairRes(a1,a2) + pairUCB )
int bestArm_TF1_symmetric()
{
	int a1, best = 0;

	// compute bestScore for a1=0
	var bestPairTerm0 = 0;
	int a2;
	for(a2=0; a2<ArmsCount[P_TF2D]; a2++)
	{
		int n = Npair_TF[0][a2];
		var lam = lambdaEff(n);
		var term = lam*Qpair_TF[0][a2] + ucbBonus(TotPair_TF, n, CP_UCB);
		if(a2==0) bestPairTerm0 = term;
		else if(term > bestPairTerm0) bestPairTerm0 = term;
	}

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + bestPairTerm0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		var bestPairTerm = 0;
		for(a2=0; a2<ArmsCount[P_TF2D]; a2++)
		{
			int n2 = Npair_TF[a1][a2];
			var lam2 = lambdaEff(n2);
			var term2 = lam2*Qpair_TF[a1][a2] + ucbBonus(TotPair_TF, n2, CP_UCB);
			if(a2==0) bestPairTerm = term2;
			else if(term2 > bestPairTerm) bestPairTerm = term2;
		}

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + bestPairTerm;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

int selectArm_TF1_symmetric()
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF1]);
	return bestArm_TF1_symmetric();
}

// Coordinate descent step: re-pick TF1 given TF2d
int bestArm_TF1_given_TF2d(int armTF2d)
{
	int a1, best = 0;

	int n0 = Npair_TF[0][armTF2d];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_TF[0][armTF2d] + ucbBonus(TotPair_TF, n0, CP_UCB);

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + pairB0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		int n = Npair_TF[a1][armTF2d];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF[a1][armTF2d] + ucbBonus(TotPair_TF, n, CP_UCB);

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

// Generic pair selectors for the other 4 pairs (FD1-SL1, FD2-SL2, LS-PT, ML-HB)

int bestArm_SlopeLen1_given_FDLen1(int armFD1)
{
	int a, best = 0;

	int n0 = Npair_FD1SL1[armFD1][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_FD1SL1[armFD1][0] + ucbBonus(TotPair_FD1SL1, n0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN1][0] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_SLOPELEN1]; a++)
	{
		int n = Npair_FD1SL1[armFD1][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_FD1SL1[armFD1][a] + ucbBonus(TotPair_FD1SL1, n, CP_UCB);

		var score =
			Q[P_SLOPELEN1][a] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen1_given_FDLen1(int armFD1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN1]);
	return bestArm_SlopeLen1_given_FDLen1(armFD1);
}

int bestArm_SlopeLen2_given_FDLen2(int armFD2)
{
	int a, best = 0;

	int n0 = Npair_FD2SL2[armFD2][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_FD2SL2[armFD2][0] + ucbBonus(TotPair_FD2SL2, n0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN2][0] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_SLOPELEN2]; a++)
	{
		int n = Npair_FD2SL2[armFD2][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_FD2SL2[armFD2][a] + ucbBonus(TotPair_FD2SL2, n, CP_UCB);

		var score =
			Q[P_SLOPELEN2][a] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen2_given_FDLen2(int armFD2)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN2]);
	return bestArm_SlopeLen2_given_FDLen2(armFD2);
}

int bestArm_PredThr_given_LevScale(int armLS)
{
	int a, best = 0;

	int n0 = Npair_LS_PT[armLS][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_LS_PT[armLS][0] + ucbBonus(TotPair_LS_PT, n0, CP_UCB);

	var bestScore =
		Q[P_PREDTHR][0] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_PREDTHR]; a++)
	{
		int n = Npair_LS_PT[armLS][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_LS_PT[armLS][a] + ucbBonus(TotPair_LS_PT, n, CP_UCB);

		var score =
			Q[P_PREDTHR][a] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_PredThr_given_LevScale(int armLS)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_PREDTHR]);
	return bestArm_PredThr_given_LevScale(armLS);
}

int bestArm_HoldBars_given_MaxLev(int armML)
{
	int a, best = 0;

	int n0 = Npair_ML_HB[armML][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_ML_HB[armML][0] + ucbBonus(TotPair_ML_HB, n0, CP_UCB);

	var bestScore =
		Q[P_HOLDBARS][0] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_HOLDBARS]; a++)
	{
		int n = Npair_ML_HB[armML][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_ML_HB[armML][a] + ucbBonus(TotPair_ML_HB, n, CP_UCB);

		var score =
			Q[P_HOLDBARS][a] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_HoldBars_given_MaxLev(int armML)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_HOLDBARS]);
	return bestArm_HoldBars_given_MaxLev(armML);
}

// -----------------------------
// Init ranges + tables
// -----------------------------
void initParamsRL()
{
	// Ranges roughly matching your old optimize() ranges/steps
	ParMin[P_TF1]        = 1;    ParMax[P_TF1]        = 3;    ParStep[P_TF1]        = 1;
	ParMin[P_TF2D]       = 1;    ParMax[P_TF2D]       = 11;   ParStep[P_TF2D]       = 1;

	ParMin[P_FDLEN1]     = 20;   ParMax[P_FDLEN1]     = 220;  ParStep[P_FDLEN1]     = 5;
	ParMin[P_SLOPELEN1]  = 20;   ParMax[P_SLOPELEN1]  = 200;  ParStep[P_SLOPELEN1]  = 5;
	ParMin[P_VOLLEN1]    = 20;   ParMax[P_VOLLEN1]    = 200;  ParStep[P_VOLLEN1]    = 5;

	ParMin[P_FDLEN2]     = 10;   ParMax[P_FDLEN2]     = 160;  ParStep[P_FDLEN2]     = 5;
	ParMin[P_SLOPELEN2]  = 10;   ParMax[P_SLOPELEN2]  = 140;  ParStep[P_SLOPELEN2]  = 5;
	ParMin[P_VOLLEN2]    = 10;   ParMax[P_VOLLEN2]    = 140;  ParStep[P_VOLLEN2]    = 5;

	ParMin[P_LEVSCALE]   = 2;    ParMax[P_LEVSCALE]   = 30;   ParStep[P_LEVSCALE]   = 1;
	ParMin[P_MAXLEV]     = 0.1;  ParMax[P_MAXLEV]     = 1.0;  ParStep[P_MAXLEV]     = 0.1;
	ParMin[P_PREDTHR]    = 0.0;  ParMax[P_PREDTHR]    = 0.20; ParStep[P_PREDTHR]    = 0.01;
	ParMin[P_HOLDBARS]   = 1;    ParMax[P_HOLDBARS]   = 30;   ParStep[P_HOLDBARS]   = 1;

	int p, a;
	for(p=0; p<NPAR; p++)
	{
		ArmsCount[p] = calcArms(ParMin[p], ParMax[p], ParStep[p]);
		CurArm[p] = 0;
		TotCnt[p] = 0;
		for(a=0; a<ArmsCount[p]; a++)
		{
			Q[p][a] = 0;
			Ncnt[p][a] = 0;
		}
	}

	// Init pair tables
	int i,j;
	TotPair_TF = 0;
	TotPair_FD1SL1 = 0;
	TotPair_FD2SL2 = 0;
	TotPair_LS_PT = 0;
	TotPair_ML_HB = 0;

	for(i=0; i<MAXARMS; i++)
	{
		for(j=0; j<MAXARMS; j++)
		{
			Qpair_TF[i][j] = 0;      Npair_TF[i][j] = 0;
			Qpair_FD1SL1[i][j] = 0;  Npair_FD1SL1[i][j] = 0;
			Qpair_FD2SL2[i][j] = 0;  Npair_FD2SL2[i][j] = 0;
			Qpair_LS_PT[i][j] = 0;   Npair_LS_PT[i][j] = 0;
			Qpair_ML_HB[i][j] = 0;   Npair_ML_HB[i][j] = 0;
		}
	}
}

// -----------------------------
// Safe constraints helper
// -----------------------------
int safeOK(int TF1_arm, int TF2d_arm, int FDLen2_arm, int SlopeLen2_arm, int VolLen2_arm)
{
	int TF1v = (int)armValue(P_TF1, TF1_arm);
	int TF2dv = (int)armValue(P_TF2D, TF2d_arm);
	int TF2v = TF1v + TF2dv;
	if(TF2v > 12) TF2v = 12;

	int FD2v = (int)armValue(P_FDLEN2, FDLen2_arm);
	int SL2v = (int)armValue(P_SLOPELEN2, SlopeLen2_arm);
	int VL2v = (int)armValue(P_VOLLEN2, VolLen2_arm);

	int mx = FD2v;
	if(SL2v > mx) mx = SL2v;
	if(VL2v > mx) mx = VL2v;

	// TF2 * max(window2) <= SAFE_LIMIT
	if(TF2v * mx > SAFE_LIMIT)
		return 0;
	return 1;
}

// -----------------------------
// COMMUNICATING parameter pick order (symmetric + extra pairs + coord descent)
// -----------------------------
void pickParams()
{
	int p;

	// 1) TF1 symmetric pick (anticipate TF2d best response)
	CurArm[P_TF1] = selectArm_TF1_symmetric();

	// 2) TF2d conditioned on TF1 (pair-aware)
	CurArm[P_TF2D] = selectArm_TF2d_given_TF1(CurArm[P_TF1]);

	// 3) Coordinate descent: re-pick TF1 given TF2d (tiny negotiation step)
	if(random(1) >= EPSILON) // keep some stochasticity
		CurArm[P_TF1] = bestArm_TF1_given_TF2d(CurArm[P_TF2D]);

	// 4) FD/Slope pair TF1
	CurArm[P_FDLEN1]    = selectArm_UCB(P_FDLEN1);
	CurArm[P_SLOPELEN1] = selectArm_SlopeLen1_given_FDLen1(CurArm[P_FDLEN1]);

	// 5) FD/Slope pair TF2
	CurArm[P_FDLEN2]    = selectArm_UCB(P_FDLEN2);
	CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FDLen2(CurArm[P_FDLEN2]);

	// 6) Remaining independent params first (we'll coordinate the extra pairs after)
	for(p=0; p<NPAR; p++)
	{
		if(p == P_TF1) continue;
		if(p == P_TF2D) continue;
		if(p == P_FDLEN1) continue;
		if(p == P_SLOPELEN1) continue;
		if(p == P_FDLEN2) continue;
		if(p == P_SLOPELEN2) continue;
		CurArm[p] = selectArm_UCB(p);
	}

	// 7) Extra pairs communication:
	// LevScale -> PredThr conditioned
	CurArm[P_LEVSCALE] = selectArm_UCB(P_LEVSCALE);
	CurArm[P_PREDTHR]  = selectArm_PredThr_given_LevScale(CurArm[P_LEVSCALE]);

	// MaxLev -> HoldBars conditioned
	CurArm[P_MAXLEV]   = selectArm_UCB(P_MAXLEV);
	CurArm[P_HOLDBARS] = selectArm_HoldBars_given_MaxLev(CurArm[P_MAXLEV]);

	// 8) Safe constraints: if violated, resample the most relevant arms a few times
	int tries = 0;
	while(tries < 20)
	{
		if(safeOK(CurArm[P_TF1], CurArm[P_TF2D], CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], CurArm[P_VOLLEN2]))
			break;

		// resample TF2d (conditioned) and TF2-side windows
		CurArm[P_TF2D] = selectArm_TF2d_given_TF1(CurArm[P_TF1]);

		CurArm[P_FDLEN2]    = selectArm_UCB(P_FDLEN2);
		CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FDLen2(CurArm[P_FDLEN2]);
		CurArm[P_VOLLEN2]   = selectArm_UCB(P_VOLLEN2);

		tries += 1;
	}
}

// -----------------------------
// Feature helpers (lite-C safe)
// -----------------------------
function fractalDimKatz(vars P, int N)
{
	if(N < 2) return 1.0;

	var L = 0;
	int i;
	for(i=0; i<N-1; i++)
		L += abs(P[i] - P[i+1]);

	var d = 0;
	for(i=1; i<N; i++)
	{
		var di = abs(P[i] - P[0]);
		if(di > d) d = di;
	}

	if(L <= 0 || d <= 0) return 1.0;

	var n  = (var)N;
	var fd = log(n) / (log(n) + log(d / L));
	return clamp(fd, 1.0, 2.0);
}

function linSlope(vars P, int N)
{
	if(N < 2) return 0;

	var sumT=0, sumP=0, sumTT=0, sumTP=0;
	int i;
	for(i=0; i<N; i++)
	{
		var t = (var)i;
		sumT  += t;
		sumP  += P[i];
		sumTT += t*t;
		sumTP += t*P[i];
	}

	var denom = (var)N*sumTT - sumT*sumT;
	if(abs(denom) < 1e-12) return 0;

	return ((var)N*sumTP - sumT*sumP) / denom;
}

function stdevReturns(vars R, int N)
{
	if(N < 2) return 0;

	var mean = 0;
	int i;
	for(i=0; i<N; i++) mean += R[i];
	mean /= (var)N;

	var v = 0;
	for(i=0; i<N; i++)
	{
		var d = R[i] - mean;
		v += d*d;
	}
	v /= (var)(N-1);

	return sqrt(max(0, v));
}

// ============================================================================
// RUN
// ============================================================================
function run()
{
	BarPeriod = 5;
	StartDate = 20100101;
	EndDate   = 0;

	set(PLOTNOW|RULES|LOGFILE);

	asset("EUR/USD");
	algo("FRACTAL2TF_EUR_RL_v2_COMM_UCB_SYM");

	var eps = 1e-12;
	DataSplit = 50;

	// LookBack must cover the MAX possible TF * MAX window.
	LookBack = 3000;

	// One-time init
	static int Inited = 0;
	static int PrevOpenTotal = 0;
	static var LastBalance = 0;
	static int Flip = 0;

	string LogFN = "Log\\FRACTAL2TF_EUR_RL_v2_COMM_UCB_SYM.csv";

	if(is(FIRSTINITRUN))
	{
		file_delete(LogFN);

		file_append(LogFN,"Date,Time,Mode,Bar,");
		file_append(LogFN,"TF1,TF2,TF2d,FDLen1,SlopeLen1,VolLen1,FDLen2,SlopeLen2,VolLen2,");
		file_append(LogFN,"LevScale,MaxLev,PredThr,HoldBars,");
		file_append(LogFN,"FD1,Slope1,Vol1,FD2,Slope2,Vol2,");
		file_append(LogFN,"PredL,PredS,Pred,Lev,RewardNorm\n");

		Inited = 0;
		PrevOpenTotal = 0;
		LastBalance = 0;
		Flip = 0;
	}

	if(!Inited)
	{
		initParNames();
		initParamsRL();
		pickParams();

		LastBalance = Balance;
		PrevOpenTotal = NumOpenTotal;

		Inited = 1;
	}

	// Convert chosen arms -> parameter values (current episode)
	int TF1 = (int)armValue(P_TF1, CurArm[P_TF1]);
	int TF2d = (int)armValue(P_TF2D, CurArm[P_TF2D]);
	int TF2 = TF1 + TF2d;
	if(TF2 > 12) TF2 = 12;

	int FDLen1    = (int)armValue(P_FDLEN1,    CurArm[P_FDLEN1]);
	int SlopeLen1 = (int)armValue(P_SLOPELEN1, CurArm[P_SLOPELEN1]);
	int VolLen1   = (int)armValue(P_VOLLEN1,   CurArm[P_VOLLEN1]);

	int FDLen2    = (int)armValue(P_FDLEN2,    CurArm[P_FDLEN2]);
	int SlopeLen2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
	int VolLen2   = (int)armValue(P_VOLLEN2,   CurArm[P_VOLLEN2]);

	var LevScale  = armValue(P_LEVSCALE, CurArm[P_LEVSCALE]);
	var MaxLev    = armValue(P_MAXLEV,   CurArm[P_MAXLEV]);
	var PredThr   = armValue(P_PREDTHR,  CurArm[P_PREDTHR]);
	int HoldBars  = (int)armValue(P_HOLDBARS, CurArm[P_HOLDBARS]);

	// Build series (2 TF)
	TimeFrame = TF1;
	vars P1 = series(priceClose());
	vars R1 = series(log(max(eps,P1[0]) / max(eps,P1[1])));

	vars FD1S    = series(0);
	vars Slope1S = series(0);
	vars Vol1S   = series(0);

	TimeFrame = TF2;
	vars P2 = series(priceClose());
	vars R2 = series(log(max(eps,P2[0]) / max(eps,P2[1])));

	vars FD2S    = series(0);
	vars Slope2S = series(0);
	vars Vol2S   = series(0);

	TimeFrame = 1;

	// Warmup gate based on current episode params
	int Need1 = max(max(FDLen1, SlopeLen1), VolLen1) + 5;
	int Need2 = max(max(FDLen2, SlopeLen2), VolLen2) + 5;
	int WarmupBars = max(TF1*Need1, TF2*Need2) + 10;

	if(Bar < WarmupBars)
		return;

	// Do NOT block TRAIN during LOOKBACK
	if(is(LOOKBACK) && !Train)
		return;

	// Compute features
	TimeFrame = TF1;
	FD1S[0]    = fractalDimKatz(P1, FDLen1);
	Slope1S[0] = linSlope(P1, SlopeLen1);
	Vol1S[0]   = stdevReturns(R1, VolLen1);

	TimeFrame = TF2;
	FD2S[0]    = fractalDimKatz(P2, FDLen2);
	Slope2S[0] = linSlope(P2, SlopeLen2);
	Vol2S[0]   = stdevReturns(R2, VolLen2);

	TimeFrame = 1;

	var Sig[6];
	Sig[0] = FD1S[0];
	Sig[1] = Slope1S[0];
	Sig[2] = Vol1S[0];
	Sig[3] = FD2S[0];
	Sig[4] = Slope2S[0];
	Sig[5] = Vol2S[0];

	// Trading logic
	int MethodBase = PERCEPTRON + FUZZY + BALANCED;
	int MethodRet  = MethodBase + RETURNS;

	var PredL=0, PredS=0, Pred=0, Lev=0;

	// time-based exit
	if(NumOpenTotal > 0)
		for(open_trades)
			if(TradeIsOpen && TradeBars >= HoldBars)
				exitTrade(ThisTrade);

	if(Train)
	{
		// Forced alternating trades so ML always gets samples
		if(NumOpenTotal == 0)
		{
			Flip = 1 - Flip;
			LastBalance = Balance;

			if(Flip)
			{
				adviseLong(MethodRet, 0, Sig, 6);
				Lots = 1; enterLong();
			}
			else
			{
				adviseShort(MethodRet, 0, Sig, 6);
				Lots = 1; enterShort();
			}
		}
	}
	else
	{
		PredL = adviseLong(MethodBase, 0, Sig, 6);
		PredS = adviseShort(MethodBase, 0, Sig, 6);

		// Bootstrap if model has no signal yet
		if(NumOpenTotal == 0 && PredL == 0 && PredS == 0)
		{
			LastBalance = Balance;
			var s = Sig[1] + Sig[4];
			if(s > 0) { Lots=1; enterLong(); }
			else if(s < 0) { Lots=1; enterShort(); }
			else
			{
				if(random(1) < 0.5) { Lots=1; enterLong(); }
				else                { Lots=1; enterShort(); }
			}
		}
		else
		{
			Pred = PredL - PredS;
			Lev  = clamp(Pred * LevScale, -MaxLev, MaxLev);

			if(Lev > PredThr)       { exitShort(); Lots=1; enterLong();  }
			else if(Lev < -PredThr) { exitLong();  Lots=1; enterShort(); }
			else                    { exitLong();  exitShort(); }
		}
	}

	// RL reward + update (episode ends when we go from having positions to flat)
	var RewardNorm = 0;

	if(PrevOpenTotal > 0 && NumOpenTotal == 0)
	{
		var dBal = Balance - LastBalance;

		// reward normalization: per (1+HoldBars)
		RewardNorm = dBal / (1 + (var)HoldBars);

		// clip reward
		if(RewardNorm >  REWARD_CLIP) RewardNorm =  REWARD_CLIP;
		if(RewardNorm < -REWARD_CLIP) RewardNorm = -REWARD_CLIP;

		// --- Update singles (always, even if RewardNorm==0, for counts/UCB) ---
		int p;
		for(p=0; p<NPAR; p++)
			updateArm(p, CurArm[p], RewardNorm);

		// --- Update residual pair tables (COMMUNICATION LEARNING) ---
		updatePairRes_TF(CurArm[P_TF1], CurArm[P_TF2D], RewardNorm);
		updatePairRes_FD1SL1(CurArm[P_FDLEN1], CurArm[P_SLOPELEN1], RewardNorm);
		updatePairRes_FD2SL2(CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], RewardNorm);
		updatePairRes_LS_PT(CurArm[P_LEVSCALE], CurArm[P_PREDTHR], RewardNorm);
		updatePairRes_ML_HB(CurArm[P_MAXLEV], CurArm[P_HOLDBARS], RewardNorm);

		// Next episode parameters
		pickParams();
		LastBalance = Balance;
	}

	PrevOpenTotal = NumOpenTotal;

	// Logging
	string ModeStr = "Trade";
	if(Train) ModeStr = "Train";
	else if(Test) ModeStr = "Test";

	file_append(LogFN, strf("%04i-%02i-%02i,%02i:%02i,%s,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%.6f,%.3f,%.6f,%d,%.6f,%.8f,%.8f,%.6f,%.8f,%.8f,%.8f,%.8f,%.8f,%.6f,%.6f\n",
		year(0),month(0),day(0), hour(0),minute(0),
		ModeStr, Bar,
		TF1, TF2, TF2d,
		FDLen1, SlopeLen1, VolLen1,
		FDLen2, SlopeLen2, VolLen2,
		LevScale, MaxLev, PredThr, HoldBars,
		Sig[0], Sig[1], Sig[2], Sig[3], Sig[4], Sig[5],
		PredL, PredS, Pred, Lev, RewardNorm
	));

	// Plots
	plot("FD_TF1",    Sig[0], NEW, 0);
	plot("FD_TF2",    Sig[3], 0, 0);
	plot("Slope_TF1", Sig[1], 0, 0);
	plot("Slope_TF2", Sig[4], 0, 0);
	plot("Vol_TF1",   Sig[2], 0, 0);
	plot("Vol_TF2",   Sig[5], 0, 0);
	plot("Pred",      Pred, 0, 0);
	plot("Lev",       Lev, 0, 0);
	plot("RewardN",   RewardNorm, 0, 0);
}

ScaleWeave Sentinel [Re: TipmyPip] #489119
4 hours ago
TipmyPip Online OP
Member
Joined: Sep 2017
Posts: 184

This sentinel is a two-pace fractal learner that tunes itself while it trades. It observes the market through a fast lens and a slow lens, then teaches those lenses to cooperate instead of competing. A bandit controller selects values for timeframes, feature windows, and risk controls. It balances exploration with exploitation by adding an optimism boost to options sampled less often. It records how combinations behave, not just single settings, so coordination improves over time.

Coordination begins with how the slow lens is represented. The strategy does not learn synergy on the offset used to reach the slow lens; it learns synergy on the derived slow timeframe that is actually used to build the slow price series. The fast timeframe and the offset are combined into that final slow timeframe, and cooperation feedback is stored against the pair formed from the fast choice and the final slow choice. If different offsets create the same slow view, they reinforce the same cooperation signal, which keeps learning focused on the true slow scale and keeps credit assignment clean.
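
A sketch of that keying, equivalent to tf2ValueFromArms() and tf2IndexFromValue() in the listing below:

// Sketch: fast timeframe + offset collapse into one derived slow timeframe,
// so different offsets that reach the same slow view share one table row.
int slowRow(int TF1, int TF2d)
{
	int TF2 = TF1 + TF2d;
	if(TF2 > TF_MAX) TF2 = TF_MAX;
	if(TF2 < 1) TF2 = 1;
	return TF2 - 1;   // 0-based index into the Qpair_TF12 / Qpair_TF2_* tables
}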

The fast lens can also revise itself after the slow lens is chosen. This is a negotiation step. The fast side re-evaluates candidates given the slow decision and the cooperation memories, and it may switch to a better-matching fast setting. The revision is applied only sometimes, which preserves randomness and prevents early lock-in. This back-and-forth approximates joint search without enumerating every full combination.

A second communication channel ties the slow timeframe to the slow side window sizes. Window lengths determine how much history the slow lens needs before its features become stable, and they also determine how long warmup lasts. Very large windows combined with very slow sampling can cause long delays, fewer learning episodes, and brittle behavior. Instead of rejecting and resampling until a safe combination appears, the strategy applies a safety cost during selection. When a candidate pairing implies excessive history demand, its score is reduced, steering choices toward feasible regions without hard bans.
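
As a sketch, the soft cost applied during selection (safetyPenalty() in the listing, weighted by ETA_SAFE) looks like this:

// Sketch: instead of rejecting an unsafe combination, its selection score is
// reduced in proportion to how far it overshoots the warmup budget.
var softSafetyCost(int TF2, int maxWindow2)
{
	var excess = (var)(TF2*maxWindow2 - SAFE_LIMIT);
	if(excess <= 0) return 0;                 // feasible: no cost
	return ETA_SAFE*excess/SAFE_LIMIT;        // subtracted from the candidate's score
}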

To make this interaction explicit, the strategy keeps a dedicated memory table keyed by the slow timeframe and a binned summary of the largest slow window. The summary is the maximum among the slow fractal window, the slow slope window, and the slow volatility window. The table learns which pairings are productive from realized trade outcomes, capturing patterns such as slow sampling working best with modest windows, or moderate sampling supporting broader context without starving the learner of episodes.

Coordination is also improved at the feature level. Alongside the core fractal, slope, and volatility features from both lenses, the strategy adds three alignment tags. One tag describes trend agreement, reporting whether the fast and slow slopes point the same way, point opposite ways, or show no clear direction. A second tag describes volatility agreement, reporting whether the slow view is calmer or noisier relative to the fast view. A third tag describes structural contrast, reporting whether the slow fractal texture looks smoother or more intricate than the fast one.
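
The run() portion of this listing is cut off in this excerpt, so the exact construction of the three tags is not shown here. A minimal sketch consistent with the description above, the CSV header order (FD1, Slope1, Vol1, FD2, Slope2, Vol2, Aslope, Asigma, AFD) and the sign3() helper in the listing would be:

// Hedged sketch of the alignment meta-features Aslope, Asigma, AFD placed in Sig[6..8];
// the exact scaling in the original code may differ.
void addAlignmentTags(vars Sig)
{
	Sig[6] = sign3(Sig[1])*sign3(Sig[4]);  // trend agreement: +1 same direction, -1 opposite, 0 flat
	Sig[7] = sign3(Sig[5] - Sig[2]);       // volatility contrast: +1 slow lens noisier, -1 calmer
	Sig[8] = sign3(Sig[3] - Sig[0]);       // fractal contrast: +1 slow texture more intricate, -1 smoother
}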

A further channel links the slow timeframe to the holding horizon. Rewards are normalized by holding duration, and timeframe affects signal frequency, so the best slow sampling pace depends on how long positions are kept. The strategy therefore learns a cooperation table between the slow timeframe and the chosen holding setting. This reduces rhythm failures such as trading too often with long holds, or trading too slowly when the hold setting is short.

Cooperation memories are stored as residual effects. Each single setting has its own running value estimate, and each cooperation table stores only the extra contribution beyond those single values. Pair influence ramps up gradually as joint evidence accumulates, so early noise does not dominate. This keeps the communication signal honest and avoids double counting. Pair memories are updated only when a cycle closes, so the controller learns from outcomes. Rewards are normalized by holding time and clipped to prevent spikes from dominating. This keeps learning stable during training and testing.

During trading, the inner learner converts the feature snapshot into long and short preferences. Position intensity is scaled by a leverage factor, capped by a maximum exposure, and filtered by a threshold so weak signals stay flat. Exits are both signal driven and time driven, creating clear episodes that the controller can score. Over many episodes, ScaleWeave Sentinel becomes a stable coordinator that matches timeframes, windows, and holding rhythm while keeping behavior inside practical limits.

Code
// ============================================================================
// Fractal Learner (Strategy 3) - RL Bandits for Parameters + 2TF ML (RETURNS)
// File: Fractal_EURUSD_2TF_RL_Bandits_v3_TF2_W2_ALIGN_TF2HB_SOFTSAFE.c
//
// Adds requested improvements:
// 1) Communicate on derived TF2 (not TF2d): residual table Qpair_TF12(TF1_arm, TF2_idx)
// 2) Communicate on derived TF2 <-> W2max interaction: Qpair_TF2_W2(TF2_idx, W2bin)
// 3) Alignment meta-features added to ML: Aslope, Asigma, AFD  (Sig size 9)
// 4) Learn TF2 <-> HoldBars: Qpair_TF2_HB(TF2_idx, HoldBars_arm)
// 5) Replace rejection sampling with soft safety penalty in selection scoring
//
// lite-C safe:
// - No ternary operator (plain if/else instead)
// - Header uses multiple file_append calls
// - strf format string is ONE literal
// ============================================================================

#define NPAR 12
#define MAXARMS 64

// Exploration / learning
#define EPSILON 0.10
#define ALPHA   0.10

// Communication strength (base)
#define LAMBDA_PAIR 0.30

// UCB constants
#define C_UCB   0.40
#define CP_UCB  0.40

// Pair gating (minimum samples before full lambda)
#define NMIN_PAIR 25

// Reward normalization/clipping
#define REWARD_CLIP 1000.0

// Safe constraint limit (keeps TF2 + windows from exploding warmup)
#define SAFE_LIMIT 2500

// Soft safety penalty weight (score -= ETA_SAFE * pi)
#define ETA_SAFE  2.0

// TF limits
#define TF_MAX 12

// W2 binning (based on your window2 ranges: 10..160 step 5)
#define W2_MIN  10
#define W2_MAX  160
#define W2_STEP 5
#define W2_BINS 31   // (160-10)/5 + 1 = 31

#define P_INT 0
#define P_VAR 1

// Parameter indices (RL-controlled)
#define P_TF1        0
#define P_TF2D       1
#define P_FDLEN1     2
#define P_SLOPELEN1  3
#define P_VOLLEN1    4
#define P_FDLEN2     5
#define P_SLOPELEN2  6
#define P_VOLLEN2    7
#define P_LEVSCALE   8
#define P_MAXLEV     9
#define P_PREDTHR    10
#define P_HOLDBARS   11

// -----------------------------
// RL storage
// -----------------------------
string ParName[NPAR];

int ParType[NPAR] =
{
	P_INT,  // TF1
	P_INT,  // TF2d
	P_INT,  // FDLen1
	P_INT,  // SlopeLen1
	P_INT,  // VolLen1
	P_INT,  // FDLen2
	P_INT,  // SlopeLen2
	P_INT,  // VolLen2
	P_VAR,  // LevScale
	P_VAR,  // MaxLev
	P_VAR,  // PredThr
	P_INT   // HoldBars
};

var ParMin[NPAR];
var ParMax[NPAR];
var ParStep[NPAR];

var Q[NPAR][MAXARMS];
int Ncnt[NPAR][MAXARMS];
int ArmsCount[NPAR];
int CurArm[NPAR];

// Totals for UCB (avoid summing every time)
int TotCnt[NPAR];

// -----------------------------
// Pairwise "communication" RESIDUAL tables
// -----------------------------

// (A) Derived TF2 communication: TF1_arm × TF2_idx(0..11)
var Qpair_TF12[MAXARMS][TF_MAX];
int Npair_TF12[MAXARMS][TF_MAX];
int TotPair_TF12;

// (B) TF2 <-> W2max bin: TF2_idx(0..11) × W2bin(0..30)
var Qpair_TF2_W2[TF_MAX][W2_BINS];
int Npair_TF2_W2[TF_MAX][W2_BINS];
int TotPair_TF2_W2;

// (C) Existing FDLen1 <-> SlopeLen1
var Qpair_FD1SL1[MAXARMS][MAXARMS];
int Npair_FD1SL1[MAXARMS][MAXARMS];
int TotPair_FD1SL1;

// (D) Existing FDLen2 <-> SlopeLen2
var Qpair_FD2SL2[MAXARMS][MAXARMS];
int Npair_FD2SL2[MAXARMS][MAXARMS];
int TotPair_FD2SL2;

// (E) Existing LevScale <-> PredThr
var Qpair_LS_PT[MAXARMS][MAXARMS];
int Npair_LS_PT[MAXARMS][MAXARMS];
int TotPair_LS_PT;

// (F) Existing MaxLev <-> HoldBars
var Qpair_ML_HB[MAXARMS][MAXARMS];
int Npair_ML_HB[MAXARMS][MAXARMS];
int TotPair_ML_HB;

// (G) NEW: TF2 <-> HoldBars_arm (approx TF2-HB factor)
var Qpair_TF2_HB[TF_MAX][MAXARMS];
int Npair_TF2_HB[TF_MAX][MAXARMS];
int TotPair_TF2_HB;

// -----------------------------
// Utility
// -----------------------------
int calcArms(var mn, var mx, var stp)
{
	if(stp <= 0) return 1;
	int n = (int)floor((mx - mn)/stp + 1.000001);
	if(n < 1) n = 1;
	if(n > MAXARMS) n = MAXARMS;
	return n;
}

var armValue(int p, int a)
{
	var v = ParMin[p] + (var)a * ParStep[p];
	if(v < ParMin[p]) v = ParMin[p];
	if(v > ParMax[p]) v = ParMax[p];
	if(ParType[p] == P_INT) v = (var)(int)(v + 0.5);
	return v;
}

void initParNames()
{
	ParName[P_TF1]       = "TF1";
	ParName[P_TF2D]      = "TF2d";
	ParName[P_FDLEN1]    = "FDLen1";
	ParName[P_SLOPELEN1] = "SlopeLen1";
	ParName[P_VOLLEN1]   = "VolLen1";
	ParName[P_FDLEN2]    = "FDLen2";
	ParName[P_SLOPELEN2] = "SlopeLen2";
	ParName[P_VOLLEN2]   = "VolLen2";
	ParName[P_LEVSCALE]  = "LevScale";
	ParName[P_MAXLEV]    = "MaxLev";
	ParName[P_PREDTHR]   = "PredThr";
	ParName[P_HOLDBARS]  = "HoldBars";
}

// UCB bonus: c * sqrt( ln(1+Tot) / (1+n) )
var ucbBonus(int tot, int n, var c)
{
	var lnT = log(1 + (var)tot);
	return c * sqrt(lnT / (1 + (var)n));
}

// Pair gating: lambda_eff = lambda * min(1, nPair/nMin)
var lambdaEff(int nPair)
{
	var frac = (var)nPair / (var)NMIN_PAIR;
	if(frac > 1) frac = 1;
	if(frac < 0) frac = 0;
	return LAMBDA_PAIR * frac;
}

// Derived TF2 value from TF1_arm and TF2d_arm
int tf2ValueFromArms(int armTF1, int armTF2d)
{
	int TF1v  = (int)armValue(P_TF1, armTF1);
	int TF2dv = (int)armValue(P_TF2D, armTF2d);
	int TF2v = TF1v + TF2dv;
	if(TF2v > TF_MAX) TF2v = TF_MAX;
	if(TF2v < 1) TF2v = 1;
	return TF2v;
}

// TF2 index 0..11
int tf2IndexFromValue(int TF2v)
{
	int idx = TF2v - 1;
	if(idx < 0) idx = 0;
	if(idx >= TF_MAX) idx = TF_MAX - 1;
	return idx;
}

// W2 bin index 0..30 from a window value
int w2BinFromValue(int w)
{
	if(w < W2_MIN) w = W2_MIN;
	if(w > W2_MAX) w = W2_MAX;
	return (int)((w - W2_MIN) / W2_STEP);
}

// Soft safety penalty pi(theta) = max(0, (TF2*W2max - SAFE_LIMIT)/SAFE_LIMIT)
var safetyPenalty(int TF2v, int W2maxv)
{
	var x = (var)(TF2v * W2maxv - SAFE_LIMIT);
	if(x <= 0) return 0;
	return x / (var)SAFE_LIMIT;
}

// sign helper: -1,0,1
var sign3(var x)
{
	if(x > 0) return 1;
	if(x < 0) return -1;
	return 0;
}

// -----------------------------
// Single-arm selection with UCB + epsilon explore
// -----------------------------
int bestArm_UCB(int p)
{
	int a, best = 0;
	var bestScore = Q[p][0] + ucbBonus(TotCnt[p], Ncnt[p][0], C_UCB);

	for(a=1; a<ArmsCount[p]; a++)
	{
		var score = Q[p][a] + ucbBonus(TotCnt[p], Ncnt[p][a], C_UCB);
		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_UCB(int p)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[p]);
	return bestArm_UCB(p);
}

// Update single Q
void updateArm(int p, int a, var reward)
{
	Q[p][a] = Q[p][a] + ALPHA*(reward - Q[p][a]);
	Ncnt[p][a] += 1;
	TotCnt[p] += 1;
}

// -----------------------------
// Residual pair updates
// -----------------------------

// (1) TF1 × TF2_idx residual:
// pred = Q_TF1(a1) + Qpair(a1,tf2)
void updatePairRes_TF12(int aTF1, int tf2Idx, var reward)
{
	var pred = Q[P_TF1][aTF1] + Qpair_TF12[aTF1][tf2Idx];
	Qpair_TF12[aTF1][tf2Idx] = Qpair_TF12[aTF1][tf2Idx] + ALPHA*(reward - pred);
	Npair_TF12[aTF1][tf2Idx] += 1;
	TotPair_TF12 += 1;
}

// (2) TF2_idx × W2bin (standalone interaction; no clean singles for derived vars)
// pred = Qpair(tf2,w2)
void updatePair_TF2_W2(int tf2Idx, int w2bin, var reward)
{
	var pred = Qpair_TF2_W2[tf2Idx][w2bin];
	Qpair_TF2_W2[tf2Idx][w2bin] = Qpair_TF2_W2[tf2Idx][w2bin] + ALPHA*(reward - pred);
	Npair_TF2_W2[tf2Idx][w2bin] += 1;
	TotPair_TF2_W2 += 1;
}

// (3) Existing residuals
void updatePairRes_FD1SL1(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN1][a1] + Q[P_SLOPELEN1][a2] + Qpair_FD1SL1[a1][a2];
	Qpair_FD1SL1[a1][a2] = Qpair_FD1SL1[a1][a2] + ALPHA*(reward - pred);
	Npair_FD1SL1[a1][a2] += 1;
	TotPair_FD1SL1 += 1;
}

void updatePairRes_FD2SL2(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN2][a1] + Q[P_SLOPELEN2][a2] + Qpair_FD2SL2[a1][a2];
	Qpair_FD2SL2[a1][a2] = Qpair_FD2SL2[a1][a2] + ALPHA*(reward - pred);
	Npair_FD2SL2[a1][a2] += 1;
	TotPair_FD2SL2 += 1;
}

void updatePairRes_LS_PT(int a1, int a2, var reward)
{
	var pred = Q[P_LEVSCALE][a1] + Q[P_PREDTHR][a2] + Qpair_LS_PT[a1][a2];
	Qpair_LS_PT[a1][a2] = Qpair_LS_PT[a1][a2] + ALPHA*(reward - pred);
	Npair_LS_PT[a1][a2] += 1;
	TotPair_LS_PT += 1;
}

void updatePairRes_ML_HB(int a1, int a2, var reward)
{
	var pred = Q[P_MAXLEV][a1] + Q[P_HOLDBARS][a2] + Qpair_ML_HB[a1][a2];
	Qpair_ML_HB[a1][a2] = Qpair_ML_HB[a1][a2] + ALPHA*(reward - pred);
	Npair_ML_HB[a1][a2] += 1;
	TotPair_ML_HB += 1;
}

// (4) NEW TF2_idx × HoldBars_arm residual:
// pred = Q_HB(aHB) + Qpair(tf2,aHB)
void updatePairRes_TF2_HB(int tf2Idx, int aHB, var reward)
{
	var pred = Q[P_HOLDBARS][aHB] + Qpair_TF2_HB[tf2Idx][aHB];
	Qpair_TF2_HB[tf2Idx][aHB] = Qpair_TF2_HB[tf2Idx][aHB] + ALPHA*(reward - pred);
	Npair_TF2_HB[tf2Idx][aHB] += 1;
	TotPair_TF2_HB += 1;
}

// -----------------------------
// Pair-aware selection pieces
// -----------------------------

// --- TF2d selection conditioned on TF1, but "listens" via derived TF2 index ---
// score = UCB_TF2d + lambda_eff*Qpair_TF12(TF1,TF2idx) + pairUCB(TF12)
int bestArm_TF2d_given_TF1_TF2(int armTF1)
{
	int a2d, best = 0;

	int TF2v0 = tf2ValueFromArms(armTF1, 0);
	int tf2Idx0 = tf2IndexFromValue(TF2v0);
	int n0 = Npair_TF12[armTF1][tf2Idx0];
	var lam0 = lambdaEff(n0);

	var pairB0 = lam0*Qpair_TF12[armTF1][tf2Idx0] + ucbBonus(TotPair_TF12, n0, CP_UCB);

	var bestScore =
		Q[P_TF2D][0] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][0], C_UCB) + pairB0;

	for(a2d=1; a2d<ArmsCount[P_TF2D]; a2d++)
	{
		int TF2v = tf2ValueFromArms(armTF1, a2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[armTF1][tf2Idx];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF12[armTF1][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		var score =
			Q[P_TF2D][a2d] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][a2d], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a2d;
		}
	}
	return best;
}

int selectArm_TF2d_given_TF1_TF2(int armTF1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF2D]);
	return bestArm_TF2d_given_TF1_TF2(armTF1);
}

// --- TF1 symmetric: anticipates best TF2d response (via TF2 index) ---
int bestArm_TF1_symmetric_TF2()
{
	int a1, best = 0;

	// a1=0 baseline
	var bestPairTerm0 = 0;
	int a2d;
	for(a2d=0; a2d<ArmsCount[P_TF2D]; a2d++)
	{
		int TF2v = tf2ValueFromArms(0, a2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[0][tf2Idx];
		var lam = lambdaEff(n);
		var term = lam*Qpair_TF12[0][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		if(a2d==0) bestPairTerm0 = term;
		else if(term > bestPairTerm0) bestPairTerm0 = term;
	}

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + bestPairTerm0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		var bestPairTerm = 0;

		for(a2d=0; a2d<ArmsCount[P_TF2D]; a2d++)
		{
			int TF2v2 = tf2ValueFromArms(a1, a2d);
			int tf2Idx2 = tf2IndexFromValue(TF2v2);

			int n2 = Npair_TF12[a1][tf2Idx2];
			var lam2 = lambdaEff(n2);
			var term2 = lam2*Qpair_TF12[a1][tf2Idx2] + ucbBonus(TotPair_TF12, n2, CP_UCB);

			if(a2d==0) bestPairTerm = term2;
			else if(term2 > bestPairTerm) bestPairTerm = term2;
		}

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + bestPairTerm;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

int selectArm_TF1_symmetric_TF2()
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF1]);
	return bestArm_TF1_symmetric_TF2();
}

// Coordinate descent refinement: TF1 given TF2d, using derived TF2 index
int bestArm_TF1_given_TF2d_TF2(int armTF2d)
{
	int a1, best = 0;

	int TF2v0 = tf2ValueFromArms(0, armTF2d);
	int tf2Idx0 = tf2IndexFromValue(TF2v0);
	int n0 = Npair_TF12[0][tf2Idx0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_TF12[0][tf2Idx0] + ucbBonus(TotPair_TF12, n0, CP_UCB);

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + pairB0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		int TF2v = tf2ValueFromArms(a1, armTF2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[a1][tf2Idx];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF12[a1][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

// --- TF2 <-> W2max influence + soft safety applied during window2 selection ---

// Score contribution from TF2-W2 table:
var tf2w2Term(int tf2Idx, int w2bin)
{
	int n = Npair_TF2_W2[tf2Idx][w2bin];
	var lam = lambdaEff(n);
	return lam*Qpair_TF2_W2[tf2Idx][w2bin] + ucbBonus(TotPair_TF2_W2, n, CP_UCB);
}

// FDLen2 choice conditioned on TF2 (uses TF2-W2 term + penalty)
int bestArm_FDLen2_given_TF2(int tf2Idx, int TF2v)
{
	int a, best = 0;

	int v0 = (int)armValue(P_FDLEN2, 0);
	int w2bin0 = w2BinFromValue(v0);
	var pen0 = safetyPenalty(TF2v, v0);
	var bestScore =
		Q[P_FDLEN2][0] + ucbBonus(TotCnt[P_FDLEN2], Ncnt[P_FDLEN2][0], C_UCB)
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_FDLEN2]; a++)
	{
		int v = (int)armValue(P_FDLEN2, a);
		int w2bin = w2BinFromValue(v);
		var pen = safetyPenalty(TF2v, v);

		var score =
			Q[P_FDLEN2][a] + ucbBonus(TotCnt[P_FDLEN2], Ncnt[P_FDLEN2][a], C_UCB)
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_FDLen2_given_TF2(int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_FDLEN2]);
	return bestArm_FDLen2_given_TF2(tf2Idx, TF2v);
}

// SlopeLen2 conditioned on FDLen2 AND TF2 (FD2-SL2 pair + TF2-W2 + penalty)
int bestArm_SlopeLen2_given_FD2_TF2(int armFD2, int tf2Idx, int TF2v)
{
	int a, best = 0;

	int vFD = (int)armValue(P_FDLEN2, armFD2);

	// a=0 baseline
	int v0 = (int)armValue(P_SLOPELEN2, 0);
	int mx0 = vFD; if(v0 > mx0) mx0 = v0;
	int w2bin0 = w2BinFromValue(mx0);
	var pen0 = safetyPenalty(TF2v, mx0);

	// FD2-SL2 pair term
	int nP0 = Npair_FD2SL2[armFD2][0];
	var lamP0 = lambdaEff(nP0);
	var fdslTerm0 = lamP0*Qpair_FD2SL2[armFD2][0] + ucbBonus(TotPair_FD2SL2, nP0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN2][0] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][0], C_UCB)
		+ fdslTerm0
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_SLOPELEN2]; a++)
	{
		int v = (int)armValue(P_SLOPELEN2, a);
		int mx = vFD; if(v > mx) mx = v;
		int w2bin = w2BinFromValue(mx);
		var pen = safetyPenalty(TF2v, mx);

		int nP = Npair_FD2SL2[armFD2][a];
		var lamP = lambdaEff(nP);
		var fdslTerm = lamP*Qpair_FD2SL2[armFD2][a] + ucbBonus(TotPair_FD2SL2, nP, CP_UCB);

		var score =
			Q[P_SLOPELEN2][a] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][a], C_UCB)
			+ fdslTerm
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen2_given_FD2_TF2(int armFD2, int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN2]);
	return bestArm_SlopeLen2_given_FD2_TF2(armFD2, tf2Idx, TF2v);
}

// VolLen2 conditioned on current W2max and TF2 (TF2-W2 + penalty)
int bestArm_VolLen2_given_TF2(int curMax, int tf2Idx, int TF2v)
{
	int a, best = 0;

	int v0 = (int)armValue(P_VOLLEN2, 0);
	int mx0 = curMax; if(v0 > mx0) mx0 = v0;
	int w2bin0 = w2BinFromValue(mx0);
	var pen0 = safetyPenalty(TF2v, mx0);

	var bestScore =
		Q[P_VOLLEN2][0] + ucbBonus(TotCnt[P_VOLLEN2], Ncnt[P_VOLLEN2][0], C_UCB)
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_VOLLEN2]; a++)
	{
		int v = (int)armValue(P_VOLLEN2, a);
		int mx = curMax; if(v > mx) mx = v;
		int w2bin = w2BinFromValue(mx);
		var pen = safetyPenalty(TF2v, mx);

		var score =
			Q[P_VOLLEN2][a] + ucbBonus(TotCnt[P_VOLLEN2], Ncnt[P_VOLLEN2][a], C_UCB)
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_VolLen2_given_TF2(int curMax, int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_VOLLEN2]);
	return bestArm_VolLen2_given_TF2(curMax, tf2Idx, TF2v);
}

// --- Existing FDLen1 <-> SlopeLen1 (unchanged) ---
int bestArm_SlopeLen1_given_FDLen1(int armFD1)
{
	int a, best = 0;

	int n0 = Npair_FD1SL1[armFD1][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_FD1SL1[armFD1][0] + ucbBonus(TotPair_FD1SL1, n0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN1][0] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_SLOPELEN1]; a++)
	{
		int n = Npair_FD1SL1[armFD1][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_FD1SL1[armFD1][a] + ucbBonus(TotPair_FD1SL1, n, CP_UCB);

		var score =
			Q[P_SLOPELEN1][a] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen1_given_FDLen1(int armFD1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN1]);
	return bestArm_SlopeLen1_given_FDLen1(armFD1);
}

// --- LevScale -> PredThr (unchanged) ---
int bestArm_PredThr_given_LevScale(int armLS)
{
	int a, best = 0;

	int n0 = Npair_LS_PT[armLS][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_LS_PT[armLS][0] + ucbBonus(TotPair_LS_PT, n0, CP_UCB);

	var bestScore =
		Q[P_PREDTHR][0] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_PREDTHR]; a++)
	{
		int n = Npair_LS_PT[armLS][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_LS_PT[armLS][a] + ucbBonus(TotPair_LS_PT, n, CP_UCB);

		var score =
			Q[P_PREDTHR][a] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_PredThr_given_LevScale(int armLS)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_PREDTHR]);
	return bestArm_PredThr_given_LevScale(armLS);
}

// --- HoldBars selection conditioned on MaxLev AND TF2_idx (adds TF2-HB communication) ---
int bestArm_HoldBars_given_MaxLev_TF2(int armML, int tf2Idx)
{
	int a, best = 0;

	// ML-HB term for a=0
	int n0 = Npair_ML_HB[armML][0];
	var lam0 = lambdaEff(n0);
	var mlhb0 = lam0*Qpair_ML_HB[armML][0] + ucbBonus(TotPair_ML_HB, n0, CP_UCB);

	// TF2-HB term for a=0
	int nT0 = Npair_TF2_HB[tf2Idx][0];
	var lamT0 = lambdaEff(nT0);
	var tf2hb0 = lamT0*Qpair_TF2_HB[tf2Idx][0] + ucbBonus(TotPair_TF2_HB, nT0, CP_UCB);

	var bestScore =
		Q[P_HOLDBARS][0] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][0], C_UCB)
		+ mlhb0 + tf2hb0;

	for(a=1; a<ArmsCount[P_HOLDBARS]; a++)
	{
		int n1 = Npair_ML_HB[armML][a];
		var lam1 = lambdaEff(n1);
		var mlhb = lam1*Qpair_ML_HB[armML][a] + ucbBonus(TotPair_ML_HB, n1, CP_UCB);

		int nT = Npair_TF2_HB[tf2Idx][a];
		var lamT = lambdaEff(nT);
		var tf2hb = lamT*Qpair_TF2_HB[tf2Idx][a] + ucbBonus(TotPair_TF2_HB, nT, CP_UCB);

		var score =
			Q[P_HOLDBARS][a] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][a], C_UCB)
			+ mlhb + tf2hb;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_HoldBars_given_MaxLev_TF2(int armML, int tf2Idx)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_HOLDBARS]);
	return bestArm_HoldBars_given_MaxLev_TF2(armML, tf2Idx);
}

// -----------------------------
// Init ranges + tables
// -----------------------------
void initParamsRL()
{
	// Ranges roughly matching your old optimize() ranges/steps
	ParMin[P_TF1]        = 1;    ParMax[P_TF1]        = 3;    ParStep[P_TF1]        = 1;
	ParMin[P_TF2D]       = 1;    ParMax[P_TF2D]       = 11;   ParStep[P_TF2D]       = 1;

	ParMin[P_FDLEN1]     = 20;   ParMax[P_FDLEN1]     = 220;  ParStep[P_FDLEN1]     = 5;
	ParMin[P_SLOPELEN1]  = 20;   ParMax[P_SLOPELEN1]  = 200;  ParStep[P_SLOPELEN1]  = 5;
	ParMin[P_VOLLEN1]    = 20;   ParMax[P_VOLLEN1]    = 200;  ParStep[P_VOLLEN1]    = 5;

	ParMin[P_FDLEN2]     = 10;   ParMax[P_FDLEN2]     = 160;  ParStep[P_FDLEN2]     = 5;
	ParMin[P_SLOPELEN2]  = 10;   ParMax[P_SLOPELEN2]  = 140;  ParStep[P_SLOPELEN2]  = 5;
	ParMin[P_VOLLEN2]    = 10;   ParMax[P_VOLLEN2]    = 140;  ParStep[P_VOLLEN2]    = 5;

	ParMin[P_LEVSCALE]   = 2;    ParMax[P_LEVSCALE]   = 30;   ParStep[P_LEVSCALE]   = 1;
	ParMin[P_MAXLEV]     = 0.1;  ParMax[P_MAXLEV]     = 1.0;  ParStep[P_MAXLEV]     = 0.1;
	ParMin[P_PREDTHR]    = 0.0;  ParMax[P_PREDTHR]    = 0.20; ParStep[P_PREDTHR]    = 0.01;
	ParMin[P_HOLDBARS]   = 1;    ParMax[P_HOLDBARS]   = 30;   ParStep[P_HOLDBARS]   = 1;

	int p, a;
	for(p=0; p<NPAR; p++)
	{
		ArmsCount[p] = calcArms(ParMin[p], ParMax[p], ParStep[p]);
		CurArm[p] = 0;
		TotCnt[p] = 0;
		for(a=0; a<ArmsCount[p]; a++)
		{
			Q[p][a] = 0;
			Ncnt[p][a] = 0;
		}
	}

	// Init pair tables
	int i,j;

	TotPair_TF12 = 0;
	TotPair_TF2_W2 = 0;
	TotPair_FD1SL1 = 0;
	TotPair_FD2SL2 = 0;
	TotPair_LS_PT = 0;
	TotPair_ML_HB = 0;
	TotPair_TF2_HB = 0;

	// TF12 table
	for(i=0; i<MAXARMS; i++)
	{
		for(j=0; j<TF_MAX; j++)
		{
			Qpair_TF12[i][j] = 0;
			Npair_TF12[i][j] = 0;
		}
	}

	// TF2-W2 table
	for(i=0; i<TF_MAX; i++)
	{
		for(j=0; j<W2_BINS; j++)
		{
			Qpair_TF2_W2[i][j] = 0;
			Npair_TF2_W2[i][j] = 0;
		}
	}

	// Large MAXARMS×MAXARMS tables
	for(i=0; i<MAXARMS; i++)
	{
		for(j=0; j<MAXARMS; j++)
		{
			Qpair_FD1SL1[i][j] = 0;  Npair_FD1SL1[i][j] = 0;
			Qpair_FD2SL2[i][j] = 0;  Npair_FD2SL2[i][j] = 0;
			Qpair_LS_PT[i][j]  = 0;  Npair_LS_PT[i][j]  = 0;
			Qpair_ML_HB[i][j]  = 0;  Npair_ML_HB[i][j]  = 0;
		}
	}

	// TF2-HB table: TF_MAX × MAXARMS
	for(i=0; i<TF_MAX; i++)
	{
		for(j=0; j<MAXARMS; j++)
		{
			Qpair_TF2_HB[i][j] = 0;
			Npair_TF2_HB[i][j] = 0;
		}
	}
}

// -----------------------------
// Parameter pick order (TF2-aware, TF2-W2 aware, TF2-HB aware, soft safety)
// -----------------------------
void pickParams()
{
	int p;

	// 1) TF1 symmetric pick (anticipate TF2d best response) using derived TF2 table
	CurArm[P_TF1] = selectArm_TF1_symmetric_TF2();

	// 2) TF2d conditioned on TF1 using derived TF2 communication
	CurArm[P_TF2D] = selectArm_TF2d_given_TF1_TF2(CurArm[P_TF1]);

	// 3) Coordinate descent: re-pick TF1 given TF2d
	if(random(1) >= EPSILON)
		CurArm[P_TF1] = bestArm_TF1_given_TF2d_TF2(CurArm[P_TF2D]);

	// Derived TF2 for the episode (used by TF2-side window selection and TF2-HB)
	int TF2v = tf2ValueFromArms(CurArm[P_TF1], CurArm[P_TF2D]);
	int tf2Idx = tf2IndexFromValue(TF2v);

	// 4) FD/Slope pair TF1
	CurArm[P_FDLEN1]    = selectArm_UCB(P_FDLEN1);
	CurArm[P_SLOPELEN1] = selectArm_SlopeLen1_given_FDLen1(CurArm[P_FDLEN1]);

	// 5) TF2-side windows are TF2-aware (TF2-W2) + FD2-SL2 communication + soft safety
	CurArm[P_FDLEN2]    = selectArm_FDLen2_given_TF2(tf2Idx, TF2v);
	CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FD2_TF2(CurArm[P_FDLEN2], tf2Idx, TF2v);

	// current max after FD2+SL2
	int vFD2 = (int)armValue(P_FDLEN2, CurArm[P_FDLEN2]);
	int vSL2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
	int curMax = vFD2; if(vSL2 > curMax) curMax = vSL2;

	CurArm[P_VOLLEN2]   = selectArm_VolLen2_given_TF2(curMax, tf2Idx, TF2v);

	// 6) Remaining independent params (except the paired ones we do later)
	for(p=0; p<NPAR; p++)
	{
		if(p == P_TF1) continue;
		if(p == P_TF2D) continue;
		if(p == P_FDLEN1) continue;
		if(p == P_SLOPELEN1) continue;
		if(p == P_FDLEN2) continue;
		if(p == P_SLOPELEN2) continue;
		if(p == P_VOLLEN2) continue;
		CurArm[p] = selectArm_UCB(p);
	}

	// 7) LevScale -> PredThr conditioned
	CurArm[P_LEVSCALE] = selectArm_UCB(P_LEVSCALE);
	CurArm[P_PREDTHR]  = selectArm_PredThr_given_LevScale(CurArm[P_LEVSCALE]);

	// 8) MaxLev -> HoldBars conditioned AND TF2-HB communication
	CurArm[P_MAXLEV]   = selectArm_UCB(P_MAXLEV);
	CurArm[P_HOLDBARS] = selectArm_HoldBars_given_MaxLev_TF2(CurArm[P_MAXLEV], tf2Idx);
}

// -----------------------------
// Feature helpers (lite-C safe)
// -----------------------------
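// Katz fractal dimension estimate on the newest N values of a series:
// FD = log(N) / (log(N) + log(d/L)), where L is the summed bar-to-bar
// path length and d is the largest distance of any point from the first
// point; the result is clamped to the theoretical range [1,2].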
function fractalDimKatz(vars P, int N)
{
	if(N < 2) return 1.0;

	var L = 0;
	int i;
	for(i=0; i<N-1; i++)
		L += abs(P[i] - P[i+1]);

	var d = 0;
	for(i=1; i<N; i++)
	{
		var di = abs(P[i] - P[0]);
		if(di > d) d = di;
	}

	if(L <= 0 || d <= 0) return 1.0;

	var n  = (var)N;
	var fd = log(n) / (log(n) + log(d / L));
	return clamp(fd, 1.0, 2.0);
}
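
// Ordinary least-squares slope of the newest N values of P against their
// series index (index 0 = most recent bar); returns 0 for degenerate input.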

function linSlope(vars P, int N)
{
	if(N < 2) return 0;

	var sumT=0, sumP=0, sumTT=0, sumTP=0;
	int i;
	for(i=0; i<N; i++)
	{
		var t = (var)i;
		sumT  += t;
		sumP  += P[i];
		sumTT += t*t;
		sumTP += t*P[i];
	}

	var denom = (var)N*sumTT - sumT*sumT;
	if(abs(denom) < 1e-12) return 0;

	return ((var)N*sumTP - sumT*sumP) / denom;
}

function stdevReturns(vars R, int N)
{
	if(N < 2) return 0;

	var mean = 0;
	int i;
	for(i=0; i<N; i++) mean += R[i];
	mean /= (var)N;

	var v = 0;
	for(i=0; i<N; i++)
	{
		var d = R[i] - mean;
		v += d*d;
	}
	v /= (var)(N-1);

	return sqrt(max(0, v));
}

// ============================================================================
// RUN
// ============================================================================
function run()
{
	BarPeriod = 15;
	StartDate = 20100101;
	EndDate   = 0;

	set(PLOTNOW|RULES|LOGFILE);

	asset("EUR/USD");
	algo("FRACTAL2TF");

	var eps = 1e-12;
	DataSplit = 50;

	LookBack = 3000;

	// One-time init
	static int Inited = 0;
	static int PrevOpenTotal = 0;
	static var LastBalance = 0;
	static int Flip = 0;

	string LogFN = "Log\\FRACTAL2TF.csv";

	if(is(FIRSTINITRUN))
	{
		file_delete(LogFN);

		file_append(LogFN,"Date,Time,Mode,Bar,");
		file_append(LogFN,"TF1,TF2,TF2d,FDLen1,SlopeLen1,VolLen1,FDLen2,SlopeLen2,VolLen2,");
		file_append(LogFN,"LevScale,MaxLev,PredThr,HoldBars,");
		file_append(LogFN,"FD1,Slope1,Vol1,FD2,Slope2,Vol2,Aslope,Asigma,AFD,");
		file_append(LogFN,"PredL,PredS,Pred,Lev,RewardNorm,TF2idx,W2max,W2bin\n");

		Inited = 0;
		PrevOpenTotal = 0;
		LastBalance = 0;
		Flip = 0;
	}

	if(!Inited)
	{
		initParNames();
		initParamsRL();
		pickParams();

		LastBalance = Balance;
		PrevOpenTotal = NumOpenTotal;

		Inited = 1;
	}

	// Convert chosen arms -> parameter values (current episode)
	int TF1 = (int)armValue(P_TF1, CurArm[P_TF1]);
	int TF2d = (int)armValue(P_TF2D, CurArm[P_TF2D]);
	int TF2 = TF1 + TF2d;
	if(TF2 > TF_MAX) TF2 = TF_MAX;

	int FDLen1    = (int)armValue(P_FDLEN1,    CurArm[P_FDLEN1]);
	int SlopeLen1 = (int)armValue(P_SLOPELEN1, CurArm[P_SLOPELEN1]);
	int VolLen1   = (int)armValue(P_VOLLEN1,   CurArm[P_VOLLEN1]);

	int FDLen2    = (int)armValue(P_FDLEN2,    CurArm[P_FDLEN2]);
	int SlopeLen2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
	int VolLen2   = (int)armValue(P_VOLLEN2,   CurArm[P_VOLLEN2]);

	var LevScale  = armValue(P_LEVSCALE, CurArm[P_LEVSCALE]);
	var MaxLev    = armValue(P_MAXLEV,   CurArm[P_MAXLEV]);
	var PredThr   = armValue(P_PREDTHR,  CurArm[P_PREDTHR]);
	int HoldBars  = (int)armValue(P_HOLDBARS, CurArm[P_HOLDBARS]);

	// Derived indices for logging + pair updates
	int tf2Idx = tf2IndexFromValue(TF2);

	int W2max = FDLen2;
	if(SlopeLen2 > W2max) W2max = SlopeLen2;
	if(VolLen2 > W2max) W2max = VolLen2;
	int W2bin = w2BinFromValue(W2max);

	// Build series (2 TF)
	TimeFrame = TF1;
	vars P1 = series(priceClose());
	vars R1 = series(log(max(eps,P1[0]) / max(eps,P1[1])));

	vars FD1S    = series(0);
	vars Slope1S = series(0);
	vars Vol1S   = series(0);

	TimeFrame = TF2;
	vars P2 = series(priceClose());
	vars R2 = series(log(max(eps,P2[0]) / max(eps,P2[1])));

	vars FD2S    = series(0);
	vars Slope2S = series(0);
	vars Vol2S   = series(0);

	TimeFrame = 1;

	// Warmup gate based on current episode params
	int Need1 = max(max(FDLen1, SlopeLen1), VolLen1) + 5;
	int Need2 = max(max(FDLen2, SlopeLen2), VolLen2) + 5;
	int WarmupBars = max(TF1*Need1, TF2*Need2) + 10;

	if(Bar < WarmupBars)
		return;

	// Do NOT block TRAIN during LOOKBACK
	if(is(LOOKBACK) && !Train)
		return;

	// Compute features
	TimeFrame = TF1;
	FD1S[0]    = fractalDimKatz(P1, FDLen1);
	Slope1S[0] = linSlope(P1, SlopeLen1);
	Vol1S[0]   = stdevReturns(R1, VolLen1);

	TimeFrame = TF2;
	FD2S[0]    = fractalDimKatz(P2, FDLen2);
	Slope2S[0] = linSlope(P2, SlopeLen2);
	Vol2S[0]   = stdevReturns(R2, VolLen2);

	TimeFrame = 1;

	// Alignment meta-features (Improvement 3)
	var Aslope = sign3(Slope1S[0]) * sign3(Slope2S[0]);          // -1,0,1
	var Asigma = Vol2S[0] / (Vol1S[0] + eps);                    // ratio
	var AFD    = FD2S[0] - FD1S[0];                              // contrast

	// Feature vector for ML: now 9 features
	var Sig[9];
	Sig[0] = FD1S[0];
	Sig[1] = Slope1S[0];
	Sig[2] = Vol1S[0];
	Sig[3] = FD2S[0];
	Sig[4] = Slope2S[0];
	Sig[5] = Vol2S[0];
	Sig[6] = Aslope;
	Sig[7] = Asigma;
	Sig[8] = AFD;

	// Trading logic
	int MethodBase = PERCEPTRON + FUZZY + BALANCED;
	int MethodRet  = MethodBase + RETURNS;

	var PredL=0, PredS=0, Pred=0, Lev=0;

	// time-based exit
	if(NumOpenTotal > 0)
		for(open_trades)
			if(TradeIsOpen && TradeBars >= HoldBars)
				exitTrade(ThisTrade);

	if(Train)
	{
		// Forced alternating trades so ML always gets samples
		if(NumOpenTotal == 0)
		{
			Flip = 1 - Flip;
			LastBalance = Balance;

			if(Flip)
			{
				adviseLong(MethodRet, 0, Sig, 9);
				Lots = 1; enterLong();
			}
			else
			{
				adviseShort(MethodRet, 0, Sig, 9);
				Lots = 1; enterShort();
			}
		}
	}
	else
	{
		PredL = adviseLong(MethodBase, 0, Sig, 9);
		PredS = adviseShort(MethodBase, 0, Sig, 9);

		// Bootstrap if model has no signal yet
		if(NumOpenTotal == 0 && PredL == 0 && PredS == 0)
		{
			LastBalance = Balance;
			var s = Sig[1] + Sig[4];
			if(s > 0) { Lots=1; enterLong(); }
			else if(s < 0) { Lots=1; enterShort(); }
			else
			{
				if(random(1) < 0.5) { Lots=1; enterLong(); }
				else                { Lots=1; enterShort(); }
			}
		}
		else
		{
			Pred = PredL - PredS;
			Lev  = clamp(Pred * LevScale, -MaxLev, MaxLev);

			if(Lev > PredThr)       { exitShort(); Lots=1; enterLong();  }
			else if(Lev < -PredThr) { exitLong();  Lots=1; enterShort(); }
			else                    { exitLong();  exitShort(); }
		}
	}

	// RL reward + update (episode ends when we go from having positions to flat)
	var RewardNorm = 0;

	if(PrevOpenTotal > 0 && NumOpenTotal == 0)
	{
		var dBal = Balance - LastBalance;

		// reward normalization: per (1+HoldBars)
		RewardNorm = dBal / (1 + (var)HoldBars);

		// clip reward
		if(RewardNorm >  REWARD_CLIP) RewardNorm =  REWARD_CLIP;
		if(RewardNorm < -REWARD_CLIP) RewardNorm = -REWARD_CLIP;

		// Update singles (always)
		int p;
		for(p=0; p<NPAR; p++)
			updateArm(p, CurArm[p], RewardNorm);

		// Derived TF2 index (based on CURRENT arms)
		int tf2v_now = tf2ValueFromArms(CurArm[P_TF1], CurArm[P_TF2D]);
		int tf2Idx_now = tf2IndexFromValue(tf2v_now);

		// W2max bin (based on CURRENT arms)
		int FD2v = (int)armValue(P_FDLEN2, CurArm[P_FDLEN2]);
		int SL2v = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
		int VL2v = (int)armValue(P_VOLLEN2, CurArm[P_VOLLEN2]);
		int w2mx = FD2v;
		if(SL2v > w2mx) w2mx = SL2v;
		if(VL2v > w2mx) w2mx = VL2v;
		int w2bin_now = w2BinFromValue(w2mx);

		// Update derived communication tables
		updatePairRes_TF12(CurArm[P_TF1], tf2Idx_now, RewardNorm);
		updatePair_TF2_W2(tf2Idx_now, w2bin_now, RewardNorm);
		updatePairRes_TF2_HB(tf2Idx_now, CurArm[P_HOLDBARS], RewardNorm);

		// Update existing residual pairs
		updatePairRes_FD1SL1(CurArm[P_FDLEN1], CurArm[P_SLOPELEN1], RewardNorm);
		updatePairRes_FD2SL2(CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], RewardNorm);
		updatePairRes_LS_PT(CurArm[P_LEVSCALE], CurArm[P_PREDTHR], RewardNorm);
		updatePairRes_ML_HB(CurArm[P_MAXLEV], CurArm[P_HOLDBARS], RewardNorm);

		// Next episode parameters
		pickParams();
		LastBalance = Balance;
	}

	PrevOpenTotal = NumOpenTotal;

	// Logging
	string ModeStr = "Trade";
	if(Train) ModeStr = "Train";
	else if(Test) ModeStr = "Test";

	file_append(LogFN, strf("%04i-%02i-%02i,%02i:%02i,%s,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%.6f,%.3f,%.6f,%d,%.6f,%.8f,%.8f,%.6f,%.8f,%.8f,%.8f,%.6f,%.6f,%.6f,%.6f,%.6f,%.6f,%.6f,%d,%d,%d\n",
		year(0),month(0),day(0), hour(0),minute(0),
		ModeStr, Bar,
		TF1, TF2, TF2d,
		FDLen1, SlopeLen1, VolLen1,
		FDLen2, SlopeLen2, VolLen2,
		LevScale, MaxLev, PredThr, HoldBars,
		Sig[0], Sig[1], Sig[2], Sig[3], Sig[4], Sig[5],
		Sig[6], Sig[7], Sig[8],
		PredL, PredS, Pred, Lev, RewardNorm,
		tf2Idx, W2max, W2bin
	));

	// Plots
	plot("FD_TF1",    Sig[0], NEW, 0);
	plot("FD_TF2",    Sig[3], 0, 0);
	plot("Slope_TF1", Sig[1], 0, 0);
	plot("Slope_TF2", Sig[4], 0, 0);
	plot("Vol_TF1",   Sig[2], 0, 0);
	plot("Vol_TF2",   Sig[5], 0, 0);
	plot("Aslope",    Sig[6], 0, 0);
	plot("Asigma",    Sig[7], 0, 0);
	plot("AFD",       Sig[8], 0, 0);
	plot("Pred",      Pred, 0, 0);
	plot("Lev",       Lev, 0, 0);
	plot("RewardN",   RewardNorm, 0, 0);
}

Runge–Kutta methods [Re: TipmyPip] #489121
1 hour ago
1 hour ago
Joined: Sep 2017
Posts: 184
TipmyPip Online OP
Member
TipmyPip  Online OP
Member

Joined: Sep 2017
Posts: 184
Stage-relabeling invariance for Runge–Kutta methods

In normal Runge–Kutta / numerical methods language, the idea is usually described like this:

1) Runge–Kutta stage reordering does not change the method

An explicit Runge–Kutta method works by doing several internal “sub-steps” (called stages) inside one time step.
The theorem says:

If you relabel or reorder those stages (stage 1 becomes stage 2, stage 2 becomes stage 1, etc.)

and you also rearrange the method's coefficient tables to match that new ordering, then the method still produces the same final result for the step.

So even though the internal bookkeeping changes, the actual computed step result does not change (except for tiny rounding differences).

This is commonly called stage permutation invariance, or invariance under stage relabeling.
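
In standard tableau notation (not spelled out in the post), the relabeling is c'[i] = c[p[i]], b'[i] = b[p[i]] and A'[i][j] = A[p[i]][p[j]], where p maps the new stage labels to the old ones; the claim is that a step computed with (A', b', c') gives the same result as a step computed with (A, b, c).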

2) This still stays true when you chain methods

Your code doesn’t just run one RK method; it chains two of them: Run method 2 for one step, then run method 1 for one step

The theorem also says:

If you replace method 1 with an equivalent "stage-reordered version", and/or replace method 2 with an equivalent "stage-reordered version", then the chained result is still the same.

This is often described as:

Composition is compatible with stage reordering, or more formally: composition is well-defined under the equivalence.

If you want one “research-style umbrella description” (still readable):

Reordering stages does not change RK results, and it also does not change the results of chaining RK methods.
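
Written as one-step maps: if Phi1 and Phi2 are the two methods' step maps and Phi1', Phi2' their stage-relabeled versions, the chained claim is Phi1(Phi2(y)) = Phi1'(Phi2'(y)) for every state y, time t, and step size h.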

Short introduction to theorem_382A (in terms of your code)

Your code is testing this exact claim:

You build two normal RK methods:

Heun RK2 (2 stages)
Classical RK4 (4 stages)

You build reordered versions of the same methods by shuffling their stage order, while also shuffling the tables correctly.

You run both versions on the same ODE step and compare:

Original chain: RK4 step then Heun step

Reordered chain: permuted RK4 step then permuted Heun step

The theorem predicts:

The two final answers should match (tiny floating-point rounding differences are okay)

So the code shows you can reorder internal stages for implementation reasons (like memory layout or simpler storage) without changing what the method computes.

How demo_theorem_382A_5cases_main.c shows this, step by step
Step 1 — Build the original RK methods

Your code creates two well-known methods:

make_heun_rk2() fills M1_s, M1_A, M1_b, M1_c
This represents Heun’s RK2.

make_rk4() fills M2_s, M2_A, M2_b, M2_c
This represents classical RK4.

Each method is stored in the standard RK “tableau” parts:

s = how many stages the method has

c[] = where each stage evaluates time inside the step

b[] = how the final result mixes the stage derivatives

A = how later stages use earlier stage derivatives

Important detail in your implementation:
You store only the lower triangle of A (Atri), because in an explicit RK method a stage never depends on itself or on later stages.

That’s why you use:

Aidx(i,j), getA(), setA() for triangular indexing.
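
As a quick check of that indexing: a 4-stage explicit method stores 1+2+3 = 6 strictly-lower entries (which is why TRI_A is 6), and Aidx(i,j) = i*(i-1)/2 + j, so Aidx(1,0) = 0 is the first stored slot and Aidx(3,2) = 3 + 2 = 5 is the last.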

Step 2 — Build equivalent “stage-reordered” versions

Inside buildAll() you define permutations:

p1 reorders Heun’s stages

p2 reorders RK4’s stages

Then permute_method_into(...) produces:

M1B_* from M1_*

M2B_* from M2_*

What that function does (in plain words):

It rearranges b[] and c[] so stage i in the new method corresponds to the correct original stage.

It also rearranges the stored A entries so that “stage i depends on stage j” is preserved under the new labels.
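
For example, with Heun's tableau (b = {0.5, 0.5}, c = {0.0, 1.0}) and the permutation p1 = {1, 0} used in buildAll(), the copy loop gives bout = {b[1], b[0]} = {0.5, 0.5} and cout = {c[1], c[0]} = {1.0, 0.0}, and the stored A entries are carried over through the same index mapping via setA(Aout, i, j, getA(A, p[i], p[j])).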

So the permuted versions are not “different methods.”
They are the same methods written with a different stage numbering.

That is exactly what theorem_382A is about.

Step 3 — Define composition exactly as the theorem describes

The theorem talks about chaining methods.

Your code implements chaining here:

var compose_step(...)
{
  var y_mid = rk_step(method2, ...);
  var y_out = rk_step(method1, ..., y_mid, ...);
  return y_out;
}


So composition means:

Apply method 2 for one step → get intermediate result

Apply method 1 for one step → get final result

Then your comparison is:

yA = composition using original methods (M1 and M2)

yB = composition using permuted methods (M1B and M2B)

The theorem predicts: yA and yB should match.

Step 4 — Test it on 5 different ODE examples

run_case(k, ...) sets G_CASE = k, which changes what f_ode(t,y) returns.

You test five different behaviors: simple decay, simple growth, faster decay (stiffer), nonlinear growth with saturation (logistic), and a time-dependent forcing term.

This shows the theorem isn't only working for one "easy" equation; it works across different ODE styles.

Step 5 — What the printed output means

Each test prints the yA result (original chain), the yB result (permuted chain), and diff = how far apart they are.
If theorem_382A is correct (and the permutation is applied correctly), diff should be extremely small, often near machine precision.

Code
// demo_theorem_382A_5cases_main.c (Zorro lite-C)
// Demonstrates theorem_382A: stage-permuted RK methods are equivalent.

#define MAXS 4
#define TRI_A 6

static var G_k[MAXS];

// ---- RK tableaus (Heun2 and RK4) + permuted versions ----
static int   M1_s = 0; static float M1_A[TRI_A]; static float M1_b[MAXS]; static float M1_c[MAXS];
static int   M2_s = 0; static float M2_A[TRI_A]; static float M2_b[MAXS]; static float M2_c[MAXS];
static int   M1B_s = 0; static float M1B_A[TRI_A]; static float M1B_b[MAXS]; static float M1B_c[MAXS];
static int   M2B_s = 0; static float M2B_A[TRI_A]; static float M2B_b[MAXS]; static float M2B_c[MAXS];

// ---- selects which ODE example to run ----
static int G_CASE = 0;

int Aidx(int i, int j)
{
  return (i*(i-1))/2 + j;
}

var getA(float* Atri, int i, int j)
{
  if(i <= j) return 0;
  return (var)Atri[Aidx(i,j)];
}

void setA(float* Atri, int i, int j, var v)
{
  if(i <= j) return;
  Atri[Aidx(i,j)] = (float)v;
}

void zeroTableau(int* s, float* Atri, float* b, float* c)
{
  int i;
  *s = 0;
  for(i=0;i<TRI_A;i++) Atri[i] = 0;
  for(i=0;i<MAXS;i++) { b[i]=0; c[i]=0; }
}

// --------- 5 different ODEs (scalar) ---------
// Case 1: y' = -y                      (stable linear)
// Case 2: y' =  y                      (unstable linear)
// Case 3: y' = -10y                    (stiffer linear)
// Case 4: y' = y*(1-y)                 (logistic nonlinear)
// Case 5: y' = -2y + sin(t)            (time-dependent forcing)
var f_ode(var t, var y)
{
  if(G_CASE == 1) return -y;
  if(G_CASE == 2) return  y;
  if(G_CASE == 3) return -10*y;
  if(G_CASE == 4) return y*(1-y);
  // case 5
  return -2*y + sin(t);
}

var rk_step(int s, float* Atri, float* b, float* c, var t, var y, var h)
{
  int i,j;

  if(s < 1) return y;
  if(s > MAXS) s = MAXS;

  for(i=0;i<s;i++) G_k[i] = 0;

  for(i=0;i<s;i++)
  {
    var yi = y;
    for(j=0;j<i;j++)
      yi += h * getA(Atri,i,j) * G_k[j];

    G_k[i] = f_ode(t + (var)c[i]*h, yi);
  }

  var y1 = y;
  for(i=0;i<s;i++)
    y1 += h * (var)b[i] * G_k[i];

  return y1;
}

var compose_step(int s1,float* A1,float* b1,float* c1,
                 int s2,float* A2,float* b2,float* c2,
                 var t, var y, var h)
{
  var y_mid = rk_step(s2,A2,b2,c2,t,y,h);
  var y_out = rk_step(s1,A1,b1,c1,t,y_mid,h);
  return y_out;
}

void permute_method_into(int s, float* A, float* b, float* c,
                         int* p,
                         int* sOut, float* Aout, float* bout, float* cout)
{
  int i,j;

  *sOut = s;
  if(*sOut < 0) *sOut = 0;
  if(*sOut > MAXS) *sOut = MAXS;

  for(i=0;i<TRI_A;i++) Aout[i] = 0;
  for(i=0;i<MAXS;i++)  { bout[i]=0; cout[i]=0; }

  for(i=0;i<*sOut;i++)
  {
    int pi = p[i];
    if(pi < 0) pi = 0;
    if(pi >= s) pi = s-1;
    bout[i] = b[pi];
    cout[i] = c[pi];
  }

  for(i=0;i<*sOut;i++)
    for(j=0;j<i;j++)
    {
      int pi = p[i];
      int pj = p[j];
      if(pi < 0) pi = 0;
      if(pj < 0) pj = 0;
      if(pi >= s) pi = s-1;
      if(pj >= s) pj = s-1;
      setA(Aout, i, j, getA(A, pi, pj));
    }
}

void make_heun_rk2()
{
  zeroTableau(&M1_s, M1_A, M1_b, M1_c);
  M1_s = 2;
  M1_c[0] = 0.0; M1_c[1] = 1.0;
  setA(M1_A, 1, 0, 1.0);
  M1_b[0] = 0.5; M1_b[1] = 0.5;
}

void make_rk4()
{
  zeroTableau(&M2_s, M2_A, M2_b, M2_c);
  M2_s = 4;

  M2_c[0]=0.0; M2_c[1]=0.5; M2_c[2]=0.5; M2_c[3]=1.0;
  setA(M2_A, 1, 0, 0.5);
  setA(M2_A, 2, 1, 0.5);
  setA(M2_A, 3, 2, 1.0);

  M2_b[0]=(float)(1./6.);
  M2_b[1]=(float)(1./3.);
  M2_b[2]=(float)(1./3.);
  M2_b[3]=(float)(1./6.);
}

void buildAll()
{
  make_heun_rk2();
  make_rk4();

  // example permutations (you can change these freely; theorem_382A says results stay the same)
  int p1[MAXS]; p1[0]=1; p1[1]=0; p1[2]=0; p1[3]=0;
  int p2[MAXS]; p2[0]=0; p2[1]=2; p2[2]=1; p2[3]=3;

  permute_method_into(M1_s, M1_A, M1_b, M1_c, p1, &M1B_s, M1B_A, M1B_b, M1B_c);
  permute_method_into(M2_s, M2_A, M2_b, M2_c, p2, &M2B_s, M2B_A, M2B_b, M2B_c);
}

void run_case(int k, var t, var y0, var h)
{
  G_CASE = k;

  var yA = compose_step(M1_s,  M1_A,  M1_b,  M1_c,
                        M2_s,  M2_A,  M2_b,  M2_c,
                        t, y0, h);

  var yB = compose_step(M1B_s, M1B_A, M1B_b, M1B_c,
                        M2B_s, M2B_A, M2B_b, M2B_c,
                        t, y0, h);

  printf("\nCase %d: yA=%.17g  yB=%.17g  |diff|=%.3e", k, yA, yB, abs(yA-yB));
}

function main()
{
  buildAll();

  // 5 different examples demonstrating theorem_382A invariance under stage permutation
  run_case(1, 0.0, 1.0, 0.1);   // stable linear
  run_case(2, 0.0, 1.0, 0.1);   // unstable linear
  run_case(3, 0.0, 1.0, 0.05);  // stiffer linear (smaller h)
  run_case(4, 0.0, 0.2, 0.1);   // nonlinear logistic
  run_case(5, 1.0, 1.0, 0.1);   // time-dependent forcing (t!=0)

  printf("\n");
  quit("done");
}

Last edited by TipmyPip; 1 hour ago.