ZorroGPT

ScaleWeave Sentinel [Re: TipmyPip] #489119
01/24/26 04:10

Joined: Sep 2017
Posts: 282

TipmyPip

OP
Member

This sentinel is a two-pace fractal learner that tunes itself while it trades. It observes the market through a fast lens and a slow lens, then teaches those lenses to cooperate instead of competing. A bandit controller selects values for timeframes, feature windows, and risk controls. It balances exploration with exploitation by adding an optimism bonus to options that have been sampled less often. It records how combinations behave, not just single settings, so coordination improves over time.

Coordination begins with how the slow lens is represented. The strategy does not learn synergy on the offset used to reach the slow lens. It learns synergy on the derived slow timeframe that is actually used to build the slow price series. The fast timeframe and the offset are combined into that final slow timeframe, and cooperation feedback is stored against the pair formed from the fast choice and the final slow choice. If different offsets create the same slow view, they reinforce the same cooperation signal, keeping learning focused on the true slow scale and cleaner credit assignment.
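The mapping from a fast choice and an offset to the final slow timeframe can be sketched as follows. The clamp limit of 12 matches `TF_MAX` in the listing; the function names `slow_timeframe` and `slow_index` are assumptions made for this illustration.

```c
/* Upper limit for the slow timeframe, as TF_MAX in the listing. */
#define TF_MAX 12

/* Combine the fast timeframe and the offset into the final slow timeframe,
   clamped to the range 1..TF_MAX. */
int slow_timeframe(int fast_tf, int offset)
{
    int tf2 = fast_tf + offset;
    if (tf2 > TF_MAX) tf2 = TF_MAX;
    if (tf2 < 1) tf2 = 1;
    return tf2;
}

/* Zero-based table index (0..TF_MAX-1) for a slow timeframe value. */
int slow_index(int tf2)
{
    int idx = tf2 - 1;
    if (idx < 0) idx = 0;
    if (idx >= TF_MAX) idx = TF_MAX - 1;
    return idx;
}
```

With a fast timeframe of 3, the offsets 9, 10, and 11 all clamp to a slow timeframe of 12, so they share one cell of the cooperation table rather than splitting their evidence across three.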

The fast lens can also revise itself after the slow lens is chosen. This is a negotiation step. The fast side reevaluates candidates given the slow decision and the cooperation memories, and it may switch to a better-matching fast setting. The revision is skipped at random for a small share of selections (the exploration rate, 0.10 in the code below), which preserves randomness and prevents early lock-in. This back-and-forth approximates joint search without enumerating every full combination.

A second communication channel ties the slow timeframe to the slow-side window sizes. Window lengths determine how much history the slow lens needs before its features become stable, and they also determine how long warmup lasts. Very large windows combined with very slow sampling can cause long delays, fewer learning episodes, and brittle behavior. Instead of rejecting and resampling until a safe combination appears, the strategy applies a safety cost during selection. When a candidate pairing implies excessive history demand, its score is reduced, steering choices toward feasible regions without hard bans.
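A small sketch of the soft safety cost, assuming history demand is measured as slow timeframe times largest slow window, as in the `safetyPenalty` helper of the listing. The limit and the weight are passed in as parameters here; `penalized_score` is a hypothetical wrapper added for illustration.

```c
/* Soft penalty: zero while tf2 * max_window stays within the limit,
   then grows linearly with the relative overshoot. */
double safety_penalty(int tf2, int max_window, int limit)
{
    double excess = (double)(tf2 * max_window - limit);
    if (excess <= 0.0) return 0.0;
    return excess / (double)limit;
}

/* Candidate score after the penalty is subtracted with weight eta. */
double penalized_score(double base_score, int tf2, int max_window,
                       int limit, double eta)
{
    return base_score - eta * safety_penalty(tf2, max_window, limit);
}
```

With the constants in the listing (limit 2500, slow timeframe at most 12, windows at most 160) the product peaks at 1920, so the penalty remains zero. It only starts to bite if the ranges are widened or the limit is lowered, for example to 1500, where the same candidate would be penalized by 0.28 times the weight.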

To make this interaction explicit, the strategy keeps a dedicated memory table keyed by the slow timeframe and a binned summary of the largest slow window. The summary is the maximum among the slow fractal window, the slow slope window, and the slow volatility window. The table learns which pairings are productive from realized trade outcomes, capturing patterns such as slow sampling working best with modest windows, or moderate sampling supporting broader context without starving the learner of episodes.
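The table key can be illustrated with a short sketch. The bin bounds (10 to 160 in steps of 5) follow the `W2_MIN`, `W2_MAX`, and `W2_STEP` constants in the listing; the helper names `max3` and `window_bin` are assumptions for this example.

```c
#define W2_MIN  10   /* smallest slow window */
#define W2_MAX  160  /* largest slow window */
#define W2_STEP 5    /* bin width */

/* Largest of the three slow-side windows. */
int max3(int a, int b, int c)
{
    int m = a;
    if (b > m) m = b;
    if (c > m) m = c;
    return m;
}

/* Bin index 0..30 for a window length; out-of-range values are clamped. */
int window_bin(int w)
{
    if (w < W2_MIN) w = W2_MIN;
    if (w > W2_MAX) w = W2_MAX;
    return (w - W2_MIN) / W2_STEP;
}
```

For slow windows of 40, 25, and 60 the summary is 60, which lands in bin 10. The table row is then picked by the slow timeframe index and the column by this bin.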

Coordination is also improved at the feature level. Alongside the core fractal, slope, and volatility features from both lenses, the strategy adds three alignment tags. One tag describes trend agreement, reporting whether the fast and slow slopes point the same way, point opposite ways, or show no clear direction. A second tag describes volatility agreement, reporting whether the slow view is calmer or noisier relative to the fast view. A third tag describes structural contrast, reporting whether the slow fractal texture looks smoother or more intricate than the fast one.
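The three alignment tags can be sketched as below, following the formulas visible at the end of the listing (sign product, volatility ratio, fractal difference). The `Alignment` struct and the function name `alignment_tags` are assumptions introduced for this example.

```c
/* Container for the three alignment tags (hypothetical type). */
typedef struct {
    double trend;       /* +1 agree, -1 oppose, 0 no clear direction */
    double vol_ratio;   /* slow volatility relative to fast volatility */
    double fd_contrast; /* slow fractal dimension minus fast one */
} Alignment;

/* Three-way sign: -1, 0, or +1. */
double sign3(double x)
{
    if (x > 0) return 1;
    if (x < 0) return -1;
    return 0;
}

Alignment alignment_tags(double slope_fast, double slope_slow,
                         double vol_fast, double vol_slow,
                         double fd_fast, double fd_slow)
{
    const double eps = 1e-12; /* guards against division by zero */
    Alignment a;
    a.trend = sign3(slope_fast) * sign3(slope_slow);
    a.vol_ratio = vol_slow / (vol_fast + eps);
    a.fd_contrast = fd_slow - fd_fast;
    return a;
}
```

A rising fast slope against a falling slow slope gives a trend tag of -1. A ratio above 1 means the slow view is noisier than the fast one, and a negative contrast means the slow texture looks smoother.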

A further channel links the slow timeframe to the holding horizon. Rewards are normalized by holding duration, and timeframe affects signal frequency, so the best slow sampling pace depends on how long positions are kept. The strategy therefore learns a cooperation table between the slow timeframe and the chosen holding setting. This reduces rhythm failures such as trading too often with long holds, or trading too slowly when the hold setting is short.

Cooperation memories are stored as residual effects. Each single setting has its own running value estimate, and most cooperation tables store only the extra contribution beyond those single values. The table that links the slow timeframe to the slow window bin is the exception: it is kept as a standalone estimate, because its keys are derived quantities with no single estimates of their own. Pair influence ramps up gradually as joint evidence accumulates, so early noise does not dominate. This keeps the communication signal honest and avoids double counting. Pair memories are updated only when a cycle closes, so the controller learns from realized outcomes. Rewards are normalized by holding time and clipped to prevent spikes from dominating. This keeps learning stable during training and testing.
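The residual idea and the gradual ramp can be sketched in a few lines. The constants mirror `ALPHA`, `LAMBDA_PAIR`, and `NMIN_PAIR` from the listing. The `shaped_reward` helper is an assumption: it shows one plausible way to normalize by holding time and clip, not necessarily the exact formula the strategy uses.

```c
#define ALPHA       0.10  /* learning rate */
#define LAMBDA_PAIR 0.30  /* full pair weight */
#define NMIN_PAIR   25    /* samples needed before the full weight applies */

/* Gate: pair influence ramps linearly up to LAMBDA_PAIR over NMIN_PAIR samples. */
double lambda_eff(int n_pair)
{
    double frac = (double)n_pair / (double)NMIN_PAIR;
    if (frac > 1.0) frac = 1.0;
    if (frac < 0.0) frac = 0.0;
    return LAMBDA_PAIR * frac;
}

/* Residual update: the pair cell moves toward the part of the reward that
   the two single estimates do not explain. Returns the new residual. */
double update_residual(double q_a, double q_b, double q_pair, double reward)
{
    double pred = q_a + q_b + q_pair;
    return q_pair + ALPHA * (reward - pred);
}

/* Hypothetical reward shaping: profit per bar held, clipped to +/- clip. */
double shaped_reward(double profit, int bars_held, double clip)
{
    double r;
    if (bars_held < 1) bars_held = 1;
    r = profit / (double)bars_held;
    if (r > clip) r = clip;
    if (r < -clip) r = -clip;
    return r;
}
```

When the single estimates already explain the reward, the residual stays where it is, which is what prevents double counting. A pair seen only five times contributes at one fifth of the full weight.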

During trading, the inner learner converts the feature snapshot into long and short preferences. Position intensity is scaled by a leverage factor, capped by a maximum exposure, and filtered by a threshold so weak signals stay flat. Exits are both signal-driven and time-driven, creating clear episodes that the controller can score. Over many episodes, ScaleWeave Sentinel becomes a stable coordinator that matches timeframes, windows, and holding rhythm while keeping behavior inside practical limits.
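The sizing rule can be illustrated with a short sketch. The trading part of the strategy is not reproduced in this excerpt, so everything here is an assumption: the net signal is taken as long preference minus short preference, and the function name `position_intensity` is hypothetical.

```c
/* Hypothetical sizing rule: threshold filter, leverage scaling, exposure cap.
   Returns a signed intensity: positive for long, negative for short, 0 for flat. */
double position_intensity(double pred_long, double pred_short,
                          double lev_scale, double max_lev, double thr)
{
    double pred = pred_long - pred_short;     /* net preference (assumed) */
    double mag = pred;
    double lev;
    if (mag < 0) mag = -mag;
    if (mag < thr) return 0.0;                /* weak signal: stay flat */
    lev = pred * lev_scale;                   /* scale by leverage factor */
    if (lev > max_lev) lev = max_lev;         /* cap long exposure */
    if (lev < -max_lev) lev = -max_lev;       /* cap short exposure */
    return lev;
}
```

A net signal of 0.05 with a threshold of 0.10 stays flat, while a net signal of 0.8 with a scale of 10 and a cap of 0.5 is cut back to 0.5.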

Code

// ============================================================================
// Fractal Learner (Strategy 3) - RL Bandits for Parameters + 2TF ML (RETURNS)
// File: Fractal_EURUSD_2TF_RL_Bandits_v3_TF2_W2_ALIGN_TF2HB_SOFTSAFE.c
//
// Adds requested improvements:
// 1) Communicate on derived TF2 (not TF2d): residual table Qpair_TF12(TF1_arm, TF2_idx)
// 2) Learn TF2 <-> W2max interaction: Qpair_TF2_W2(TF2_idx, W2bin)
// 3) Alignment meta-features added to ML: Aslope, Asigma, AFD  (Sig size 9)
// 4) Learn TF2 <-> HoldBars: Qpair_TF2_HB(TF2_idx, HoldBars_arm)
// 5) Replace rejection sampling with soft safety penalty in selection scoring
//
// lite-C safe:
// - No ternary operator (uses ifelse())
// - Header uses multiple file_append calls
// - strf format string is ONE literal
// ============================================================================

#define NPAR 12
#define MAXARMS 64

// Exploration / learning
#define EPSILON 0.10
#define ALPHA   0.10

// Communication strength (base)
#define LAMBDA_PAIR 0.30

// UCB constants
#define C_UCB   0.40
#define CP_UCB  0.40

// Pair gating (minimum samples before full lambda)
#define NMIN_PAIR 25

// Reward normalization/clipping
#define REWARD_CLIP 1000.0

// Safe constraint limit (keeps TF2 + windows from exploding warmup)
#define SAFE_LIMIT 2500

// Soft safety penalty weight (score -= ETA_SAFE * pi)
#define ETA_SAFE  2.0

// TF limits
#define TF_MAX 12

// W2 binning (based on your window2 ranges: 10..160 step 5)
#define W2_MIN  10
#define W2_MAX  160
#define W2_STEP 5
#define W2_BINS 31   // (160-10)/5 + 1 = 31

#define P_INT 0
#define P_VAR 1

// Parameter indices (RL-controlled)
#define P_TF1        0
#define P_TF2D       1
#define P_FDLEN1     2
#define P_SLOPELEN1  3
#define P_VOLLEN1    4
#define P_FDLEN2     5
#define P_SLOPELEN2  6
#define P_VOLLEN2    7
#define P_LEVSCALE   8
#define P_MAXLEV     9
#define P_PREDTHR    10
#define P_HOLDBARS   11

// -----------------------------
// RL storage
// -----------------------------
string ParName[NPAR];

int ParType[NPAR] =
{
	P_INT,  // TF1
	P_INT,  // TF2d
	P_INT,  // FDLen1
	P_INT,  // SlopeLen1
	P_INT,  // VolLen1
	P_INT,  // FDLen2
	P_INT,  // SlopeLen2
	P_INT,  // VolLen2
	P_VAR,  // LevScale
	P_VAR,  // MaxLev
	P_VAR,  // PredThr
	P_INT   // HoldBars
};

var ParMin[NPAR];
var ParMax[NPAR];
var ParStep[NPAR];

var Q[NPAR][MAXARMS];
int Ncnt[NPAR][MAXARMS];
int ArmsCount[NPAR];
int CurArm[NPAR];

// Totals for UCB (avoid summing every time)
int TotCnt[NPAR];

// -----------------------------
// Pairwise "communication" RESIDUAL tables
// -----------------------------

// (A) Derived TF2 communication: TF1_arm × TF2_idx(0..11)
var Qpair_TF12[MAXARMS][TF_MAX];
int Npair_TF12[MAXARMS][TF_MAX];
int TotPair_TF12;

// (B) TF2 <-> W2max bin: TF2_idx(0..11) × W2bin(0..30)
var Qpair_TF2_W2[TF_MAX][W2_BINS];
int Npair_TF2_W2[TF_MAX][W2_BINS];
int TotPair_TF2_W2;

// (C) Existing FDLen1 <-> SlopeLen1
var Qpair_FD1SL1[MAXARMS][MAXARMS];
int Npair_FD1SL1[MAXARMS][MAXARMS];
int TotPair_FD1SL1;

// (D) Existing FDLen2 <-> SlopeLen2
var Qpair_FD2SL2[MAXARMS][MAXARMS];
int Npair_FD2SL2[MAXARMS][MAXARMS];
int TotPair_FD2SL2;

// (E) Existing LevScale <-> PredThr
var Qpair_LS_PT[MAXARMS][MAXARMS];
int Npair_LS_PT[MAXARMS][MAXARMS];
int TotPair_LS_PT;

// (F) Existing MaxLev <-> HoldBars
var Qpair_ML_HB[MAXARMS][MAXARMS];
int Npair_ML_HB[MAXARMS][MAXARMS];
int TotPair_ML_HB;

// (G) NEW: TF2 <-> HoldBars_arm (approx TF2-HB factor)
var Qpair_TF2_HB[TF_MAX][MAXARMS];
int Npair_TF2_HB[TF_MAX][MAXARMS];
int TotPair_TF2_HB;

// -----------------------------
// Utility
// -----------------------------
int calcArms(var mn, var mx, var stp)
{
	if(stp <= 0) return 1;
	int n = (int)floor((mx - mn)/stp + 1.000001);
	if(n < 1) n = 1;
	if(n > MAXARMS) n = MAXARMS;
	return n;
}

var armValue(int p, int a)
{
	var v = ParMin[p] + (var)a * ParStep[p];
	if(v < ParMin[p]) v = ParMin[p];
	if(v > ParMax[p]) v = ParMax[p];
	if(ParType[p] == P_INT) v = (var)(int)(v + 0.5);
	return v;
}

void initParNames()
{
	ParName[P_TF1]       = "TF1";
	ParName[P_TF2D]      = "TF2d";
	ParName[P_FDLEN1]    = "FDLen1";
	ParName[P_SLOPELEN1] = "SlopeLen1";
	ParName[P_VOLLEN1]   = "VolLen1";
	ParName[P_FDLEN2]    = "FDLen2";
	ParName[P_SLOPELEN2] = "SlopeLen2";
	ParName[P_VOLLEN2]   = "VolLen2";
	ParName[P_LEVSCALE]  = "LevScale";
	ParName[P_MAXLEV]    = "MaxLev";
	ParName[P_PREDTHR]   = "PredThr";
	ParName[P_HOLDBARS]  = "HoldBars";
}

// UCB bonus: c * sqrt( ln(1+Tot) / (1+n) )
var ucbBonus(int tot, int n, var c)
{
	var lnT = log(1 + (var)tot);
	return c * sqrt(lnT / (1 + (var)n));
}

// Pair gating: lambda_eff = lambda * min(1, nPair/nMin)
var lambdaEff(int nPair)
{
	var frac = (var)nPair / (var)NMIN_PAIR;
	if(frac > 1) frac = 1;
	if(frac < 0) frac = 0;
	return LAMBDA_PAIR * frac;
}

// Derived TF2 value from TF1_arm and TF2d_arm
int tf2ValueFromArms(int armTF1, int armTF2d)
{
	int TF1v  = (int)armValue(P_TF1, armTF1);
	int TF2dv = (int)armValue(P_TF2D, armTF2d);
	int TF2v = TF1v + TF2dv;
	if(TF2v > TF_MAX) TF2v = TF_MAX;
	if(TF2v < 1) TF2v = 1;
	return TF2v;
}

// TF2 index 0..11
int tf2IndexFromValue(int TF2v)
{
	int idx = TF2v - 1;
	if(idx < 0) idx = 0;
	if(idx >= TF_MAX) idx = TF_MAX - 1;
	return idx;
}

// W2 bin index 0..30 from a window value
int w2BinFromValue(int w)
{
	if(w < W2_MIN) w = W2_MIN;
	if(w > W2_MAX) w = W2_MAX;
	return (int)((w - W2_MIN) / W2_STEP);
}

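// Soft safety penalty pi(theta) = max(0, (TF2*W2max - SAFE_LIMIT)/SAFE_LIMIT)
// Note: with the ranges set in initParamsRL (TF2 <= 12, W2max <= 160) the product
// peaks at 1920 < SAFE_LIMIT, so the penalty stays 0 unless those ranges or SAFE_LIMIT change.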
var safetyPenalty(int TF2v, int W2maxv)
{
	var x = (var)(TF2v * W2maxv - SAFE_LIMIT);
	if(x <= 0) return 0;
	return x / (var)SAFE_LIMIT;
}

// sign helper: -1,0,1
var sign3(var x)
{
	if(x > 0) return 1;
	if(x < 0) return -1;
	return 0;
}

// -----------------------------
// Single-arm selection with UCB + epsilon explore
// -----------------------------
int bestArm_UCB(int p)
{
	int a, best = 0;
	var bestScore = Q[p][0] + ucbBonus(TotCnt[p], Ncnt[p][0], C_UCB);

	for(a=1; a<ArmsCount[p]; a++)
	{
		var score = Q[p][a] + ucbBonus(TotCnt[p], Ncnt[p][a], C_UCB);
		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_UCB(int p)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[p]);
	return bestArm_UCB(p);
}

// Update single Q
void updateArm(int p, int a, var reward)
{
	Q[p][a] = Q[p][a] + ALPHA*(reward - Q[p][a]);
	Ncnt[p][a] += 1;
	TotCnt[p] += 1;
}

// -----------------------------
// Residual pair updates
// -----------------------------

// (1) TF1 × TF2_idx residual:
// pred = Q_TF1(a1) + Qpair(a1,tf2)
void updatePairRes_TF12(int aTF1, int tf2Idx, var reward)
{
	var pred = Q[P_TF1][aTF1] + Qpair_TF12[aTF1][tf2Idx];
	Qpair_TF12[aTF1][tf2Idx] = Qpair_TF12[aTF1][tf2Idx] + ALPHA*(reward - pred);
	Npair_TF12[aTF1][tf2Idx] += 1;
	TotPair_TF12 += 1;
}

// (2) TF2_idx × W2bin (standalone interaction; no clean singles for derived vars)
// pred = Qpair(tf2,w2)
void updatePair_TF2_W2(int tf2Idx, int w2bin, var reward)
{
	var pred = Qpair_TF2_W2[tf2Idx][w2bin];
	Qpair_TF2_W2[tf2Idx][w2bin] = Qpair_TF2_W2[tf2Idx][w2bin] + ALPHA*(reward - pred);
	Npair_TF2_W2[tf2Idx][w2bin] += 1;
	TotPair_TF2_W2 += 1;
}

// (3) Existing residuals
void updatePairRes_FD1SL1(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN1][a1] + Q[P_SLOPELEN1][a2] + Qpair_FD1SL1[a1][a2];
	Qpair_FD1SL1[a1][a2] = Qpair_FD1SL1[a1][a2] + ALPHA*(reward - pred);
	Npair_FD1SL1[a1][a2] += 1;
	TotPair_FD1SL1 += 1;
}

void updatePairRes_FD2SL2(int a1, int a2, var reward)
{
	var pred = Q[P_FDLEN2][a1] + Q[P_SLOPELEN2][a2] + Qpair_FD2SL2[a1][a2];
	Qpair_FD2SL2[a1][a2] = Qpair_FD2SL2[a1][a2] + ALPHA*(reward - pred);
	Npair_FD2SL2[a1][a2] += 1;
	TotPair_FD2SL2 += 1;
}

void updatePairRes_LS_PT(int a1, int a2, var reward)
{
	var pred = Q[P_LEVSCALE][a1] + Q[P_PREDTHR][a2] + Qpair_LS_PT[a1][a2];
	Qpair_LS_PT[a1][a2] = Qpair_LS_PT[a1][a2] + ALPHA*(reward - pred);
	Npair_LS_PT[a1][a2] += 1;
	TotPair_LS_PT += 1;
}

void updatePairRes_ML_HB(int a1, int a2, var reward)
{
	var pred = Q[P_MAXLEV][a1] + Q[P_HOLDBARS][a2] + Qpair_ML_HB[a1][a2];
	Qpair_ML_HB[a1][a2] = Qpair_ML_HB[a1][a2] + ALPHA*(reward - pred);
	Npair_ML_HB[a1][a2] += 1;
	TotPair_ML_HB += 1;
}

// (4) NEW TF2_idx × HoldBars_arm residual:
// pred = Q_HB(aHB) + Qpair(tf2,aHB)
void updatePairRes_TF2_HB(int tf2Idx, int aHB, var reward)
{
	var pred = Q[P_HOLDBARS][aHB] + Qpair_TF2_HB[tf2Idx][aHB];
	Qpair_TF2_HB[tf2Idx][aHB] = Qpair_TF2_HB[tf2Idx][aHB] + ALPHA*(reward - pred);
	Npair_TF2_HB[tf2Idx][aHB] += 1;
	TotPair_TF2_HB += 1;
}

// -----------------------------
// Pair-aware selection pieces
// -----------------------------

// --- TF2d selection conditioned on TF1, but "listens" via derived TF2 index ---
// score = UCB_TF2d + lambda_eff*Qpair_TF12(TF1,TF2idx) + pairUCB(TF12)
int bestArm_TF2d_given_TF1_TF2(int armTF1)
{
	int a2d, best = 0;

	int TF2v0 = tf2ValueFromArms(armTF1, 0);
	int tf2Idx0 = tf2IndexFromValue(TF2v0);
	int n0 = Npair_TF12[armTF1][tf2Idx0];
	var lam0 = lambdaEff(n0);

	var pairB0 = lam0*Qpair_TF12[armTF1][tf2Idx0] + ucbBonus(TotPair_TF12, n0, CP_UCB);

	var bestScore =
		Q[P_TF2D][0] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][0], C_UCB) + pairB0;

	for(a2d=1; a2d<ArmsCount[P_TF2D]; a2d++)
	{
		int TF2v = tf2ValueFromArms(armTF1, a2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[armTF1][tf2Idx];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF12[armTF1][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		var score =
			Q[P_TF2D][a2d] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][a2d], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a2d;
		}
	}
	return best;
}

int selectArm_TF2d_given_TF1_TF2(int armTF1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF2D]);
	return bestArm_TF2d_given_TF1_TF2(armTF1);
}

// --- TF1 symmetric: anticipates best TF2d response (via TF2 index) ---
int bestArm_TF1_symmetric_TF2()
{
	int a1, best = 0;

	// a1=0 baseline
	var bestPairTerm0 = 0;
	int a2d;
	for(a2d=0; a2d<ArmsCount[P_TF2D]; a2d++)
	{
		int TF2v = tf2ValueFromArms(0, a2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[0][tf2Idx];
		var lam = lambdaEff(n);
		var term = lam*Qpair_TF12[0][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		if(a2d==0) bestPairTerm0 = term;
		else if(term > bestPairTerm0) bestPairTerm0 = term;
	}

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + bestPairTerm0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		var bestPairTerm = 0;

		for(a2d=0; a2d<ArmsCount[P_TF2D]; a2d++)
		{
			int TF2v2 = tf2ValueFromArms(a1, a2d);
			int tf2Idx2 = tf2IndexFromValue(TF2v2);

			int n2 = Npair_TF12[a1][tf2Idx2];
			var lam2 = lambdaEff(n2);
			var term2 = lam2*Qpair_TF12[a1][tf2Idx2] + ucbBonus(TotPair_TF12, n2, CP_UCB);

			if(a2d==0) bestPairTerm = term2;
			else if(term2 > bestPairTerm) bestPairTerm = term2;
		}

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + bestPairTerm;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

int selectArm_TF1_symmetric_TF2()
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_TF1]);
	return bestArm_TF1_symmetric_TF2();
}

// Coordinate descent refinement: TF1 given TF2d, using derived TF2 index
int bestArm_TF1_given_TF2d_TF2(int armTF2d)
{
	int a1, best = 0;

	int TF2v0 = tf2ValueFromArms(0, armTF2d);
	int tf2Idx0 = tf2IndexFromValue(TF2v0);
	int n0 = Npair_TF12[0][tf2Idx0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_TF12[0][tf2Idx0] + ucbBonus(TotPair_TF12, n0, CP_UCB);

	var bestScore =
		Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + pairB0;

	for(a1=1; a1<ArmsCount[P_TF1]; a1++)
	{
		int TF2v = tf2ValueFromArms(a1, armTF2d);
		int tf2Idx = tf2IndexFromValue(TF2v);

		int n = Npair_TF12[a1][tf2Idx];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_TF12[a1][tf2Idx] + ucbBonus(TotPair_TF12, n, CP_UCB);

		var score =
			Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a1;
		}
	}
	return best;
}

// --- TF2 <-> W2max influence + soft safety applied during window2 selection ---

// Score contribution from TF2-W2 table:
var tf2w2Term(int tf2Idx, int w2bin)
{
	int n = Npair_TF2_W2[tf2Idx][w2bin];
	var lam = lambdaEff(n);
	return lam*Qpair_TF2_W2[tf2Idx][w2bin] + ucbBonus(TotPair_TF2_W2, n, CP_UCB);
}

// FDLen2 choice conditioned on TF2 (uses TF2-W2 term + penalty)
int bestArm_FDLen2_given_TF2(int tf2Idx, int TF2v)
{
	int a, best = 0;

	int v0 = (int)armValue(P_FDLEN2, 0);
	int w2bin0 = w2BinFromValue(v0);
	var pen0 = safetyPenalty(TF2v, v0);
	var bestScore =
		Q[P_FDLEN2][0] + ucbBonus(TotCnt[P_FDLEN2], Ncnt[P_FDLEN2][0], C_UCB)
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_FDLEN2]; a++)
	{
		int v = (int)armValue(P_FDLEN2, a);
		int w2bin = w2BinFromValue(v);
		var pen = safetyPenalty(TF2v, v);

		var score =
			Q[P_FDLEN2][a] + ucbBonus(TotCnt[P_FDLEN2], Ncnt[P_FDLEN2][a], C_UCB)
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_FDLen2_given_TF2(int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_FDLEN2]);
	return bestArm_FDLen2_given_TF2(tf2Idx, TF2v);
}

// SlopeLen2 conditioned on FDLen2 AND TF2 (FD2-SL2 pair + TF2-W2 + penalty)
int bestArm_SlopeLen2_given_FD2_TF2(int armFD2, int tf2Idx, int TF2v)
{
	int a, best = 0;

	int vFD = (int)armValue(P_FDLEN2, armFD2);

	// a=0 baseline
	int v0 = (int)armValue(P_SLOPELEN2, 0);
	int mx0 = vFD; if(v0 > mx0) mx0 = v0;
	int w2bin0 = w2BinFromValue(mx0);
	var pen0 = safetyPenalty(TF2v, mx0);

	// FD2-SL2 pair term
	int nP0 = Npair_FD2SL2[armFD2][0];
	var lamP0 = lambdaEff(nP0);
	var fdslTerm0 = lamP0*Qpair_FD2SL2[armFD2][0] + ucbBonus(TotPair_FD2SL2, nP0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN2][0] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][0], C_UCB)
		+ fdslTerm0
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_SLOPELEN2]; a++)
	{
		int v = (int)armValue(P_SLOPELEN2, a);
		int mx = vFD; if(v > mx) mx = v;
		int w2bin = w2BinFromValue(mx);
		var pen = safetyPenalty(TF2v, mx);

		int nP = Npair_FD2SL2[armFD2][a];
		var lamP = lambdaEff(nP);
		var fdslTerm = lamP*Qpair_FD2SL2[armFD2][a] + ucbBonus(TotPair_FD2SL2, nP, CP_UCB);

		var score =
			Q[P_SLOPELEN2][a] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][a], C_UCB)
			+ fdslTerm
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen2_given_FD2_TF2(int armFD2, int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN2]);
	return bestArm_SlopeLen2_given_FD2_TF2(armFD2, tf2Idx, TF2v);
}

// VolLen2 conditioned on current W2max and TF2 (TF2-W2 + penalty)
int bestArm_VolLen2_given_TF2(int curMax, int tf2Idx, int TF2v)
{
	int a, best = 0;

	int v0 = (int)armValue(P_VOLLEN2, 0);
	int mx0 = curMax; if(v0 > mx0) mx0 = v0;
	int w2bin0 = w2BinFromValue(mx0);
	var pen0 = safetyPenalty(TF2v, mx0);

	var bestScore =
		Q[P_VOLLEN2][0] + ucbBonus(TotCnt[P_VOLLEN2], Ncnt[P_VOLLEN2][0], C_UCB)
		+ tf2w2Term(tf2Idx, w2bin0)
		- ETA_SAFE*pen0;

	for(a=1; a<ArmsCount[P_VOLLEN2]; a++)
	{
		int v = (int)armValue(P_VOLLEN2, a);
		int mx = curMax; if(v > mx) mx = v;
		int w2bin = w2BinFromValue(mx);
		var pen = safetyPenalty(TF2v, mx);

		var score =
			Q[P_VOLLEN2][a] + ucbBonus(TotCnt[P_VOLLEN2], Ncnt[P_VOLLEN2][a], C_UCB)
			+ tf2w2Term(tf2Idx, w2bin)
			- ETA_SAFE*pen;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_VolLen2_given_TF2(int curMax, int tf2Idx, int TF2v)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_VOLLEN2]);
	return bestArm_VolLen2_given_TF2(curMax, tf2Idx, TF2v);
}

// --- Existing FDLen1 <-> SlopeLen1 (unchanged) ---
int bestArm_SlopeLen1_given_FDLen1(int armFD1)
{
	int a, best = 0;

	int n0 = Npair_FD1SL1[armFD1][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_FD1SL1[armFD1][0] + ucbBonus(TotPair_FD1SL1, n0, CP_UCB);

	var bestScore =
		Q[P_SLOPELEN1][0] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_SLOPELEN1]; a++)
	{
		int n = Npair_FD1SL1[armFD1][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_FD1SL1[armFD1][a] + ucbBonus(TotPair_FD1SL1, n, CP_UCB);

		var score =
			Q[P_SLOPELEN1][a] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_SlopeLen1_given_FDLen1(int armFD1)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_SLOPELEN1]);
	return bestArm_SlopeLen1_given_FDLen1(armFD1);
}

// --- LevScale -> PredThr (unchanged) ---
int bestArm_PredThr_given_LevScale(int armLS)
{
	int a, best = 0;

	int n0 = Npair_LS_PT[armLS][0];
	var lam0 = lambdaEff(n0);
	var pairB0 = lam0*Qpair_LS_PT[armLS][0] + ucbBonus(TotPair_LS_PT, n0, CP_UCB);

	var bestScore =
		Q[P_PREDTHR][0] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][0], C_UCB) + pairB0;

	for(a=1; a<ArmsCount[P_PREDTHR]; a++)
	{
		int n = Npair_LS_PT[armLS][a];
		var lam = lambdaEff(n);
		var pairB = lam*Qpair_LS_PT[armLS][a] + ucbBonus(TotPair_LS_PT, n, CP_UCB);

		var score =
			Q[P_PREDTHR][a] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][a], C_UCB) + pairB;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_PredThr_given_LevScale(int armLS)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_PREDTHR]);
	return bestArm_PredThr_given_LevScale(armLS);
}

// --- HoldBars selection conditioned on MaxLev AND TF2_idx (adds TF2-HB communication) ---
int bestArm_HoldBars_given_MaxLev_TF2(int armML, int tf2Idx)
{
	int a, best = 0;

	// ML-HB term for a=0
	int n0 = Npair_ML_HB[armML][0];
	var lam0 = lambdaEff(n0);
	var mlhb0 = lam0*Qpair_ML_HB[armML][0] + ucbBonus(TotPair_ML_HB, n0, CP_UCB);

	// TF2-HB term for a=0
	int nT0 = Npair_TF2_HB[tf2Idx][0];
	var lamT0 = lambdaEff(nT0);
	var tf2hb0 = lamT0*Qpair_TF2_HB[tf2Idx][0] + ucbBonus(TotPair_TF2_HB, nT0, CP_UCB);

	var bestScore =
		Q[P_HOLDBARS][0] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][0], C_UCB)
		+ mlhb0 + tf2hb0;

	for(a=1; a<ArmsCount[P_HOLDBARS]; a++)
	{
		int n1 = Npair_ML_HB[armML][a];
		var lam1 = lambdaEff(n1);
		var mlhb = lam1*Qpair_ML_HB[armML][a] + ucbBonus(TotPair_ML_HB, n1, CP_UCB);

		int nT = Npair_TF2_HB[tf2Idx][a];
		var lamT = lambdaEff(nT);
		var tf2hb = lamT*Qpair_TF2_HB[tf2Idx][a] + ucbBonus(TotPair_TF2_HB, nT, CP_UCB);

		var score =
			Q[P_HOLDBARS][a] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][a], C_UCB)
			+ mlhb + tf2hb;

		if(score > bestScore)
		{
			bestScore = score;
			best = a;
		}
	}
	return best;
}

int selectArm_HoldBars_given_MaxLev_TF2(int armML, int tf2Idx)
{
	if(random(1) < EPSILON)
		return (int)random((var)ArmsCount[P_HOLDBARS]);
	return bestArm_HoldBars_given_MaxLev_TF2(armML, tf2Idx);
}

// -----------------------------
// Init ranges + tables
// -----------------------------
void initParamsRL()
{
	// Ranges roughly matching your old optimize() ranges/steps
	ParMin[P_TF1]        = 1;    ParMax[P_TF1]        = 3;    ParStep[P_TF1]        = 1;
	ParMin[P_TF2D]       = 1;    ParMax[P_TF2D]       = 11;   ParStep[P_TF2D]       = 1;

	ParMin[P_FDLEN1]     = 20;   ParMax[P_FDLEN1]     = 220;  ParStep[P_FDLEN1]     = 5;
	ParMin[P_SLOPELEN1]  = 20;   ParMax[P_SLOPELEN1]  = 200;  ParStep[P_SLOPELEN1]  = 5;
	ParMin[P_VOLLEN1]    = 20;   ParMax[P_VOLLEN1]    = 200;  ParStep[P_VOLLEN1]    = 5;

	ParMin[P_FDLEN2]     = 10;   ParMax[P_FDLEN2]     = 160;  ParStep[P_FDLEN2]     = 5;
	ParMin[P_SLOPELEN2]  = 10;   ParMax[P_SLOPELEN2]  = 140;  ParStep[P_SLOPELEN2]  = 5;
	ParMin[P_VOLLEN2]    = 10;   ParMax[P_VOLLEN2]    = 140;  ParStep[P_VOLLEN2]    = 5;

	ParMin[P_LEVSCALE]   = 2;    ParMax[P_LEVSCALE]   = 30;   ParStep[P_LEVSCALE]   = 1;
	ParMin[P_MAXLEV]     = 0.1;  ParMax[P_MAXLEV]     = 1.0;  ParStep[P_MAXLEV]     = 0.1;
	ParMin[P_PREDTHR]    = 0.0;  ParMax[P_PREDTHR]    = 0.20; ParStep[P_PREDTHR]    = 0.01;
	ParMin[P_HOLDBARS]   = 1;    ParMax[P_HOLDBARS]   = 30;   ParStep[P_HOLDBARS]   = 1;

	int p, a;
	for(p=0; p<NPAR; p++)
	{
		ArmsCount[p] = calcArms(ParMin[p], ParMax[p], ParStep[p]);
		CurArm[p] = 0;
		TotCnt[p] = 0;
		for(a=0; a<ArmsCount[p]; a++)
		{
			Q[p][a] = 0;
			Ncnt[p][a] = 0;
		}
	}

	// Init pair tables
	int i,j;

	TotPair_TF12 = 0;
	TotPair_TF2_W2 = 0;
	TotPair_FD1SL1 = 0;
	TotPair_FD2SL2 = 0;
	TotPair_LS_PT = 0;
	TotPair_ML_HB = 0;
	TotPair_TF2_HB = 0;

	// TF12 table
	for(i=0; i<MAXARMS; i++)
	{
		for(j=0; j<TF_MAX; j++)
		{
			Qpair_TF12[i][j] = 0;
			Npair_TF12[i][j] = 0;
		}
	}

	// TF2-W2 table
	for(i=0; i<TF_MAX; i++)
	{
		for(j=0; j<W2_BINS; j++)
		{
			Qpair_TF2_W2[i][j] = 0;
			Npair_TF2_W2[i][j] = 0;
		}
	}

	// Large MAXARMS×MAXARMS tables
	for(i=0; i<MAXARMS; i++)
	{
		for(j=0; j<MAXARMS; j++)
		{
			Qpair_FD1SL1[i][j] = 0;  Npair_FD1SL1[i][j] = 0;
			Qpair_FD2SL2[i][j] = 0;  Npair_FD2SL2[i][j] = 0;
			Qpair_LS_PT[i][j]  = 0;  Npair_LS_PT[i][j]  = 0;
			Qpair_ML_HB[i][j]  = 0;  Npair_ML_HB[i][j]  = 0;
		}
	}

	// TF2-HB table: TF_MAX × MAXARMS
	for(i=0; i<TF_MAX; i++)
	{
		for(j=0; j<MAXARMS; j++)
		{
			Qpair_TF2_HB[i][j] = 0;
			Npair_TF2_HB[i][j] = 0;
		}
	}
}

// -----------------------------
// Parameter pick order (TF2-aware, TF2-W2 aware, TF2-HB aware, soft safety)
// -----------------------------
void pickParams()
{
	int p;

	// 1) TF1 symmetric pick (anticipate TF2d best response) using derived TF2 table
	CurArm[P_TF1] = selectArm_TF1_symmetric_TF2();

	// 2) TF2d conditioned on TF1 using derived TF2 communication
	CurArm[P_TF2D] = selectArm_TF2d_given_TF1_TF2(CurArm[P_TF1]);

	// 3) Coordinate descent: re-pick TF1 given TF2d
	if(random(1) >= EPSILON)
		CurArm[P_TF1] = bestArm_TF1_given_TF2d_TF2(CurArm[P_TF2D]);

	// Derived TF2 for the episode (used by TF2-side window selection and TF2-HB)
	int TF2v = tf2ValueFromArms(CurArm[P_TF1], CurArm[P_TF2D]);
	int tf2Idx = tf2IndexFromValue(TF2v);

	// 4) FD/Slope pair TF1
	CurArm[P_FDLEN1]    = selectArm_UCB(P_FDLEN1);
	CurArm[P_SLOPELEN1] = selectArm_SlopeLen1_given_FDLen1(CurArm[P_FDLEN1]);

	// 5) TF2-side windows are TF2-aware (TF2-W2) + FD2-SL2 communication + soft safety
	CurArm[P_FDLEN2]    = selectArm_FDLen2_given_TF2(tf2Idx, TF2v);
	CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FD2_TF2(CurArm[P_FDLEN2], tf2Idx, TF2v);

	// current max after FD2+SL2
	int vFD2 = (int)armValue(P_FDLEN2, CurArm[P_FDLEN2]);
	int vSL2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
	int curMax = vFD2; if(vSL2 > curMax) curMax = vSL2;

	CurArm[P_VOLLEN2]   = selectArm_VolLen2_given_TF2(curMax, tf2Idx, TF2v);

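	// 6) Remaining independent params (the paired ones are picked in steps 7 and 8,
	//    so they are skipped here instead of being selected twice)
	for(p=0; p<NPAR; p++)
	{
		if(p == P_TF1) continue;
		if(p == P_TF2D) continue;
		if(p == P_FDLEN1) continue;
		if(p == P_SLOPELEN1) continue;
		if(p == P_FDLEN2) continue;
		if(p == P_SLOPELEN2) continue;
		if(p == P_VOLLEN2) continue;
		if(p == P_LEVSCALE) continue;
		if(p == P_PREDTHR) continue;
		if(p == P_MAXLEV) continue;
		if(p == P_HOLDBARS) continue;
		CurArm[p] = selectArm_UCB(p);
	}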

	// 7) LevScale -> PredThr conditioned
	CurArm[P_LEVSCALE] = selectArm_UCB(P_LEVSCALE);
	CurArm[P_PREDTHR]  = selectArm_PredThr_given_LevScale(CurArm[P_LEVSCALE]);

	// 8) MaxLev -> HoldBars conditioned AND TF2-HB communication
	CurArm[P_MAXLEV]   = selectArm_UCB(P_MAXLEV);
	CurArm[P_HOLDBARS] = selectArm_HoldBars_given_MaxLev_TF2(CurArm[P_MAXLEV], tf2Idx);
}

// -----------------------------
// Feature helpers (lite-C safe)
// -----------------------------
function fractalDimKatz(vars P, int N)
{
	if(N < 2) return 1.0;

	var L = 0;
	int i;
	for(i=0; i<N-1; i++)
		L += abs(P[i] - P[i+1]);

	var d = 0;
	for(i=1; i<N; i++)
	{
		var di = abs(P[i] - P[0]);
		if(di > d) d = di;
	}

	if(L <= 0 || d <= 0) return 1.0;

	var n  = (var)N;
	var fd = log(n) / (log(n) + log(d / L));
	return clamp(fd, 1.0, 2.0);
}

function linSlope(vars P, int N)
{
	if(N < 2) return 0;

	var sumT=0, sumP=0, sumTT=0, sumTP=0;
	int i;
	for(i=0; i<N; i++)
	{
		var t = (var)i;
		sumT  += t;
		sumP  += P[i];
		sumTT += t*t;
		sumTP += t*P[i];
	}

	var denom = (var)N*sumTT - sumT*sumT;
	if(abs(denom) < 1e-12) return 0;

	return ((var)N*sumTP - sumT*sumP) / denom;
}

function stdevReturns(vars R, int N)
{
	if(N < 2) return 0;

	var mean = 0;
	int i;
	for(i=0; i<N; i++) mean += R[i];
	mean /= (var)N;

	var v = 0;
	for(i=0; i<N; i++)
	{
		var d = R[i] - mean;
		v += d*d;
	}
	v /= (var)(N-1);

	return sqrt(max(0, v));
}

// ============================================================================
// RUN
// ============================================================================
function run()
{
	BarPeriod = 15;
	StartDate = 20100101;
	EndDate   = 0;

	set(PLOTNOW|RULES|LOGFILE);

	asset("EUR/USD");
	algo("FRACTAL2TF");

	var eps = 1e-12;
	DataSplit = 50;

	LookBack = 3000;

	// One-time init
	static int Inited = 0;
	static int PrevOpenTotal = 0;
	static var LastBalance = 0;
	static int Flip = 0;

	string LogFN = "Log\\FRACTAL2TF.csv";

	if(is(FIRSTINITRUN))
	{
		file_delete(LogFN);

		file_append(LogFN,"Date,Time,Mode,Bar,");
		file_append(LogFN,"TF1,TF2,TF2d,FDLen1,SlopeLen1,VolLen1,FDLen2,SlopeLen2,VolLen2,");
		file_append(LogFN,"LevScale,MaxLev,PredThr,HoldBars,");
		file_append(LogFN,"FD1,Slope1,Vol1,FD2,Slope2,Vol2,Aslope,Asigma,AFD,");
		file_append(LogFN,"PredL,PredS,Pred,Lev,RewardNorm,TF2idx,W2max,W2bin\n");

		Inited = 0;
		PrevOpenTotal = 0;
		LastBalance = 0;
		Flip = 0;
	}

	if(!Inited)
	{
		initParNames();
		initParamsRL();
		pickParams();

		LastBalance = Balance;
		PrevOpenTotal = NumOpenTotal;

		Inited = 1;
	}

	// Convert chosen arms -> parameter values (current episode)
	int TF1 = (int)armValue(P_TF1, CurArm[P_TF1]);
	int TF2d = (int)armValue(P_TF2D, CurArm[P_TF2D]);
	int TF2 = TF1 + TF2d;
	if(TF2 > TF_MAX) TF2 = TF_MAX;

	int FDLen1    = (int)armValue(P_FDLEN1,    CurArm[P_FDLEN1]);
	int SlopeLen1 = (int)armValue(P_SLOPELEN1, CurArm[P_SLOPELEN1]);
	int VolLen1   = (int)armValue(P_VOLLEN1,   CurArm[P_VOLLEN1]);

	int FDLen2    = (int)armValue(P_FDLEN2,    CurArm[P_FDLEN2]);
	int SlopeLen2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
	int VolLen2   = (int)armValue(P_VOLLEN2,   CurArm[P_VOLLEN2]);

	var LevScale  = armValue(P_LEVSCALE, CurArm[P_LEVSCALE]);
	var MaxLev    = armValue(P_MAXLEV,   CurArm[P_MAXLEV]);
	var PredThr   = armValue(P_PREDTHR,  CurArm[P_PREDTHR]);
	int HoldBars  = (int)armValue(P_HOLDBARS, CurArm[P_HOLDBARS]);

	// Derived indices for logging + pair updates
	int tf2Idx = tf2IndexFromValue(TF2);

	int W2max = FDLen2;
	if(SlopeLen2 > W2max) W2max = SlopeLen2;
	if(VolLen2 > W2max) W2max = VolLen2;
	int W2bin = w2BinFromValue(W2max);

	// Build series (2 TF)
	TimeFrame = TF1;
	vars P1 = series(priceClose());
	vars R1 = series(log(max(eps,P1[0]) / max(eps,P1[1])));

	vars FD1S    = series(0);
	vars Slope1S = series(0);
	vars Vol1S   = series(0);

	TimeFrame = TF2;
	vars P2 = series(priceClose());
	vars R2 = series(log(max(eps,P2[0]) / max(eps,P2[1])));

	vars FD2S    = series(0);
	vars Slope2S = series(0);
	vars Vol2S   = series(0);

	TimeFrame = 1;

	// Warmup gate based on current episode params
	int Need1 = max(max(FDLen1, SlopeLen1), VolLen1) + 5;
	int Need2 = max(max(FDLen2, SlopeLen2), VolLen2) + 5;
	int WarmupBars = max(TF1*Need1, TF2*Need2) + 10;

	if(Bar < WarmupBars)
		return;

	// Skip the lookback period in Test/Trade mode only; Train mode is deliberately not blocked
	if(is(LOOKBACK) && !Train)
		return;

	// Compute features
	TimeFrame = TF1;
	FD1S[0]    = fractalDimKatz(P1, FDLen1);
	Slope1S[0] = linSlope(P1, SlopeLen1);
	Vol1S[0]   = stdevReturns(R1, VolLen1);

	TimeFrame = TF2;
	FD2S[0]    = fractalDimKatz(P2, FDLen2);
	Slope2S[0] = linSlope(P2, SlopeLen2);
	Vol2S[0]   = stdevReturns(R2, VolLen2);

	TimeFrame = 1;

	// Alignment meta-features (Improvement 3)
	var Aslope = sign3(Slope1S[0]) * sign3(Slope2S[0]);          // -1,0,1
	var Asigma = Vol2S[0] / (Vol1S[0] + eps);                    // ratio
	var AFD    = FD2S[0] - FD1S[0];                              // contrast

	// Feature vector for ML: now 9 features
	var Sig[9];
	Sig[0] = FD1S[0];
	Sig[1] = Slope1S[0];
	Sig[2] = Vol1S[0];
	Sig[3] = FD2S[0];
	Sig[4] = Slope2S[0];
	Sig[5] = Vol2S[0];
	Sig[6] = Aslope;
	Sig[7] = Asigma;
	Sig[8] = AFD;

	// Trading logic
	int MethodBase = PERCEPTRON + FUZZY + BALANCED;
	int MethodRet  = MethodBase + RETURNS;

	var PredL=0, PredS=0, Pred=0, Lev=0;

	// time-based exit
	if(NumOpenTotal > 0)
		for(open_trades)
			if(TradeIsOpen && TradeBars >= HoldBars)
				exitTrade(ThisTrade);

	if(Train)
	{
		// Forced alternating trades so ML always gets samples
		if(NumOpenTotal == 0)
		{
			Flip = 1 - Flip;
			LastBalance = Balance;

			if(Flip)
			{
				adviseLong(MethodRet, 0, Sig, 9);
				Lots = 1; enterLong();
			}
			else
			{
				adviseShort(MethodRet, 0, Sig, 9);
				Lots = 1; enterShort();
			}
		}
	}
	else
	{
		PredL = adviseLong(MethodBase, 0, Sig, 9);
		PredS = adviseShort(MethodBase, 0, Sig, 9);

		// Bootstrap if model has no signal yet
		if(NumOpenTotal == 0 && PredL == 0 && PredS == 0)
		{
			LastBalance = Balance;
			var s = Sig[1] + Sig[4];
			if(s > 0) { Lots=1; enterLong(); }
			else if(s < 0) { Lots=1; enterShort(); }
			else
			{
				if(random(1) < 0.5) { Lots=1; enterLong(); }
				else                { Lots=1; enterShort(); }
			}
		}
		else
		{
			Pred = PredL - PredS;
			Lev  = clamp(Pred * LevScale, -MaxLev, MaxLev);

			if(Lev > PredThr)       { exitShort(); Lots=1; enterLong();  }
			else if(Lev < -PredThr) { exitLong();  Lots=1; enterShort(); }
			else                    { exitLong();  exitShort(); }
		}
	}

	// RL reward + update (episode ends when we go from having positions to flat)
	var RewardNorm = 0;

	if(PrevOpenTotal > 0 && NumOpenTotal == 0)
	{
		var dBal = Balance - LastBalance;

		// reward normalization: per (1+HoldBars)
		RewardNorm = dBal / (1 + (var)HoldBars);

		// clip reward
		if(RewardNorm >  REWARD_CLIP) RewardNorm =  REWARD_CLIP;
		if(RewardNorm < -REWARD_CLIP) RewardNorm = -REWARD_CLIP;

		// Update singles (always)
		int p;
		for(p=0; p<NPAR; p++)
			updateArm(p, CurArm[p], RewardNorm);

		// Derived TF2 index (based on CURRENT arms)
		int tf2v_now = tf2ValueFromArms(CurArm[P_TF1], CurArm[P_TF2D]);
		int tf2Idx_now = tf2IndexFromValue(tf2v_now);

		// W2max bin (based on CURRENT arms)
		int FD2v = (int)armValue(P_FDLEN2, CurArm[P_FDLEN2]);
		int SL2v = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
		int VL2v = (int)armValue(P_VOLLEN2, CurArm[P_VOLLEN2]);
		int w2mx = FD2v;
		if(SL2v > w2mx) w2mx = SL2v;
		if(VL2v > w2mx) w2mx = VL2v;
		int w2bin_now = w2BinFromValue(w2mx);

		// Update derived communication tables
		updatePairRes_TF12(CurArm[P_TF1], tf2Idx_now, RewardNorm);
		updatePair_TF2_W2(tf2Idx_now, w2bin_now, RewardNorm);
		updatePairRes_TF2_HB(tf2Idx_now, CurArm[P_HOLDBARS], RewardNorm);

		// Update existing residual pairs
		updatePairRes_FD1SL1(CurArm[P_FDLEN1], CurArm[P_SLOPELEN1], RewardNorm);
		updatePairRes_FD2SL2(CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], RewardNorm);
		updatePairRes_LS_PT(CurArm[P_LEVSCALE], CurArm[P_PREDTHR], RewardNorm);
		updatePairRes_ML_HB(CurArm[P_MAXLEV], CurArm[P_HOLDBARS], RewardNorm);

		// Next episode parameters
		pickParams();
		LastBalance = Balance;
	}

	PrevOpenTotal = NumOpenTotal;

	// Logging
	string ModeStr = "Trade";
	if(Train) ModeStr = "Train";
	else if(Test) ModeStr = "Test";

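	// One format specifier per argument: the 14 float columns FD1..RewardNorm need 14 %f fields
	file_append(LogFN, strf("%04i-%02i-%02i,%02i:%02i,%s,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%.6f,%.3f,%.6f,%d,%.6f,%.8f,%.8f,%.6f,%.8f,%.8f,%.8f,%.6f,%.6f,%.6f,%.6f,%.6f,%.6f,%.6f,%d,%d,%d\n",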
		year(0),month(0),day(0), hour(0),minute(0),
		ModeStr, Bar,
		TF1, TF2, TF2d,
		FDLen1, SlopeLen1, VolLen1,
		FDLen2, SlopeLen2, VolLen2,
		LevScale, MaxLev, PredThr, HoldBars,
		Sig[0], Sig[1], Sig[2], Sig[3], Sig[4], Sig[5],
		Sig[6], Sig[7], Sig[8],
		PredL, PredS, Pred, Lev, RewardNorm,
		tf2Idx, W2max, W2bin
	));

	// Plots
	plot("FD_TF1",    Sig[0], NEW, 0);
	plot("FD_TF2",    Sig[3], 0, 0);
	plot("Slope_TF1", Sig[1], 0, 0);
	plot("Slope_TF2", Sig[4], 0, 0);
	plot("Vol_TF1",   Sig[2], 0, 0);
	plot("Vol_TF2",   Sig[5], 0, 0);
	plot("Aslope",    Sig[6], 0, 0);
	plot("Asigma",    Sig[7], 0, 0);
	plot("AFD",       Sig[8], 0, 0);
	plot("Pred",      Pred, 0, 0);
	plot("Lev",       Lev, 0, 0);
	plot("RewardN",   RewardNorm, 0, 0);
}

ZorroGPT - https://bit.ly/3Gbsm4S
