TwinPulse Fractal Accord is a two-timeframe learning trader that blends fractal-style pattern sensing with cooperative self-tuning. On each bar it builds one price stream at a fast timeframe and another at a slower timeframe. From each stream it extracts three signals: a fractal roughness estimate, a linear trend slope, and the volatility of log returns. Those six signals feed a built-in classifier that produces separate long and short tendencies. The trade engine compares them, turns the balance into a trade direction, and sizes exposure using adaptive leverage controls. Trades are also managed with a time-based exit so positions do not linger past the learned holding duration.

Instead of fixing the configuration, the strategy treats key settings as a team of bandit agents. Each agent controls one knob and can pick from a small list of discrete settings called arms. Agents cover the fast timeframe, the slow offset, feature window lengths on both timeframes, leverage scaling, maximum leverage, the prediction threshold, and the holding duration. At the start of an episode the agents choose arms, the strategy trades with that joint configuration, and when it returns to flat it measures the episode result. The reward is the balance change divided by one plus the holding duration in bars, then clipped to a fixed range, so one unusually large outcome does not dominate learning.

Exploration is guided by an upper confidence rule. Each arm gets a score made from its current value estimate plus an uncertainty bonus that is larger when the arm has fewer samples. This encourages careful exploration early and more consistent exploitation later. A small random explore chance (10% in the code below) remains to keep coverage and to cope with regime changes, data quirks, and surprise shifts in market behavior.

Communication between agents is implemented through coordination tables that store how two choices perform together beyond their individual quality. The most important table links the fast timeframe agent with the slow offset agent. The slow offset agent scores each candidate using its own estimate plus a gated pair term tied to the chosen fast arm. Symmetry is added by letting the fast timeframe agent evaluate its candidates while anticipating the best-responding slow offset. After both are selected, a short negotiation step can reconsider the fast timeframe given the chosen offset, improving coherence between the two lenses.

The same pattern is applied to other dependencies. Feature window agents communicate so fractal windows and slope windows on each timeframe tend to match a compatible scale, improving stability of the feature estimates. Risk control agents communicate so leverage scaling aligns with the prediction threshold, and maximum leverage aligns with holding duration, which shapes exposure time.

Pair influence is gated. Pair terms begin weak and ramp linearly to full strength over the first 25 joint trials, preventing noisy early episodes from locking in fragile coordination. Pair learning is residual: each pair table updates only the extra benefit or harm not explained by the two single-arm estimates, reducing double counting and sharpening credit assignment. Finally, a safety constraint blocks extreme combinations that would demand excessive warmup history or unstable feature computation, reducing wasted episodes and keeping training productive.

Over many episodes the strategy becomes a small community of specialists that explore, share compatibility knowledge, and gradually converge on coordinated multi-timeframe and risk settings that support steadier signals and cleaner trade timing overall, even when conditions drift.

// ============================================================================
// Fractal Learner (Strategy 3) - RL Bandits for Parameters + 2TF ML (RETURNS)
// File: Fractal_EURUSD_2TF_RL_Bandits_v2_COMM_UCB_SYM_RES_SAFE.c (Zorro / lite-C)
//
// ADDED FEATURES (as requested):
// 1) Symmetric TF communication (TF1 "listens" too) via best-response scoring.
// 2) UCB exploration bonus (single + pair) instead of plain epsilon-greedy.
// 3) Reward normalization (per HoldBars) + clipping.
// 4) Pair-gating: pair influence ramps up only after enough samples.
// 5) More dependencies communicated:
// - LevScale <-> PredThr
// - MaxLev <-> HoldBars
// 6) Pair-table affects both directions (1-step coordinate descent negotiation).
// 7) Residual pair tables (prevents double counting of single Q):
// QpairRes(a,b) updated with reward - (Q_A(a)+Q_B(b)+QpairRes(a,b))
// 8) Safe constraints to avoid nonsense combos (reduces wasted episodes)
//
// lite-C safe:
// - No ternary operator (uses ifelse())
// - Header uses multiple file_append calls
// - strf format string is ONE literal
// ============================================================================
#define NPAR 12
#define MAXARMS 64
// Exploration / learning
#define EPSILON 0.10
#define ALPHA 0.10
// Communication strength (base)
#define LAMBDA_PAIR 0.30
// UCB constants
#define C_UCB 0.40
#define CP_UCB 0.40
// Pair gating (minimum samples before full lambda)
#define NMIN_PAIR 25
// Reward normalization/clipping
#define REWARD_CLIP 1000.0
// Safe constraint limit (keeps TF2 + windows from exploding warmup)
#define SAFE_LIMIT 2500
#define P_INT 0
#define P_VAR 1
// Parameter indices (RL-controlled)
#define P_TF1 0
#define P_TF2D 1
#define P_FDLEN1 2
#define P_SLOPELEN1 3
#define P_VOLLEN1 4
#define P_FDLEN2 5
#define P_SLOPELEN2 6
#define P_VOLLEN2 7
#define P_LEVSCALE 8
#define P_MAXLEV 9
#define P_PREDTHR 10
#define P_HOLDBARS 11
// -----------------------------
// RL storage
// -----------------------------
string ParName[NPAR];
int ParType[NPAR] =
{
P_INT, // TF1
P_INT, // TF2d
P_INT, // FDLen1
P_INT, // SlopeLen1
P_INT, // VolLen1
P_INT, // FDLen2
P_INT, // SlopeLen2
P_INT, // VolLen2
P_VAR, // LevScale
P_VAR, // MaxLev
P_VAR, // PredThr
P_INT // HoldBars
};
var ParMin[NPAR];
var ParMax[NPAR];
var ParStep[NPAR];
var Q[NPAR][MAXARMS];
int Ncnt[NPAR][MAXARMS];
int ArmsCount[NPAR];
int CurArm[NPAR];
// Totals for UCB (avoid summing every time)
int TotCnt[NPAR];
// -----------------------------
// Pairwise "communication" RESIDUAL tables
// (all are QpairRes; keep Npair and TotPair for gating+UCB)
// -----------------------------
// TF1 arm × TF2d arm
var Qpair_TF[MAXARMS][MAXARMS];
int Npair_TF[MAXARMS][MAXARMS];
int TotPair_TF;
// FDLen1 arm × SlopeLen1 arm
var Qpair_FD1SL1[MAXARMS][MAXARMS];
int Npair_FD1SL1[MAXARMS][MAXARMS];
int TotPair_FD1SL1;
// FDLen2 arm × SlopeLen2 arm
var Qpair_FD2SL2[MAXARMS][MAXARMS];
int Npair_FD2SL2[MAXARMS][MAXARMS];
int TotPair_FD2SL2;
// LevScale arm × PredThr arm
var Qpair_LS_PT[MAXARMS][MAXARMS];
int Npair_LS_PT[MAXARMS][MAXARMS];
int TotPair_LS_PT;
// MaxLev arm × HoldBars arm
var Qpair_ML_HB[MAXARMS][MAXARMS];
int Npair_ML_HB[MAXARMS][MAXARMS];
int TotPair_ML_HB;
// -----------------------------
// Utility
// -----------------------------
int calcArms(var mn, var mx, var stp)
{
if(stp <= 0) return 1;
int n = (int)floor((mx - mn)/stp + 1.000001);
if(n < 1) n = 1;
if(n > MAXARMS) n = MAXARMS;
return n;
}
var armValue(int p, int a)
{
var v = ParMin[p] + (var)a * ParStep[p];
if(v < ParMin[p]) v = ParMin[p];
if(v > ParMax[p]) v = ParMax[p];
if(ParType[p] == P_INT) v = (var)(int)(v + 0.5);
return v;
}
void initParNames()
{
ParName[P_TF1] = "TF1";
ParName[P_TF2D] = "TF2d";
ParName[P_FDLEN1] = "FDLen1";
ParName[P_SLOPELEN1] = "SlopeLen1";
ParName[P_VOLLEN1] = "VolLen1";
ParName[P_FDLEN2] = "FDLen2";
ParName[P_SLOPELEN2] = "SlopeLen2";
ParName[P_VOLLEN2] = "VolLen2";
ParName[P_LEVSCALE] = "LevScale";
ParName[P_MAXLEV] = "MaxLev";
ParName[P_PREDTHR] = "PredThr";
ParName[P_HOLDBARS] = "HoldBars";
}
// UCB bonus: c * sqrt( ln(1+Tot) / (1+n) )
var ucbBonus(int tot, int n, var c)
{
var lnT = log(1 + (var)tot);
return c * sqrt(lnT / (1 + (var)n));
}
// Pair gating: lambda_eff = lambda * min(1, nPair/nMin)
var lambdaEff(int nPair)
{
var frac = (var)nPair / (var)NMIN_PAIR;
if(frac > 1) frac = 1;
if(frac < 0) frac = 0;
return LAMBDA_PAIR * frac;
}
// -----------------------------
// Single-arm selection with UCB + epsilon explore
// -----------------------------
int bestArm_UCB(int p)
{
int a, best = 0;
var bestScore = Q[p][0] + ucbBonus(TotCnt[p], Ncnt[p][0], C_UCB);
for(a=1; a<ArmsCount[p]; a++)
{
var score = Q[p][a] + ucbBonus(TotCnt[p], Ncnt[p][a], C_UCB);
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_UCB(int p)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[p]);
return bestArm_UCB(p);
}
// Update single Q
void updateArm(int p, int a, var reward)
{
Q[p][a] = Q[p][a] + ALPHA*(reward - Q[p][a]);
Ncnt[p][a] += 1;
TotCnt[p] += 1;
}
// -----------------------------
// Residual pair update: QpairRes += alpha * (r - (QA + QB + QpairRes))
// -----------------------------
void updatePairRes_TF(int a1, int a2, var reward)
{
var pred = Q[P_TF1][a1] + Q[P_TF2D][a2] + Qpair_TF[a1][a2];
Qpair_TF[a1][a2] = Qpair_TF[a1][a2] + ALPHA*(reward - pred);
Npair_TF[a1][a2] += 1;
TotPair_TF += 1;
}
void updatePairRes_FD1SL1(int a1, int a2, var reward)
{
var pred = Q[P_FDLEN1][a1] + Q[P_SLOPELEN1][a2] + Qpair_FD1SL1[a1][a2];
Qpair_FD1SL1[a1][a2] = Qpair_FD1SL1[a1][a2] + ALPHA*(reward - pred);
Npair_FD1SL1[a1][a2] += 1;
TotPair_FD1SL1 += 1;
}
void updatePairRes_FD2SL2(int a1, int a2, var reward)
{
var pred = Q[P_FDLEN2][a1] + Q[P_SLOPELEN2][a2] + Qpair_FD2SL2[a1][a2];
Qpair_FD2SL2[a1][a2] = Qpair_FD2SL2[a1][a2] + ALPHA*(reward - pred);
Npair_FD2SL2[a1][a2] += 1;
TotPair_FD2SL2 += 1;
}
void updatePairRes_LS_PT(int a1, int a2, var reward)
{
var pred = Q[P_LEVSCALE][a1] + Q[P_PREDTHR][a2] + Qpair_LS_PT[a1][a2];
Qpair_LS_PT[a1][a2] = Qpair_LS_PT[a1][a2] + ALPHA*(reward - pred);
Npair_LS_PT[a1][a2] += 1;
TotPair_LS_PT += 1;
}
void updatePairRes_ML_HB(int a1, int a2, var reward)
{
var pred = Q[P_MAXLEV][a1] + Q[P_HOLDBARS][a2] + Qpair_ML_HB[a1][a2];
Qpair_ML_HB[a1][a2] = Qpair_ML_HB[a1][a2] + ALPHA*(reward - pred);
Npair_ML_HB[a1][a2] += 1;
TotPair_ML_HB += 1;
}
// -----------------------------
// Pair-aware selection with UCB + gating + residual pair
// score(b) = Q_B(b)+UCB + lambda_eff*QpairRes(a,b) + pairUCB
// -----------------------------
int bestArm_TF2d_given_TF1(int armTF1)
{
int a, best = 0;
int n0 = Npair_TF[armTF1][0];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_TF[armTF1][0] + ucbBonus(TotPair_TF, n0, CP_UCB);
var bestScore =
Q[P_TF2D][0] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][0], C_UCB) + pairB0;
for(a=1; a<ArmsCount[P_TF2D]; a++)
{
int n = Npair_TF[armTF1][a];
var lam = lambdaEff(n);
var pairB = lam*Qpair_TF[armTF1][a] + ucbBonus(TotPair_TF, n, CP_UCB);
var score =
Q[P_TF2D][a] + ucbBonus(TotCnt[P_TF2D], Ncnt[P_TF2D][a], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_TF2d_given_TF1(int armTF1)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_TF2D]);
return bestArm_TF2d_given_TF1(armTF1);
}
// TF1 symmetric best-response score:
// scoreTF1(a1) = Q_TF1(a1)+UCB + max_a2( lambda_eff*QpairRes(a1,a2) + pairUCB )
int bestArm_TF1_symmetric()
{
int a1, best = 0;
// compute bestScore for a1=0
var bestPairTerm0 = 0;
int a2;
for(a2=0; a2<ArmsCount[P_TF2D]; a2++)
{
int n = Npair_TF[0][a2];
var lam = lambdaEff(n);
var term = lam*Qpair_TF[0][a2] + ucbBonus(TotPair_TF, n, CP_UCB);
if(a2==0) bestPairTerm0 = term;
else if(term > bestPairTerm0) bestPairTerm0 = term;
}
var bestScore =
Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + bestPairTerm0;
for(a1=1; a1<ArmsCount[P_TF1]; a1++)
{
var bestPairTerm = 0;
for(a2=0; a2<ArmsCount[P_TF2D]; a2++)
{
int n2 = Npair_TF[a1][a2];
var lam2 = lambdaEff(n2);
var term2 = lam2*Qpair_TF[a1][a2] + ucbBonus(TotPair_TF, n2, CP_UCB);
if(a2==0) bestPairTerm = term2;
else if(term2 > bestPairTerm) bestPairTerm = term2;
}
var score =
Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + bestPairTerm;
if(score > bestScore)
{
bestScore = score;
best = a1;
}
}
return best;
}
int selectArm_TF1_symmetric()
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_TF1]);
return bestArm_TF1_symmetric();
}
// Coordinate descent step: re-pick TF1 given TF2d
int bestArm_TF1_given_TF2d(int armTF2d)
{
int a1, best = 0;
int n0 = Npair_TF[0][armTF2d];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_TF[0][armTF2d] + ucbBonus(TotPair_TF, n0, CP_UCB);
var bestScore =
Q[P_TF1][0] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][0], C_UCB) + pairB0;
for(a1=1; a1<ArmsCount[P_TF1]; a1++)
{
int n = Npair_TF[a1][armTF2d];
var lam = lambdaEff(n);
var pairB = lam*Qpair_TF[a1][armTF2d] + ucbBonus(TotPair_TF, n, CP_UCB);
var score =
Q[P_TF1][a1] + ucbBonus(TotCnt[P_TF1], Ncnt[P_TF1][a1], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a1;
}
}
return best;
}
// Generic pair selectors for the other 4 pairs (FD1-SL1, FD2-SL2, LS-PT, ML-HB)
int bestArm_SlopeLen1_given_FDLen1(int armFD1)
{
int a, best = 0;
int n0 = Npair_FD1SL1[armFD1][0];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_FD1SL1[armFD1][0] + ucbBonus(TotPair_FD1SL1, n0, CP_UCB);
var bestScore =
Q[P_SLOPELEN1][0] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][0], C_UCB) + pairB0;
for(a=1; a<ArmsCount[P_SLOPELEN1]; a++)
{
int n = Npair_FD1SL1[armFD1][a];
var lam = lambdaEff(n);
var pairB = lam*Qpair_FD1SL1[armFD1][a] + ucbBonus(TotPair_FD1SL1, n, CP_UCB);
var score =
Q[P_SLOPELEN1][a] + ucbBonus(TotCnt[P_SLOPELEN1], Ncnt[P_SLOPELEN1][a], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_SlopeLen1_given_FDLen1(int armFD1)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_SLOPELEN1]);
return bestArm_SlopeLen1_given_FDLen1(armFD1);
}
int bestArm_SlopeLen2_given_FDLen2(int armFD2)
{
int a, best = 0;
int n0 = Npair_FD2SL2[armFD2][0];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_FD2SL2[armFD2][0] + ucbBonus(TotPair_FD2SL2, n0, CP_UCB);
var bestScore =
Q[P_SLOPELEN2][0] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][0], C_UCB) + pairB0;
for(a=1; a<ArmsCount[P_SLOPELEN2]; a++)
{
int n = Npair_FD2SL2[armFD2][a];
var lam = lambdaEff(n);
var pairB = lam*Qpair_FD2SL2[armFD2][a] + ucbBonus(TotPair_FD2SL2, n, CP_UCB);
var score =
Q[P_SLOPELEN2][a] + ucbBonus(TotCnt[P_SLOPELEN2], Ncnt[P_SLOPELEN2][a], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_SlopeLen2_given_FDLen2(int armFD2)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_SLOPELEN2]);
return bestArm_SlopeLen2_given_FDLen2(armFD2);
}
int bestArm_PredThr_given_LevScale(int armLS)
{
int a, best = 0;
int n0 = Npair_LS_PT[armLS][0];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_LS_PT[armLS][0] + ucbBonus(TotPair_LS_PT, n0, CP_UCB);
var bestScore =
Q[P_PREDTHR][0] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][0], C_UCB) + pairB0;
for(a=1; a<ArmsCount[P_PREDTHR]; a++)
{
int n = Npair_LS_PT[armLS][a];
var lam = lambdaEff(n);
var pairB = lam*Qpair_LS_PT[armLS][a] + ucbBonus(TotPair_LS_PT, n, CP_UCB);
var score =
Q[P_PREDTHR][a] + ucbBonus(TotCnt[P_PREDTHR], Ncnt[P_PREDTHR][a], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_PredThr_given_LevScale(int armLS)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_PREDTHR]);
return bestArm_PredThr_given_LevScale(armLS);
}
int bestArm_HoldBars_given_MaxLev(int armML)
{
int a, best = 0;
int n0 = Npair_ML_HB[armML][0];
var lam0 = lambdaEff(n0);
var pairB0 = lam0*Qpair_ML_HB[armML][0] + ucbBonus(TotPair_ML_HB, n0, CP_UCB);
var bestScore =
Q[P_HOLDBARS][0] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][0], C_UCB) + pairB0;
for(a=1; a<ArmsCount[P_HOLDBARS]; a++)
{
int n = Npair_ML_HB[armML][a];
var lam = lambdaEff(n);
var pairB = lam*Qpair_ML_HB[armML][a] + ucbBonus(TotPair_ML_HB, n, CP_UCB);
var score =
Q[P_HOLDBARS][a] + ucbBonus(TotCnt[P_HOLDBARS], Ncnt[P_HOLDBARS][a], C_UCB) + pairB;
if(score > bestScore)
{
bestScore = score;
best = a;
}
}
return best;
}
int selectArm_HoldBars_given_MaxLev(int armML)
{
if(random(1) < EPSILON)
return (int)random((var)ArmsCount[P_HOLDBARS]);
return bestArm_HoldBars_given_MaxLev(armML);
}
// -----------------------------
// Init ranges + tables
// -----------------------------
void initParamsRL()
{
// Ranges roughly matching your old optimize() ranges/steps
ParMin[P_TF1] = 1; ParMax[P_TF1] = 3; ParStep[P_TF1] = 1;
ParMin[P_TF2D] = 1; ParMax[P_TF2D] = 11; ParStep[P_TF2D] = 1;
ParMin[P_FDLEN1] = 20; ParMax[P_FDLEN1] = 220; ParStep[P_FDLEN1] = 5;
ParMin[P_SLOPELEN1] = 20; ParMax[P_SLOPELEN1] = 200; ParStep[P_SLOPELEN1] = 5;
ParMin[P_VOLLEN1] = 20; ParMax[P_VOLLEN1] = 200; ParStep[P_VOLLEN1] = 5;
ParMin[P_FDLEN2] = 10; ParMax[P_FDLEN2] = 160; ParStep[P_FDLEN2] = 5;
ParMin[P_SLOPELEN2] = 10; ParMax[P_SLOPELEN2] = 140; ParStep[P_SLOPELEN2] = 5;
ParMin[P_VOLLEN2] = 10; ParMax[P_VOLLEN2] = 140; ParStep[P_VOLLEN2] = 5;
ParMin[P_LEVSCALE] = 2; ParMax[P_LEVSCALE] = 30; ParStep[P_LEVSCALE] = 1;
ParMin[P_MAXLEV] = 0.1; ParMax[P_MAXLEV] = 1.0; ParStep[P_MAXLEV] = 0.1;
ParMin[P_PREDTHR] = 0.0; ParMax[P_PREDTHR] = 0.20; ParStep[P_PREDTHR] = 0.01;
ParMin[P_HOLDBARS] = 1; ParMax[P_HOLDBARS] = 30; ParStep[P_HOLDBARS] = 1;
int p, a;
for(p=0; p<NPAR; p++)
{
ArmsCount[p] = calcArms(ParMin[p], ParMax[p], ParStep[p]);
CurArm[p] = 0;
TotCnt[p] = 0;
for(a=0; a<ArmsCount[p]; a++)
{
Q[p][a] = 0;
Ncnt[p][a] = 0;
}
}
// Init pair tables
int i,j;
TotPair_TF = 0;
TotPair_FD1SL1 = 0;
TotPair_FD2SL2 = 0;
TotPair_LS_PT = 0;
TotPair_ML_HB = 0;
for(i=0; i<MAXARMS; i++)
{
for(j=0; j<MAXARMS; j++)
{
Qpair_TF[i][j] = 0; Npair_TF[i][j] = 0;
Qpair_FD1SL1[i][j] = 0; Npair_FD1SL1[i][j] = 0;
Qpair_FD2SL2[i][j] = 0; Npair_FD2SL2[i][j] = 0;
Qpair_LS_PT[i][j] = 0; Npair_LS_PT[i][j] = 0;
Qpair_ML_HB[i][j] = 0; Npair_ML_HB[i][j] = 0;
}
}
}
// -----------------------------
// Safe constraints helper
// -----------------------------
int safeOK(int TF1_arm, int TF2d_arm, int FDLen2_arm, int SlopeLen2_arm, int VolLen2_arm)
{
int TF1v = (int)armValue(P_TF1, TF1_arm);
int TF2dv = (int)armValue(P_TF2D, TF2d_arm);
int TF2v = TF1v + TF2dv;
if(TF2v > 12) TF2v = 12;
int FD2v = (int)armValue(P_FDLEN2, FDLen2_arm);
int SL2v = (int)armValue(P_SLOPELEN2, SlopeLen2_arm);
int VL2v = (int)armValue(P_VOLLEN2, VolLen2_arm);
int mx = FD2v;
if(SL2v > mx) mx = SL2v;
if(VL2v > mx) mx = VL2v;
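// TF2 * max(window2) <= SAFE_LIMIT
// Note: with the ranges set in initParamsRL() the product peaks at 12*160 = 1920,
// so a SAFE_LIMIT of 2500 never rejects anything; lower it to make the check bind.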
if(TF2v * mx > SAFE_LIMIT)
return 0;
return 1;
}
// -----------------------------
// COMMUNICATING parameter pick order (symmetric + extra pairs + coord descent)
// -----------------------------
void pickParams()
{
int p;
// 1) TF1 symmetric pick (anticipate TF2d best response)
CurArm[P_TF1] = selectArm_TF1_symmetric();
// 2) TF2d conditioned on TF1 (pair-aware)
CurArm[P_TF2D] = selectArm_TF2d_given_TF1(CurArm[P_TF1]);
// 3) Coordinate descent: re-pick TF1 given TF2d (tiny negotiation step)
if(random(1) >= EPSILON) // keep some stochasticity
CurArm[P_TF1] = bestArm_TF1_given_TF2d(CurArm[P_TF2D]);
// 4) FD/Slope pair TF1
CurArm[P_FDLEN1] = selectArm_UCB(P_FDLEN1);
CurArm[P_SLOPELEN1] = selectArm_SlopeLen1_given_FDLen1(CurArm[P_FDLEN1]);
// 5) FD/Slope pair TF2
CurArm[P_FDLEN2] = selectArm_UCB(P_FDLEN2);
CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FDLen2(CurArm[P_FDLEN2]);
// 6) Remaining independent params: only the volatility windows have no pair partner.
// (LevScale, PredThr, MaxLev and HoldBars are all picked in step 7, so drawing them
// here as well would just be overwritten.)
CurArm[P_VOLLEN1] = selectArm_UCB(P_VOLLEN1);
CurArm[P_VOLLEN2] = selectArm_UCB(P_VOLLEN2);
// 7) Extra pairs communication:
// LevScale -> PredThr conditioned
CurArm[P_LEVSCALE] = selectArm_UCB(P_LEVSCALE);
CurArm[P_PREDTHR] = selectArm_PredThr_given_LevScale(CurArm[P_LEVSCALE]);
// MaxLev -> HoldBars conditioned
CurArm[P_MAXLEV] = selectArm_UCB(P_MAXLEV);
CurArm[P_HOLDBARS] = selectArm_HoldBars_given_MaxLev(CurArm[P_MAXLEV]);
// 8) Safe constraints: if violated, resample the most relevant arms a few times
int tries = 0;
while(tries < 20)
{
if(safeOK(CurArm[P_TF1], CurArm[P_TF2D], CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], CurArm[P_VOLLEN2]))
break;
// resample TF2d (conditioned) and TF2-side windows
CurArm[P_TF2D] = selectArm_TF2d_given_TF1(CurArm[P_TF1]);
CurArm[P_FDLEN2] = selectArm_UCB(P_FDLEN2);
CurArm[P_SLOPELEN2] = selectArm_SlopeLen2_given_FDLen2(CurArm[P_FDLEN2]);
CurArm[P_VOLLEN2] = selectArm_UCB(P_VOLLEN2);
tries += 1;
}
}
// -----------------------------
// Feature helpers (lite-C safe)
// -----------------------------
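// Katz fractal dimension over the last N values of P.
// L = total path length (sum of absolute one-step changes),
// d = largest distance of any point from the newest value P[0].
// Returns 1.0 for degenerate input; the result is clamped to [1,2].
function fractalDimKatz(vars P, int N)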
{
if(N < 2) return 1.0;
var L = 0;
int i;
for(i=0; i<N-1; i++)
L += abs(P[i] - P[i+1]);
var d = 0;
for(i=1; i<N; i++)
{
var di = abs(P[i] - P[0]);
if(di > d) d = di;
}
if(L <= 0 || d <= 0) return 1.0;
var n = (var)N;
var fd = log(n) / (log(n) + log(d / L));
return clamp(fd, 1.0, 2.0);
}
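// Least-squares slope of P against the series index i = 0..N-1.
// Note: in a Zorro series index 0 is the newest bar, so the slope is measured
// backward in time (steadily rising prices give a negative value).
function linSlope(vars P, int N)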
{
if(N < 2) return 0;
var sumT=0, sumP=0, sumTT=0, sumTP=0;
int i;
for(i=0; i<N; i++)
{
var t = (var)i;
sumT += t;
sumP += P[i];
sumTT += t*t;
sumTP += t*P[i];
}
var denom = (var)N*sumTT - sumT*sumT;
if(abs(denom) < 1e-12) return 0;
return ((var)N*sumTP - sumT*sumP) / denom;
}
// Sample standard deviation (N-1 denominator) of the last N returns in R.
function stdevReturns(vars R, int N)
{
if(N < 2) return 0;
var mean = 0;
int i;
for(i=0; i<N; i++) mean += R[i];
mean /= (var)N;
var v = 0;
for(i=0; i<N; i++)
{
var d = R[i] - mean;
v += d*d;
}
v /= (var)(N-1);
return sqrt(max(0, v));
}
// ============================================================================
// RUN
// ============================================================================
function run()
{
BarPeriod = 5;
StartDate = 20100101;
EndDate = 0;
set(PLOTNOW|RULES|LOGFILE);
asset("EUR/USD");
algo("FRACTAL2TF_EUR_RL_v2_COMM_UCB_SYM");
var eps = 1e-12;
DataSplit = 50;
// LookBack must cover the MAX possible TF * MAX window.
LookBack = 3000;
// One-time init
static int Inited = 0;
static int PrevOpenTotal = 0;
static var LastBalance = 0;
static int Flip = 0;
string LogFN = "Log\\FRACTAL2TF_EUR_RL_v2_COMM_UCB_SYM.csv";
if(is(FIRSTINITRUN))
{
file_delete(LogFN);
file_append(LogFN,"Date,Time,Mode,Bar,");
file_append(LogFN,"TF1,TF2,TF2d,FDLen1,SlopeLen1,VolLen1,FDLen2,SlopeLen2,VolLen2,");
file_append(LogFN,"LevScale,MaxLev,PredThr,HoldBars,");
file_append(LogFN,"FD1,Slope1,Vol1,FD2,Slope2,Vol2,");
file_append(LogFN,"PredL,PredS,Pred,Lev,RewardNorm\n");
Inited = 0;
PrevOpenTotal = 0;
LastBalance = 0;
Flip = 0;
}
if(!Inited)
{
initParNames();
initParamsRL();
pickParams();
LastBalance = Balance;
PrevOpenTotal = NumOpenTotal;
Inited = 1;
}
// Convert chosen arms -> parameter values (current episode)
int TF1 = (int)armValue(P_TF1, CurArm[P_TF1]);
int TF2d = (int)armValue(P_TF2D, CurArm[P_TF2D]);
int TF2 = TF1 + TF2d;
if(TF2 > 12) TF2 = 12;
int FDLen1 = (int)armValue(P_FDLEN1, CurArm[P_FDLEN1]);
int SlopeLen1 = (int)armValue(P_SLOPELEN1, CurArm[P_SLOPELEN1]);
int VolLen1 = (int)armValue(P_VOLLEN1, CurArm[P_VOLLEN1]);
int FDLen2 = (int)armValue(P_FDLEN2, CurArm[P_FDLEN2]);
int SlopeLen2 = (int)armValue(P_SLOPELEN2, CurArm[P_SLOPELEN2]);
int VolLen2 = (int)armValue(P_VOLLEN2, CurArm[P_VOLLEN2]);
var LevScale = armValue(P_LEVSCALE, CurArm[P_LEVSCALE]);
var MaxLev = armValue(P_MAXLEV, CurArm[P_MAXLEV]);
var PredThr = armValue(P_PREDTHR, CurArm[P_PREDTHR]);
int HoldBars = (int)armValue(P_HOLDBARS, CurArm[P_HOLDBARS]);
// Build series (2 TF)
TimeFrame = TF1;
vars P1 = series(priceClose());
vars R1 = series(log(max(eps,P1[0]) / max(eps,P1[1])));
vars FD1S = series(0);
vars Slope1S = series(0);
vars Vol1S = series(0);
TimeFrame = TF2;
vars P2 = series(priceClose());
vars R2 = series(log(max(eps,P2[0]) / max(eps,P2[1])));
vars FD2S = series(0);
vars Slope2S = series(0);
vars Vol2S = series(0);
TimeFrame = 1;
// Warmup gate based on current episode params
int Need1 = max(max(FDLen1, SlopeLen1), VolLen1) + 5;
int Need2 = max(max(FDLen2, SlopeLen2), VolLen2) + 5;
int WarmupBars = max(TF1*Need1, TF2*Need2) + 10;
if(Bar < WarmupBars)
return;
// Do NOT block TRAIN during LOOKBACK
if(is(LOOKBACK) && !Train)
return;
// Compute features
TimeFrame = TF1;
FD1S[0] = fractalDimKatz(P1, FDLen1);
Slope1S[0] = linSlope(P1, SlopeLen1);
Vol1S[0] = stdevReturns(R1, VolLen1);
TimeFrame = TF2;
FD2S[0] = fractalDimKatz(P2, FDLen2);
Slope2S[0] = linSlope(P2, SlopeLen2);
Vol2S[0] = stdevReturns(R2, VolLen2);
TimeFrame = 1;
var Sig[6];
Sig[0] = FD1S[0];
Sig[1] = Slope1S[0];
Sig[2] = Vol1S[0];
Sig[3] = FD2S[0];
Sig[4] = Slope2S[0];
Sig[5] = Vol2S[0];
// Trading logic
int MethodBase = PERCEPTRON + FUZZY + BALANCED;
int MethodRet = MethodBase + RETURNS;
var PredL=0, PredS=0, Pred=0, Lev=0;
// time-based exit
if(NumOpenTotal > 0)
for(open_trades)
if(TradeIsOpen && TradeBars >= HoldBars)
exitTrade(ThisTrade);
if(Train)
{
// Forced alternating trades so ML always gets samples
if(NumOpenTotal == 0)
{
Flip = 1 - Flip;
LastBalance = Balance;
if(Flip)
{
adviseLong(MethodRet, 0, Sig, 6);
Lots = 1; enterLong();
}
else
{
adviseShort(MethodRet, 0, Sig, 6);
Lots = 1; enterShort();
}
}
}
else
{
PredL = adviseLong(MethodBase, 0, Sig, 6);
PredS = adviseShort(MethodBase, 0, Sig, 6);
// Bootstrap if model has no signal yet
if(NumOpenTotal == 0 && PredL == 0 && PredS == 0)
{
LastBalance = Balance;
var s = Sig[1] + Sig[4];
if(s > 0) { Lots=1; enterLong(); }
else if(s < 0) { Lots=1; enterShort(); }
else
{
if(random(1) < 0.5) { Lots=1; enterLong(); }
else { Lots=1; enterShort(); }
}
}
else
{
Pred = PredL - PredS;
Lev = clamp(Pred * LevScale, -MaxLev, MaxLev);
if(Lev > PredThr) { exitShort(); Lots=1; enterLong(); }
else if(Lev < -PredThr) { exitLong(); Lots=1; enterShort(); }
else { exitLong(); exitShort(); }
}
}
// RL reward + update (episode ends when we go from having positions to flat)
var RewardNorm = 0;
if(PrevOpenTotal > 0 && NumOpenTotal == 0)
{
var dBal = Balance - LastBalance;
// reward normalization: per (1+HoldBars)
RewardNorm = dBal / (1 + (var)HoldBars);
// clip reward
if(RewardNorm > REWARD_CLIP) RewardNorm = REWARD_CLIP;
if(RewardNorm < -REWARD_CLIP) RewardNorm = -REWARD_CLIP;
// --- Update singles (always, even if RewardNorm==0, for counts/UCB) ---
int p;
for(p=0; p<NPAR; p++)
updateArm(p, CurArm[p], RewardNorm);
// --- Update residual pair tables (COMMUNICATION LEARNING) ---
updatePairRes_TF(CurArm[P_TF1], CurArm[P_TF2D], RewardNorm);
updatePairRes_FD1SL1(CurArm[P_FDLEN1], CurArm[P_SLOPELEN1], RewardNorm);
updatePairRes_FD2SL2(CurArm[P_FDLEN2], CurArm[P_SLOPELEN2], RewardNorm);
updatePairRes_LS_PT(CurArm[P_LEVSCALE], CurArm[P_PREDTHR], RewardNorm);
updatePairRes_ML_HB(CurArm[P_MAXLEV], CurArm[P_HOLDBARS], RewardNorm);
// Next episode parameters
pickParams();
LastBalance = Balance;
}
PrevOpenTotal = NumOpenTotal;
// Logging
string ModeStr = "Trade";
if(Train) ModeStr = "Train";
else if(Test) ModeStr = "Test";
file_append(LogFN, strf("%04i-%02i-%02i,%02i:%02i,%s,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%.6f,%.3f,%.6f,%d,%.6f,%.8f,%.8f,%.6f,%.8f,%.8f,%.8f,%.8f,%.8f,%.6f,%.6f\n",
year(0),month(0),day(0), hour(0),minute(0),
ModeStr, Bar,
TF1, TF2, TF2d,
FDLen1, SlopeLen1, VolLen1,
FDLen2, SlopeLen2, VolLen2,
LevScale, MaxLev, PredThr, HoldBars,
Sig[0], Sig[1], Sig[2], Sig[3], Sig[4], Sig[5],
PredL, PredS, Pred, Lev, RewardNorm
));
// Plots
plot("FD_TF1", Sig[0], NEW, 0);
plot("FD_TF2", Sig[3], 0, 0);
plot("Slope_TF1", Sig[1], 0, 0);
plot("Slope_TF2", Sig[4], 0, 0);
plot("Vol_TF1", Sig[2], 0, 0);
plot("Vol_TF2", Sig[5], 0, 0);
plot("Pred", Pred, 0, 0);
plot("Lev", Lev, 0, 0);
plot("RewardN", RewardNorm, 0, 0);
}