ZorroGPT

TorchBridge LineWorld Learner (Cuda Torch) [Re: TipmyPip] #489145
4 hours ago

Joined: Sep 2017
Posts: 194

TipmyPip

OP
Member

This source file is a training demonstration that runs inside a Zorro plug-in. Picture a workshop where a learner explores a hallway and learns which direction leads to a goal. The opening section hardens the include environment so that Windows macros and Zorro names do not clash with the LibTorch headers. It defines WIN32_LEAN_AND_MEAN and NOMINMAX, includes LibTorch first, then includes the Zorro header with 'at' temporarily renamed to zorro_at so that it does not collide with LibTorch's 'at' namespace. Afterwards it undefines leftover macros such as min, max, ref, swap, and abs.

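Next comes LineWorld, a tiny one-dimensional environment. It stores the current position, a step counter, and a limit on how long an episode may run. reset() puts the learner back at position 0 and returns the initial state. state() builds a one-hot tensor in which only the current position is set to 1. step() applies an action (0 moves left, anything else moves right), clamps the position to the range 0 to n - 1, increments the counter, and returns the next state along with a reward and a done flag. The reward is 1.0 on reaching the last cell and -0.01 otherwise, and the episode ends on reaching the goal or hitting the step limit.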
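
To make the step rule easier to follow without LibTorch installed, here is a minimal torch-free sketch of the same idea. LineWorldSketch is a hypothetical name and std::vector<float> stands in for the one-hot tensor. It mirrors the logic described above but is not the code from the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Torch-free stand-in for the LineWorld idea: a learner on cells 0..n-1.
// (Hypothetical illustration; the real listing returns torch::Tensor.)
struct LineWorldSketch {
  int n;         // number of cells; the goal is cell n - 1
  int maxSteps;  // episode length limit
  int pos = 0;
  int steps = 0;

  LineWorldSketch(int n_, int maxSteps_) : n(n_), maxSteps(maxSteps_) {
    assert(n > 1 && maxSteps > 0);
  }

  // One-hot encoding: every entry is 0 except the current position.
  std::vector<float> state() const {
    std::vector<float> s(static_cast<std::size_t>(n), 0.0f);
    s[static_cast<std::size_t>(pos)] = 1.0f;
    return s;
  }

  std::vector<float> reset() {
    pos = 0;
    steps = 0;
    return state();
  }

  struct StepResult {
    std::vector<float> next_state;
    float reward;
    bool done;
  };

  // Action 0 moves left, any other action moves right; the position is clamped.
  StepResult step(int action) {
    pos = (action == 0) ? std::max(0, pos - 1) : std::min(n - 1, pos + 1);
    ++steps;
    const bool reached = (pos == n - 1);
    const bool timeout = (steps >= maxSteps);
    return { state(), reached ? 1.0f : -0.01f, reached || timeout };
  }
};
```

With n = 4, three right moves from a fresh reset land on the goal cell and end the episode with reward 1.0, while a left move at position 0 leaves the learner where it is and costs 0.01.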

Experience is recorded in a replay buffer, a circular notebook with fixed capacity. Each entry holds one transition: state, action, reward, next state, and a termination flag. Once the buffer is full, new entries overwrite the oldest ones. Sampling draws random entries (with replacement) so that training batches are less correlated and learning is more stable.
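
The circular overwrite is the part that tends to cause confusion, so here is a small generic sketch of it. RingBufferSketch and its member names are made up for illustration, and T stands in for the transition record. It tracks a count instead of the filled flag used in the listing, which is just a different way of expressing the same bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fixed-capacity ring buffer; T stands in for a transition record.
// (Hypothetical illustration, not the ReplayBuffer from the listing.)
template <typename T>
struct RingBufferSketch {
  std::vector<T> data;
  std::size_t capacity;
  std::size_t next = 0;   // slot that the next push will write
  std::size_t count = 0;  // number of valid entries, at most capacity

  explicit RingBufferSketch(std::size_t cap) : data(cap), capacity(cap) {
    assert(cap > 0);
  }

  // Write into the current slot, then advance and wrap around.
  void push(const T& item) {
    data[next] = item;
    next = (next + 1) % capacity;
    if (count < capacity) ++count;
  }

  std::size_t size() const { return count; }

  // Uniform sampling with replacement from the valid entries.
  std::vector<T> sample(std::size_t batch, std::mt19937& rng) const {
    assert(count > 0);
    std::uniform_int_distribution<std::size_t> dist(0, count - 1);
    std::vector<T> out;
    out.reserve(batch);
    for (std::size_t i = 0; i < batch; ++i) out.push_back(data[dist(rng)]);
    return out;
  }
};
```

With capacity 3, pushing 1, 2, 3, 4 leaves the slots holding 4, 2, 3: the fourth push wraps around and replaces the oldest entry.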

The learning brain is a neural network module with three linear layers, with ReLU activations after the first two. An Agent owns a live network and a target network. The live network is optimized, while the target network is refreshed by copying the live weights every 20 episodes. Acting uses an epsilon-greedy rule: with probability epsilon it explores randomly, otherwise it chooses the action with the highest predicted value. Training stacks a batch of tensors, gathers the predicted value of each chosen action, computes a bootstrapped target from the target network (reward plus gamma times the best next-state value, with the bootstrap term zeroed on terminal transitions), measures the error with mean squared error loss, and updates the weights using Adam.
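
For readers who want the two rules in isolation, here is a scalar sketch of epsilon-greedy selection and the bootstrapped target for a single transition. The function names epsilon_greedy and td_target are my own, and plain vectors of Q-values replace the network outputs. The listing does the same arithmetic on batched tensors.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Epsilon-greedy choice over one row of Q-values:
// explore with probability eps, otherwise take the argmax.
inline int epsilon_greedy(const std::vector<float>& q, float eps, std::mt19937& rng) {
  assert(!q.empty());
  std::uniform_real_distribution<float> u(0.0f, 1.0f);
  if (u(rng) < eps) {
    std::uniform_int_distribution<int> pick(0, static_cast<int>(q.size()) - 1);
    return pick(rng);
  }
  return static_cast<int>(std::max_element(q.begin(), q.end()) - q.begin());
}

// Bootstrapped target for one transition:
//   y = r + gamma * max_a Q_target(s', a) * (1 - done)
// next_q_target holds the target network's values for the next state.
inline float td_target(float r, float gamma,
                       const std::vector<float>& next_q_target, bool done) {
  assert(!next_q_target.empty());
  const float best = *std::max_element(next_q_target.begin(), next_q_target.end());
  return r + gamma * best * (done ? 0.0f : 1.0f);
}
```

On a terminal transition the target is just the reward, which is why the done flag is stored as 0 or 1 in the batch and multiplied in.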

Finally, the exported main function is what Zorro calls. It sets up logging, seeds the random generators, trains two agents side by side in the same episode loop, prints progress every 25 episodes, and wraps everything in exception handling so that an error thrown by LibTorch does not escape into the host process.
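
The exception wrapping can be factored into a small helper if you reuse the pattern in other plug-ins. This is only a sketch: run_guarded is a hypothetical name, and it only distinguishes std::exception from everything else, whereas the listing also catches c10::Error separately.

```cpp
#include <cassert>
#include <cstdio>
#include <exception>

// Runs `body` and converts any exception into a nonzero return code,
// so nothing propagates to the caller (standing in for the host process).
// (Hypothetical helper for illustration.)
template <typename Body>
int run_guarded(Body&& body, std::FILE* log) {
  assert(log != nullptr);
  try {
    body();
    return 0;
  } catch (const std::exception& e) {
    std::fprintf(log, "std::exception: %s\n", e.what());
  } catch (...) {
    std::fprintf(log, "Unknown exception.\n");
  }
  std::fflush(log);
  return 1;
}
```

The idea would be to call it as run_guarded([] { /* training code */ }, stderr) from the exported entry point and hand the return value back to the host.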

Small note about DLL loading: if compile64.bat does not add the LibTorch lib directory to PATH for the Zorro runtime, Windows cannot find the torch and c10 DLLs, and the plug-in may fail to load.

Code

// ============================================================
//  Zorro DLL: LibTorch demo using ONLY main() (no run())
// ============================================================

#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif

#ifndef NOMINMAX
#define NOMINMAX
#endif

// #include <windows.h>   // <-- REMOVED (not needed anymore if you don't call WinAPI here)

// --- 1) Include LibTorch FIRST ---
#include <torch/torch.h>

#include <vector>
#include <random>
#include <tuple>
#include <algorithm>
#include <cstdio>
#include <exception>
#include <cstdlib>

// --- 2) Include Zorro AFTER torch, rename Zorro's 'at' to avoid conflict ---
#define at zorro_at
#ifdef LOG
#undef LOG
#endif
#include <zorro.h>
#undef at

// --- 3) Cleanup common macro landmines ---
#ifdef min
#undef min
#endif
#ifdef max
#undef max
#endif
#ifdef ref
#undef ref
#endif
#ifdef swap
#undef swap
#endif
#ifdef abs
#undef abs
#endif

// ---------- Tiny 1D environment ----------
struct LineWorld {
  int n;
  int pos = 0;
  int maxSteps;
  int steps = 0;

  LineWorld(int n_, int maxSteps_) : n(n_), maxSteps(maxSteps_) {}

  torch::Tensor reset() {
    pos = 0;
    steps = 0;
    return state();
  }

  torch::Tensor state() const {
    auto s = torch::zeros({n}, torch::kFloat32);
    s.index_put_({pos}, 1.0f);
    return s;
  }

  struct StepResult {
    torch::Tensor next_state;
    float reward;
    bool done;
  };

  StepResult step(int action) {
    if (action == 0) pos = std::max(0, pos - 1);
    else             pos = std::min(n - 1, pos + 1);

    steps++;

    bool reached = (pos == n - 1);
    bool timeout = (steps >= maxSteps);
    bool done = reached || timeout;

    float reward = reached ? 1.0f : -0.01f;
    return { state(), reward, done };
  }
};

// ---------- Simple replay buffer ----------
struct Transition {
  torch::Tensor s;
  int a;
  float r;
  torch::Tensor ns;
  bool done;
};

struct ReplayBuffer {
  std::vector<Transition> data;
  size_t capacity;
  size_t idx = 0;
  bool filled = false;

  ReplayBuffer(size_t cap) : capacity(cap) {
    data.resize(capacity);
  }

  void push(const Transition& t) {
    data[idx] = t;
    idx = (idx + 1) % capacity;
    if (idx == 0) filled = true;
  }

  size_t size() const {
    return filled ? capacity : idx;
  }

  bool can_sample(size_t batch) const {
    return size() >= batch;
  }

  std::vector<Transition> sample(size_t batch, std::mt19937& rng) const {
    std::uniform_int_distribution<size_t> dist(0, size() - 1);
    std::vector<Transition> out;
    out.reserve(batch);
    for (size_t i = 0; i < batch; i++)
      out.push_back(data[dist(rng)]);
    return out;
  }
};

// ---------- Q Network ----------
struct QNetImpl : torch::nn::Module {
  torch::nn::Linear l1{nullptr}, l2{nullptr}, l3{nullptr};

  QNetImpl(int stateDim, int actionDim) {
    l1 = register_module("l1", torch::nn::Linear(stateDim, 64));
    l2 = register_module("l2", torch::nn::Linear(64, 64));
    l3 = register_module("l3", torch::nn::Linear(64, actionDim));
  }

  torch::Tensor forward(torch::Tensor x) {
    x = torch::relu(l1(x));
    x = torch::relu(l2(x));
    x = l3(x);
    return x;
  }
};
TORCH_MODULE(QNet);

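// Copy the live network's weights into the target network (a "hard" update).
// Only parameters are copied, which is enough here because Linear layers have no buffers.
static void hard_update(QNet& dst, QNet& src)
{
  torch::NoGradGuard ng;
  auto sp = src->parameters();
  auto dp = dst->parameters();
  for (size_t i = 0; i < sp.size(); i++)
    dp[i].copy_(sp[i]);
}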

struct Agent {
  QNet q;
  QNet target;
  torch::optim::Adam opt;
  ReplayBuffer rb;

  float gamma = 0.99f;
  float eps = 1.0f;
  float epsMin = 0.05f;
  float epsDecay = 0.995f;

  Agent(int stateDim, int actionDim, float lr, size_t replayCap)
    : q(QNet(stateDim, actionDim)),
      target(QNet(stateDim, actionDim)),
      opt(q->parameters(), torch::optim::AdamOptions(lr)),
      rb(replayCap)
  {
    hard_update(target, q);
  }

  int act(const torch::Tensor& s, std::mt19937& rng) {
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    if (u(rng) < eps) {
      std::uniform_int_distribution<int> aDist(0, 1);
      return aDist(rng);
    }
    torch::NoGradGuard ng;
    auto qvals = q->forward(s);
    return qvals.argmax().item<int>();
  }

  void decay_epsilon() {
    eps *= epsDecay;
    if (eps < epsMin) eps = epsMin;
  }

  void train_step(size_t batchSize, std::mt19937& rng) {
    if (!rb.can_sample(batchSize)) return;

    auto batch = rb.sample(batchSize, rng);

    std::vector<torch::Tensor> ss, nss;
    std::vector<int64_t> aa;
    std::vector<float> rr, dd;

    for (const auto& t : batch) {
      ss.push_back(t.s);
      nss.push_back(t.ns);
      aa.push_back((int64_t)t.a);
      rr.push_back(t.r);
      dd.push_back(t.done ? 1.0f : 0.0f);
    }

    auto S  = torch::stack(ss);
    auto NS = torch::stack(nss);
    auto A  = torch::tensor(aa, torch::kInt64);
    auto R  = torch::tensor(rr, torch::kFloat32);
    auto D  = torch::tensor(dd, torch::kFloat32);

    auto qvals = q->forward(S);
    auto q_sa  = qvals.gather(1, A.unsqueeze(1)).squeeze(1);

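    // Bootstrapped target: y = r + gamma * max_a' Q_target(s', a') * (1 - done).
    // NoGradGuard keeps the target network out of the autograd graph.
    torch::Tensor next_q;
    {
      torch::NoGradGuard ng;
      next_q = std::get<0>(target->forward(NS).max(1));
    }
    auto y = R + gamma * next_q * (1.0f - D);

    // Mean squared error between the predicted Q(s, a) and the target y.
    auto loss = torch::mse_loss(q_sa, y);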

    opt.zero_grad();
    loss.backward();
    opt.step();
  }

  void update_target() { hard_update(target, q); }
};

// ============================================================
//  Zorro calls this exported main() once (no run() used)
// ============================================================
extern "C" DLLFUNC int main()
{
  // (Optional: keep thread limits even without DLL path management)
  // _putenv_s("OMP_NUM_THREADS", "1");
  // _putenv_s("MKL_NUM_THREADS", "1");
  // _putenv_s("KMP_DUPLICATE_LIB_OK", "TRUE");
  // torch::set_num_threads(1);
  // torch::set_num_interop_threads(1);

  // Make printing work in Zorro-hosted DLLs
  setvbuf(stdout, nullptr, _IONBF, 0);

  FILE* f = fopen("Log\\mt6409_torch.txt", "a");
  if (f) { fprintf(f, "Started MT6409\n"); fflush(f); }

  try {
    printf("MT6409 Torch demo starting (main)...\n");

    torch::manual_seed(0);

    const int stateDim = 9;
    const int actionDim = 2;

    LineWorld env(stateDim, 30);
    std::mt19937 rng(123);

    Agent agent1(stateDim, actionDim, 1e-3f, 5000);
    Agent agent2(stateDim, actionDim, 1e-3f, 5000);

    const int episodes = 200;
    const int targetUpdateEvery = 20;
    const size_t batchSize = 64;

    float avg1 = 0.0f, avg2 = 0.0f;

    for (int ep = 1; ep <= episodes; ep++) {

      // Agent 1
      {
        auto s = env.reset();
        bool done = false;
        float total = 0.0f;

        while (!done) {
          int a = agent1.act(s, rng);
          auto res = env.step(a);
          agent1.rb.push({s, a, res.reward, res.next_state, res.done});
          agent1.train_step(batchSize, rng);

          total += res.reward;
          s = res.next_state;
          done = res.done;
        }

        agent1.decay_epsilon();
        avg1 = 0.95f * avg1 + 0.05f * total;
      }

      // Agent 2
      {
        auto s = env.reset();
        bool done = false;
        float total = 0.0f;

        while (!done) {
          int a = agent2.act(s, rng);
          auto res = env.step(a);
          agent2.rb.push({s, a, res.reward, res.next_state, res.done});
          agent2.train_step(batchSize, rng);

          total += res.reward;
          s = res.next_state;
          done = res.done;
        }

        agent2.decay_epsilon();
        avg2 = 0.95f * avg2 + 0.05f * total;
      }

      if (ep % targetUpdateEvery == 0) {
        agent1.update_target();
        agent2.update_target();
      }

      if (ep % 25 == 0) {
        printf("Episode %d | Agent1 avgR=%.4f eps=%.4f | Agent2 avgR=%.4f eps=%.4f\n",
               ep, avg1, agent1.eps, avg2, agent2.eps);

        if (f) {
          fprintf(f, "Episode %d | Agent1 avgR=%.4f eps=%.4f | Agent2 avgR=%.4f eps=%.4f\n",
                  ep, avg1, agent1.eps, avg2, agent2.eps);
          fflush(f);
        }
      }
    }

    printf("Done.\n");
    if (f) { fprintf(f, "Done MT6409\n"); fclose(f); }

    return 0;
  }
  catch (const c10::Error& e) {
    printf("TORCH c10::Error: %s\n", e.what());
  }
  catch (const std::exception& e) {
    printf("std::exception: %s\n", e.what());
  }
  catch (...) {
    printf("Unknown exception in main().\n");
  }

  if (f) fclose(f);
  return 1;
}

Last edited by TipmyPip; 4 hours ago.

ZorroGPT - https://bit.ly/3Gbsm4S
