CrowdAverse Nexus Engine v13 (RL) [Re: TipmyPip] #489294
4 hours ago

Joined: Sep 2017
Posts: 280

TipmyPip

OP
Member

This code is a Zorro strategy DLL that tries to pick a small set of currency pairs that are “least crowded” and most structurally independent, while continuously adapting how aggressively it selects through an internal learning controller. The strategy runs on an hourly bar schedule and treats the market as a network: each asset is a node, and the links between nodes represent similarity, either because the assets move together or because they share currency exposure. The goal is to score an asset as attractive when it is internally “coherent” but not overly synchronized with the rest of the crowd, and then to select a diversified Top K list. A major theme is performance and robustness: the heavy similarity computation can be accelerated with OpenCL if it is available, but the code always keeps a complete CPU fallback, so it never depends on a GPU being present.

At startup, the strategy defines a fixed universe of 28 FX pairs and the eight currencies used for exposure logic. It also defines a large collection of feature, learning, and clustering toggles. A custom “slab allocator”, a thin wrapper that allocates one zero-filled block per buffer up front, keeps memory allocation predictable and out of the per-bar loop. A “feature buffer” is implemented as a structure-of-arrays ring buffer: for each feature and each asset it stores a rolling window of feature values. That is the raw time series substrate from which correlations are built. Every bar, the code computes nine features per asset: short return, medium return, volatility, a z-score-style deviation from an older price, a range-style displacement, a flow proxy that mixes return and volatility, a crude regime flag derived from volatility, a volatility-of-volatility proxy, and a persistence proxy based on absolute return. These features are pushed into the rolling buffer, so the strategy always has a recent window of history over which assets can be compared consistently.
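To make the buffer layout concrete, here is a minimal sketch of a structure-of-arrays ring buffer. It is not the listing's `FeatureBufferSoA`: the type `FeatureRing` and its `set`/`advance` split are hypothetical names chosen for illustration, and the sketch assumes the write cursor should move exactly once per bar, after every feature of every asset has been stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative SoA ring buffer (hypothetical type, not from the listing).
// Each (feature, asset) series owns one contiguous window of slots.
struct FeatureRing {
  int nFeat, nAssets, window;
  int cursor = 0;                 // slot the current bar writes to
  std::vector<double> data;

  FeatureRing(int f, int a, int w)
  : nFeat(f), nAssets(a), window(w), data((std::size_t)f * a * w, 0.0) {
    assert(f > 0 && a > 0 && w > 0);
  }

  std::size_t offset(int feat, int asset, int slot) const {
    return ((std::size_t)feat * nAssets + asset) * window + slot;
  }

  // Store one value for the current bar; the cursor does not move here.
  void set(int feat, int asset, double v) { data[offset(feat, asset, cursor)] = v; }

  // Call once per bar, after all features of all assets have been set.
  void advance() { cursor = (cursor + 1) % window; }

  // age = 0 is the most recently completed bar, age = 1 the bar before it.
  double get(int feat, int asset, int age) const {
    int slot = ((cursor - 1 - age) % window + window) % window;
    return data[offset(feat, asset, slot)];
  }
};
```

Because the cursor is shared by all series, advancing it once per bar keeps `get(f, a, age)` pointing at the same bar for every feature and asset.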

Every few bars, controlled by the update interval, the engine performs the expensive network step. First it builds a correlation matrix across assets. Conceptually, for each pair of assets and for each feature, it compares the two feature histories over the rolling window, measures how aligned they are, and averages that alignment across all features. This produces a single similarity value per asset pair that reflects multi-feature co-movement rather than just price correlation. On the CPU this means nested loops over asset pairs, features, and time, which is heavy. The OpenCL path exists specifically to accelerate this step. The OpenCL backend is written in a defensive style: it loads OpenCL dynamically at runtime, queries a platform and a device, creates a context and queue, compiles a tiny kernel at runtime, and allocates two buffers, one for the packed feature window and one for the output correlations. If any step fails, it stays in CPU mode and, in most cases, prints the reason. The kernel is launched over an assets-by-assets grid; only the work items with a < b do any work, so each unordered asset pair is computed once and written to the upper triangle of the output matrix. The GPU uses floats for speed, and the host copies the result into the strategy’s correlation matrix. If the kernel call fails at runtime, OpenCL is disabled and the CPU path takes over.
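As a rough CPU-side illustration of the feature-averaged correlation described above, the sketch below mirrors the arithmetic of the OpenCL kernel in plain C++. The function names `pearson` and `featureAveragedCorr` and the nested-vector layout are assumptions made for the example; the strategy itself reads from its ring buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Series = std::vector<double>;

// Pearson correlation of two equally long series; returns 0 if either is constant.
double pearson(const Series& x, const Series& y) {
  assert(x.size() == y.size() && !x.empty());
  const double n = (double)x.size();
  double mx = 0, my = 0;
  for(std::size_t t = 0; t < x.size(); t++) { mx += x[t]; my += y[t]; }
  mx /= n; my /= n;
  double sxx = 0, syy = 0, sxy = 0;
  for(std::size_t t = 0; t < x.size(); t++) {
    const double dx = x[t] - mx, dy = y[t] - my;
    sxx += dx * dx; syy += dy * dy; sxy += dx * dy;
  }
  const double den = std::sqrt(sxx * syy);
  return (den > 1e-12) ? sxy / den : 0.0;
}

// feat[f][a] holds the rolling window of feature f for asset a.
// Returns a symmetric matrix: the mean over features of the pairwise correlation.
std::vector<Series> featureAveragedCorr(const std::vector<std::vector<Series>>& feat) {
  assert(!feat.empty());
  const std::size_t nFeat = feat.size(), nAssets = feat[0].size();
  std::vector<Series> corr(nAssets, Series(nAssets, 0.0));
  for(std::size_t a = 0; a < nAssets; a++) {
    corr[a][a] = 1.0;
    for(std::size_t b = a + 1; b < nAssets; b++) {   // upper triangle only
      double acc = 0;
      for(std::size_t f = 0; f < nFeat; f++) acc += pearson(feat[f][a], feat[f][b]);
      corr[a][b] = corr[b][a] = acc / (double)nFeat; // mirror to lower triangle
    }
  }
  return corr;
}
```

With 28 assets, 9 features and a 200-bar window this is 378 pairs times 9 features times two passes over 200 values per update, which is why the step is a natural candidate for offloading.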

Once correlations exist, the strategy converts similarity into distance. The distance matrix is built from two ingredients. The first is correlation distance, which grows as absolute correlation falls, so highly synchronized pairs end up close together and weakly related pairs far apart. The second is exposure distance from the exposure table, intended to represent how different the currency exposure is between two pairs. The code blends these two distances with a configurable meta-mixing weight. This is important because in FX, pairs can be correlated simply because they share USD or JPY exposure, so an exposure-aware distance discourages choosing many pairs that are effectively the same trade. The resulting distance matrix is then treated as a weighted graph.
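A small sketch of how such a blend could look. The exact formulas are not shown in this part of the listing, so everything here is an assumption for illustration: correlation distance is taken as 1 - |corr|, exposure distance as 1 minus half the number of shared currencies, and `lambda` plays the role of the mixing weight (`LAMBDA_META` in the listing). The helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Number of currencies two 6-letter pair symbols have in common (0, 1 or 2).
int sharedCurrencies(const std::string& p, const std::string& q) {
  assert(p.size() == 6 && q.size() == 6);
  const std::string pBase = p.substr(0, 3), pQuote = p.substr(3, 3);
  const std::string qBase = q.substr(0, 3), qQuote = q.substr(3, 3);
  int n = 0;
  if(pBase == qBase || pBase == qQuote) n++;
  if(pQuote == qBase || pQuote == qQuote) n++;
  return n;
}

// Assumed exposure distance: 1.0 for no shared currency, 0.5 for one, 0.0 for two.
double exposureDistance(const std::string& p, const std::string& q) {
  return 1.0 - 0.5 * sharedCurrencies(p, q);
}

// Assumed blend: lambda = 0 uses correlation only, lambda = 1 exposure only.
double blendedDistance(double corr, double expDist, double lambda) {
  const double corrDist = 1.0 - std::fabs(corr);
  return (1.0 - lambda) * corrDist + lambda * expDist;
}
```

For example, EURUSD and USDJPY share USD, so under these assumptions their exposure distance is 0.5 even if their recent correlation happens to be low.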

To derive per-asset “compactness,” the code runs a shortest-path algorithm over this graph. It copies the distance matrix into a local square array and performs a Floyd-Warshall-style relaxation to compute all-pairs shortest paths. That converts direct distances into effective distances that also consider indirect paths through other assets. After that, each asset’s compactness is computed from the sum of its distances to the others. Assets with a small total distance to the rest are considered “central” or tightly connected in the network, while those with a large total distance are more isolated. This compactness becomes one part of the scoring. In parallel, the strategy computes an entropy-like measure per asset from the variance of its short-return feature over the window. That entropy is a crude proxy for unpredictability or dispersion in that asset’s recent behavior.
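The relaxation and the compactness step can be sketched as follows. Floyd-Warshall itself is standard; the mapping from total distance to compactness, 1 / (1 + sum), is an assumption chosen so that central assets score higher, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Floyd-Warshall: replaces d[i][j] with the length of the shortest path i -> j.
void allPairsShortestPaths(Matrix& d) {
  const std::size_t n = d.size();
  for(std::size_t i = 0; i < n; i++) assert(d[i].size() == n);
  for(std::size_t k = 0; k < n; k++)
    for(std::size_t i = 0; i < n; i++)
      for(std::size_t j = 0; j < n; j++)
        if(d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];
}

// Assumed compactness: 1 / (1 + sum of shortest-path distances to the others).
std::vector<double> compactnessFromPaths(const Matrix& d) {
  const std::size_t n = d.size();
  std::vector<double> c(n, 0.0);
  for(std::size_t i = 0; i < n; i++) {
    double sum = 0;
    for(std::size_t j = 0; j < n; j++) if(j != i) sum += d[i][j];
    c[i] = 1.0 / (1.0 + sum);
  }
  return c;
}
```

In a three-node graph with direct distances 1, 1 and 5, the long edge is replaced by the two-hop path of length 2, and the middle node ends up with the highest compactness.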

The scoring step implements the “crowd-averse” idea directly. For each asset, it computes a coupling term: the average compactness of the other assets it is connected to. If an asset lives in a region of the network where everything else is very compact and tightly linked, it is penalized as being “crowded.” The raw score combines three forces: it rewards higher compactness of the asset itself, rewards higher entropy as a potential source of opportunity, and penalizes high coupling to the crowd. This raw score is then squashed into the range zero to one so it can be compared and scaled cleanly. At this stage, the engine has a score for every asset at the update point.
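A compact sketch of that scoring rule, under stated assumptions: coupling is the plain average compactness of all other assets, the raw score is a weighted sum with a negative sign on coupling, and the squashing function is a logistic sigmoid. The struct `ScoreWeights` and the function name are hypothetical; how the listing's `ALPHA`, `BETA` and `GAMMA` map onto the three terms is not visible in this part of the code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct ScoreWeights { double compact, entropy, coupling; };

// score_i = sigmoid( wC * C_i + wH * H_i - wK * mean_{j != i} C_j )
std::vector<double> crowdAverseScores(const std::vector<double>& compactness,
                                      const std::vector<double>& entropy,
                                      const ScoreWeights& w) {
  assert(compactness.size() == entropy.size() && compactness.size() > 1);
  const std::size_t n = compactness.size();
  double total = 0;
  for(double c : compactness) total += c;

  std::vector<double> score(n);
  for(std::size_t i = 0; i < n; i++) {
    // Coupling: average compactness of everyone except asset i.
    const double coupling = (total - compactness[i]) / (double)(n - 1);
    const double raw = w.compact * compactness[i]
                     + w.entropy * entropy[i]
                     - w.coupling * coupling;
    score[i] = 1.0 / (1.0 + std::exp(-raw));  // squash into (0, 1)
  }
  return score;
}
```

With equal compactness everywhere and equal weights, the compactness reward and the coupling penalty cancel, so only the entropy term separates the assets.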

On top of these network-derived scores, the strategy layers a large “controller” that tries to adapt how aggressive the selection should be and how to scale scores depending on the inferred regime. The controller begins with a simple unsupervised clustering over three snapshot statistics: mean score across assets, mean compactness across assets, and mean volatility. It maintains three centroids and assigns the current snapshot to the closest centroid, producing a coarse regime label and a confidence estimate. Then it optionally builds a lightweight PCA-like representation over a sliding window of these snapshots. It normalizes the snapshot features, derives three latent coordinates from fixed linear combinations, and measures the share of the first latent coordinate in the total (dominance) and how fast that share changes between updates (rotation). Those two signals are treated as “stability versus transition” indicators and are used to adjust aggressiveness.
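For the nearest-centroid step, the listing's `UnsupervisedModel` computes its confidence as 1 / (1 + sqrt(gap)), which shrinks as the gap between the best and second-best squared distance grows. The sketch below shows a margin-based alternative that moves the other way; it is an illustrative variant, not the author's formula, and the names `assignRegime`, `RegimeResult` and `updateCentroid` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

constexpr int kRegimes = 3, kDims = 3;
using Point = std::array<double, kDims>;

struct RegimeResult { int regime; double confidence; };

// Assigns x to the nearest centroid. Confidence is the relative margin between
// the nearest and second-nearest centroid: 0 on a tie, 1 when x sits on a centroid.
RegimeResult assignRegime(const std::array<Point, kRegimes>& centroids, const Point& x) {
  int best = 0;
  double bestD = std::numeric_limits<double>::infinity(), secondD = bestD;
  for(int k = 0; k < kRegimes; k++) {
    double sq = 0;
    for(int d = 0; d < kDims; d++) {
      const double diff = x[d] - centroids[k][d];
      sq += diff * diff;
    }
    const double dist = std::sqrt(sq);
    if(dist < bestD) { secondD = bestD; bestD = dist; best = k; }
    else if(dist < secondD) { secondD = dist; }
  }
  const double conf = (secondD > 0) ? 1.0 - bestD / secondD : 0.0;
  return {best, conf};
}

// Running-mean update of the winning centroid (same rule as the listing).
void updateCentroid(Point& c, int& count, const Point& x) {
  count++;
  const double lr = 1.0 / (double)count;
  for(int d = 0; d < kDims; d++) c[d] += lr * (x[d] - c[d]);
}
```

The design choice here is only about the direction of the confidence signal: a snapshot halfway between two centroids gets confidence 0 instead of the maximum.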

The controller can also use probabilistic regime models. A Gaussian mixture model is implemented with diagonal variances and incremental updating. It produces regime probabilities, an entropy measure over those probabilities, and a confidence. A hidden-Markov-style filter is also implemented, including a transition matrix and online smoothing of emission parameters; it produces posterior probabilities, entropy, and an implied switch probability. When uncertainty is high or switching seems likely, the controller enters a cooldown in which it becomes more conservative. In addition, a k-means model runs with an online update and tracks stability based on how surprising the current distance to the centroid is compared with its own recent distribution. When stability is low, risk is reduced further.
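The filtering idea can be sketched with one step of a standard HMM forward filter. The HMM code itself is outside this part of the listing, so the helper names below are hypothetical, and the switch probability is an assumed definition: one minus the posterior-weighted probability of staying in the current state.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int kStates = 3;
using Vec = std::array<double, kStates>;
using Mat = std::array<Vec, kStates>;

// One forward-filter step: predict with the transition matrix A (row = from,
// column = to), weight by the emission likelihood of the new observation,
// then normalize. Falls back to uniform if all likelihoods are zero.
Vec hmmFilterStep(const Vec& prevPosterior, const Mat& A, const Vec& likelihood) {
  Vec post{};
  double sum = 0;
  for(int k = 0; k < kStates; k++) {
    double predicted = 0;
    for(int j = 0; j < kStates; j++) predicted += prevPosterior[j] * A[j][k];
    post[k] = predicted * likelihood[k];
    sum += post[k];
  }
  for(int k = 0; k < kStates; k++)
    post[k] = (sum > 0) ? post[k] / sum : 1.0 / kStates;
  return post;
}

// Shannon entropy divided by log(K): 0 = certain, 1 = uniform.
double normalizedEntropy(const Vec& p) {
  double h = 0;
  for(double v : p) if(v > 0) h -= v * std::log(v);
  return h / std::log((double)kStates);
}

// Assumed switch probability: 1 - sum_k posterior[k] * A[k][k].
double switchProbability(const Vec& posterior, const Mat& A) {
  double stay = 0;
  for(int k = 0; k < kStates; k++) stay += posterior[k] * A[k][k];
  return 1.0 - stay;
}
```

A controller could compare the two outputs against thresholds such as the listing's `HMM_ENTROPY_TH` and `HMM_SWITCH_TH` to decide when to start a cooldown.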

The controller translates these regime estimates into four adaptive multipliers that act as tuning knobs for how the base scoring philosophy should behave. It uses preset profiles per regime and blends them using the regime probabilities. It then derives a global risk scale from regime uncertainty, so that high uncertainty means smaller risk. A tiny reinforcement learner sits on top. It tracks four discrete actions that correspond to different selection intensities, takes the change in mean score since the last update as the reward, and keeps a running average reward per action, with a forced exploration step every tenth update. This adds a slow meta-adjustment layer that can prefer actions that historically improved outcomes.
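The probability-weighted blend and the uncertainty-based risk scale could be sketched like this. The profile values, the linear risk formula and the function names are assumptions for illustration; the listing defines constants such as `GMM_ENTROPY_COEFF` and `GMM_MIN_RISK`, but how they are applied is not visible in this part of the code.

```cpp
#include <array>
#include <cassert>

constexpr int kRegimes = 3, kKnobs = 4;
using Knobs = std::array<double, kKnobs>;

// Blend per-regime multiplier profiles using the regime probabilities.
Knobs blendProfiles(const std::array<Knobs, kRegimes>& profiles,
                    const std::array<double, kRegimes>& prob) {
  Knobs out{};
  for(int k = 0; k < kRegimes; k++)
    for(int m = 0; m < kKnobs; m++)
      out[m] += prob[k] * profiles[k][m];
  return out;
}

// Assumed risk scale: shrinks linearly with normalized entropy (0..1),
// but never drops below minRisk.
double riskScale(double normEntropy, double coeff, double minRisk) {
  assert(normEntropy >= 0.0 && normEntropy <= 1.0);
  const double r = 1.0 - coeff * normEntropy;
  return (r < minRisk) ? minRisk : r;
}
```

When one regime has probability 1, the blend returns that regime's profile unchanged; when probabilities are spread out, the knobs move toward an average and the risk scale drops at the same time.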

A novelty detector adds another safety valve. An autoencoder-like model takes a compact state vector that summarizes the system at each update, including mean score, mean compactness, mean volatility, controller scaling, current Top K, progress through the run, and whether OpenCL is active. It reconstructs the input through a small latent mapping and measures the reconstruction error. If the reconstruction error jumps above its thresholds, the novelty controller interprets that as an unusual condition and forces risk down, reducing the number of selections and scaling scores lower. This makes the strategy more cautious precisely when the internal system state deviates from what it has recently considered normal.
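The thresholding part of such a detector can be sketched independently of the autoencoder. The sketch below assumes the reconstruction error is tracked with exponential moving estimates of mean and variance, converted to a z-score, and mapped to a risk multiplier between two thresholds (the listing's `AE_Z_LOW` and `AE_Z_HIGH` suggest a scheme of this kind). `ErrorTracker`, `riskFromZ` and the floor `minRisk` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Tracks an exponential moving mean/variance of the reconstruction error.
struct ErrorTracker {
  double alpha;
  double mean = 0, var = 0;
  bool seeded = false;

  explicit ErrorTracker(double a) : alpha(a) { assert(a > 0 && a < 1); }

  // Returns the z-score of err against the statistics seen so far,
  // then folds err into those statistics.
  double update(double err) {
    if(!seeded) { mean = err; var = 0; seeded = true; return 0.0; }
    const double z = (err - mean) / std::sqrt(var + 1e-12);
    const double d = err - mean;
    mean += alpha * d;
    var = (1.0 - alpha) * (var + alpha * d * d);
    return z;
  }
};

// Risk multiplier: 1 below zLow, minRisk above zHigh, linear in between.
double riskFromZ(double z, double zLow, double zHigh, double minRisk) {
  assert(zHigh > zLow);
  if(z <= zLow) return 1.0;
  if(z >= zHigh) return minRisk;
  const double t = (z - zLow) / (zHigh - zLow);
  return 1.0 - t * (1.0 - minRisk);
}
```

Computing the z-score before updating the statistics matters: otherwise a sudden jump would inflate the variance first and partly hide itself.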

There are also clustering and diversification helpers that influence how Top K is assembled rather than how each asset is scored. A hierarchical clustering model can cut the asset universe into coarse and fine clusters based on distances. A community detection model builds a graph from correlation magnitudes, optionally prunes it to the top links for determinism, runs a label-propagation-style update, and computes a modularity-like value. That modularity is smoothed and used as another “market structure strength” signal: if community structure is weak, Top K is reduced; if community structure is strong, Top K may be increased, because diversification across strong communities is easier. The final selection step sorts assets by score, then tries to pick from different communities first, using coarse and fine caps so the chosen set is diversified rather than drawn entirely from the same cluster.
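A simplified version of that selection step, assuming a single cluster label per asset and one cap per cluster (the strategy uses separate coarse and fine caps). The function name and the two-pass fill rule are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Pick up to topK assets by descending score, taking at most capPerCluster
// from any cluster in the first pass. If that leaves fewer than topK picks,
// a second pass fills the remaining slots with the best unpicked assets.
std::vector<int> selectDiversified(const std::vector<double>& score,
                                   const std::vector<int>& cluster,
                                   int topK, int capPerCluster) {
  assert(score.size() == cluster.size());
  std::vector<int> order(score.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
                   [&](int a, int b) { return score[a] > score[b]; });

  std::vector<int> picked;
  std::vector<char> used(score.size(), 0);
  std::map<int, int> perCluster;

  for(int idx : order) {                       // pass 1: respect the cap
    if((int)picked.size() >= topK) break;
    if(perCluster[cluster[idx]] >= capPerCluster) continue;
    perCluster[cluster[idx]]++;
    used[idx] = 1;
    picked.push_back(idx);
  }
  for(int idx : order) {                       // pass 2: fill leftovers
    if((int)picked.size() >= topK) break;
    if(!used[idx]) { used[idx] = 1; picked.push_back(idx); }
  }
  return picked;
}
```

With scores 0.9, 0.8, 0.7, 0.6, 0.5 and clusters 0, 0, 0, 1, 2, a cap of one per cluster selects assets 0, 3 and 4 rather than the three highest scores, which all sit in cluster 0.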

Operationally, the run function integrates with Zorro’s lifecycle. On init it sets the bar period and lookback, creates the strategy object, and initializes all buffers and models. On exit it shuts everything down cleanly and frees resources. During the run, once enough bars exist, every bar computes features, and every update interval computes correlations, distances, shortest paths, and scores, learns from the snapshot, applies novelty and regime scaling, and periodically prints the current Top K list. The overall behavior is therefore a looping pipeline: feature extraction produces a multidimensional history, correlation converts history into a similarity network, distance and shortest paths convert the network into structure measures, scoring transforms structure into “crowd-averse attractiveness,” and the learning controller adjusts how selective and risk-averse the engine should be as conditions change.


Code

// TGr06B_CrowdAverse_v13.cpp - Zorro64 Strategy DLL
// Strategy B v13: Crowd-Averse with MX06 OOP + OpenCL + Learning Controller
//
// Notes:
// - Keeps full CPU fallback.
// - OpenCL is optional: if OpenCL.dll missing / no device / kernel build fails -> CPU path.
// - OpenCL accelerates the heavy correlation matrix step by offloading pairwise correlations.
// - Correlation is computed in float on GPU; results are stored back into fvar corrMatrix.

#define _CRT_SECURE_NO_WARNINGS
#include <zorro.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <windows.h>
#include <stddef.h>

#define INF 1e30
#define EPS 1e-12
#define N_ASSETS 28
#define FEAT_N 9
#define FEAT_WINDOW 200
#define UPDATE_EVERY 5
#define TOP_K 5

#define ALPHA 0.1
#define BETA 0.3
#define GAMMA 2.5
#define LAMBDA_META 0.5

#define USE_ML 1
#define USE_UNSUP 1
#define USE_RL 1
#define USE_PCA 1
#define USE_GMM 1
#define USE_HMM 1
#define HMM_K 3
#define HMM_DIM 8
#define HMM_VAR_FLOOR 1e-4
#define HMM_SMOOTH 0.02
#define HMM_ENTROPY_TH 0.85
#define HMM_SWITCH_TH 0.35
#define HMM_MIN_RISK 0.25
#define HMM_COOLDOWN_UPDATES 2
#define HMM_ONLINE_UPDATE 1
#define USE_KMEANS 1
#define KMEANS_K 3
#define KMEANS_DIM 8
#define KMEANS_ETA 0.03
#define KMEANS_DIST_EMA 0.08
#define KMEANS_STABILITY_MIN 0.35
#define KMEANS_ONLINE_UPDATE 1
#define USE_SPECTRAL 1
#define SPECTRAL_K 4
#define USE_HCLUST 1
#define HCLUST_COARSE_K 4
#define HCLUST_FINE_K 8
#define USE_COMMUNITY 1
#define COMM_W_MIN 0.15
#define COMM_TOPM 6
#define COMM_ITERS 4
#define COMM_Q_EMA 0.20
#define COMM_Q_LOW 0.20
#define COMM_Q_HIGH 0.45
#define USE_AE 1
#define AE_INPUT_DIM 8
#define AE_LATENT_DIM 4
#define AE_NORM_ALPHA 0.02
#define AE_ERR_EMA 0.10
#define AE_Z_LOW 1.0
#define AE_Z_HIGH 2.0
#define USE_SOM 1
#define SOM_W 10
#define SOM_H 10
#define SOM_DIM 12
#define SOM_ALPHA_MAX 0.30
#define SOM_ALPHA_MIN 0.05
#define SOM_SIGMA_MAX 5.0
#define SOM_SIGMA_MIN 1.0
#define SOM_CONF_MIN 0.15
#define SOM_ONLINE_UPDATE 1
#define GMM_K 3
#define GMM_DIM 8
#define GMM_ALPHA 0.02
#define GMM_VAR_FLOOR 1e-4
#define GMM_ENTROPY_COEFF 0.45
#define GMM_MIN_RISK 0.25
#define GMM_ONLINE_UPDATE 1
#define STRATEGY_PROFILE 1
#define PCA_DIM 6
#define PCA_COMP 3
#define PCA_WINDOW 128
#define PCA_REBUILD_EVERY 4

#ifdef TIGHT_MEM
typedef float fvar;
#else
typedef double fvar;
#endif

static const char* ASSET_NAMES[] = {
  "EURUSD","GBPUSD","USDCHF","USDJPY","AUDUSD","AUDCAD","AUDCHF","AUDJPY","AUDNZD",
  "CADJPY","CADCHF","EURAUD","EURCAD","EURCHF","EURGBP","EURJPY","EURNZD","GBPAUD",
  "GBPCAD","GBPCHF","GBPJPY","GBPNZD","NZDCAD","NZDCHF","NZDJPY","NZDUSD","USDCAD",
  "CHFJPY" // completes the 28 pairs over the 8 currencies, so indices 0..N_ASSETS-1 are all valid
};
static const char* CURRENCIES[] = {"EUR","GBP","USD","CHF","JPY","AUD","CAD","NZD"};
#define N_CURRENCIES 8

// ---------------------------- Exposure Table ----------------------------

struct ExposureTable {
  int exposure[N_ASSETS][N_CURRENCIES];
  double exposureDist[N_ASSETS][N_ASSETS];

  void init() {
    for(int i=0;i<N_ASSETS;i++){
      for(int c=0;c<N_CURRENCIES;c++){
        exposure[i][c] = 0;
      }
    }
    for(int i=0;i<N_ASSETS;i++){
      for(int j=0;j<N_ASSETS;j++){
        exposureDist[i][j] = 0.0;
      }
    }
  }

  inline double getDist(int i,int j) const { return exposureDist[i][j]; }
};

// ---------------------------- Slab Allocator ----------------------------

template<typename T>
class SlabAllocator {
public:
  T* data;
  int capacity;

  SlabAllocator() : data(NULL), capacity(0) {}
  ~SlabAllocator() { shutdown(); }

  void init(int size) {
    shutdown();
    capacity = size;
    data = (T*)malloc((size_t)capacity * sizeof(T));
    if(data) memset(data, 0, (size_t)capacity * sizeof(T));
  }

  void shutdown() {
    if(data) free(data);
    data = NULL;
    capacity = 0;
  }

  T& operator[](int i) { return data[i]; }
  const T& operator[](int i) const { return data[i]; }
};

// ---------------------------- Feature Buffer (SoA ring) ----------------------------

struct FeatureBufferSoA {
  SlabAllocator<fvar> buffer;
  int windowSize;
  int currentIndex;

  void init(int assets, int window) {
    windowSize = window;
    currentIndex = 0;
    buffer.init(FEAT_N * assets * window);
  }

  void shutdown() { buffer.shutdown(); }

  inline int offset(int feat,int asset,int t) const {
    return (feat * N_ASSETS + asset) * windowSize + t;
  }

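  // NOTE: currentIndex is shared by all (feat, asset) series and advances on
  // every call. get() is only aligned across series if the cursor effectively
  // moves once per bar, so callers pushing several series per bar must account for that.
  void push(int feat,int asset,fvar value) {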
    buffer[offset(feat, asset, currentIndex)] = value;
    currentIndex = (currentIndex + 1) % windowSize;
  }

  // t=0 => most recent
  fvar get(int feat,int asset,int t) const {
    int idx = (currentIndex - 1 - t + windowSize) % windowSize;
    return buffer[offset(feat, asset, idx)];
  }
};

// ---------------------------- Minimal OpenCL (dynamic) ----------------------------

typedef struct _cl_platform_id*   cl_platform_id;
typedef struct _cl_device_id*     cl_device_id;
typedef struct _cl_context*       cl_context;
typedef struct _cl_command_queue* cl_command_queue;
typedef struct _cl_program*       cl_program;
typedef struct _cl_kernel*        cl_kernel;
typedef struct _cl_mem*           cl_mem;
typedef unsigned int              cl_uint;
typedef int                       cl_int;
typedef unsigned long long        cl_ulong;
typedef cl_uint                   cl_bool;   // OpenCL defines cl_bool as a 32-bit unsigned int

#define CL_SUCCESS 0
#define CL_DEVICE_TYPE_CPU (1ULL << 1)
#define CL_DEVICE_TYPE_GPU (1ULL << 2)
#define CL_MEM_READ_ONLY   (1ULL << 2)
#define CL_MEM_WRITE_ONLY  (1ULL << 1)
#define CL_MEM_READ_WRITE  (1ULL << 0)
#define CL_TRUE  1
#define CL_FALSE 0
#define CL_PROGRAM_BUILD_LOG 0x1183

class OpenCLBackend {
public:
  HMODULE hOpenCL;
  int ready;

  cl_platform_id platform;
  cl_device_id device;
  cl_context context;
  cl_command_queue queue;
  cl_program program;
  cl_kernel kCorr;

  cl_mem bufFeat;
  cl_mem bufCorr;

  int featBytes;
  int corrBytes;

  cl_int (*clGetPlatformIDs)(cl_uint, cl_platform_id*, cl_uint*);
  cl_int (*clGetDeviceIDs)(cl_platform_id, cl_ulong, cl_uint, cl_device_id*, cl_uint*);
  cl_context (*clCreateContext)(void*, cl_uint, const cl_device_id*, void*, void*, cl_int*);
  cl_command_queue (*clCreateCommandQueue)(cl_context, cl_device_id, cl_ulong, cl_int*);
  cl_program (*clCreateProgramWithSource)(cl_context, cl_uint, const char**, const size_t*, cl_int*);
  cl_int (*clBuildProgram)(cl_program, cl_uint, const cl_device_id*, const char*, void*, void*);
  cl_int (*clGetProgramBuildInfo)(cl_program, cl_device_id, cl_uint, size_t, void*, size_t*);
  cl_kernel (*clCreateKernel)(cl_program, const char*, cl_int*);
  cl_int (*clSetKernelArg)(cl_kernel, cl_uint, size_t, const void*);
  cl_mem (*clCreateBuffer)(cl_context, cl_ulong, size_t, void*, cl_int*);
  cl_int (*clEnqueueWriteBuffer)(cl_command_queue, cl_mem, cl_bool, size_t, size_t, const void*, cl_uint, const void*, void*);
  cl_int (*clEnqueueReadBuffer)(cl_command_queue, cl_mem, cl_bool, size_t, size_t, void*, cl_uint, const void*, void*);
  cl_int (*clEnqueueNDRangeKernel)(cl_command_queue, cl_kernel, cl_uint, const size_t*, const size_t*, const size_t*, cl_uint, const void*, void*);
  cl_int (*clFinish)(cl_command_queue);
  cl_int (*clReleaseMemObject)(cl_mem);
  cl_int (*clReleaseKernel)(cl_kernel);
  cl_int (*clReleaseProgram)(cl_program);
  cl_int (*clReleaseCommandQueue)(cl_command_queue);
  cl_int (*clReleaseContext)(cl_context);

  OpenCLBackend()
  : hOpenCL(NULL), ready(0),
    platform(NULL), device(NULL), context(NULL), queue(NULL), program(NULL), kCorr(NULL),
    bufFeat(NULL), bufCorr(NULL),
    featBytes(0), corrBytes(0),
    clGetPlatformIDs(NULL), clGetDeviceIDs(NULL), clCreateContext(NULL), clCreateCommandQueue(NULL),
    clCreateProgramWithSource(NULL), clBuildProgram(NULL), clGetProgramBuildInfo(NULL),
    clCreateKernel(NULL), clSetKernelArg(NULL),
    clCreateBuffer(NULL), clEnqueueWriteBuffer(NULL), clEnqueueReadBuffer(NULL),
    clEnqueueNDRangeKernel(NULL), clFinish(NULL),
    clReleaseMemObject(NULL), clReleaseKernel(NULL), clReleaseProgram(NULL),
    clReleaseCommandQueue(NULL), clReleaseContext(NULL)
  {}

  int loadSymbol(void** fp, const char* name) {
    *fp = (void*)GetProcAddress(hOpenCL, name);
    return (*fp != NULL);
  }

  const char* kernelSource() {
    return
      "__kernel void corr_pairwise(\n"
      "  __global const float* feat,\n"
      "  __global float* outCorr,\n"
      "  const int nAssets,\n"
      "  const int nFeat,\n"
      "  const int windowSize,\n"
      "  const float eps\n"
      "){\n"
      "  int a = (int)get_global_id(0);\n"
      "  int b = (int)get_global_id(1);\n"
      "  if(a >= nAssets || b >= nAssets) return;\n"
      "  if(a >= b) return;\n"
      "  float acc = 0.0f;\n"
      "  for(int f=0; f<nFeat; f++){\n"
      "    int baseA = (f*nAssets + a) * windowSize;\n"
      "    int baseB = (f*nAssets + b) * windowSize;\n"
      "    float mx = 0.0f;\n"
      "    float my = 0.0f;\n"
      "    for(int t=0; t<windowSize; t++){\n"
      "      mx += feat[baseA + t];\n"
      "      my += feat[baseB + t];\n"
      "    }\n"
      "    mx /= (float)windowSize;\n"
      "    my /= (float)windowSize;\n"
      "    float sxx = 0.0f;\n"
      "    float syy = 0.0f;\n"
      "    float sxy = 0.0f;\n"
      "    for(int t=0; t<windowSize; t++){\n"
      "      float dx = feat[baseA + t] - mx;\n"
      "      float dy = feat[baseB + t] - my;\n"
      "      sxx += dx*dx;\n"
      "      syy += dy*dy;\n"
      "      sxy += dx*dy;\n"
      "    }\n"
      "    float den = sqrt(sxx*syy + eps);\n"
      "    float corr = (den > eps) ? (sxy/den) : 0.0f;\n"
      "    acc += corr;\n"
      "  }\n"
      "  outCorr[a*nAssets + b] = acc / (float)nFeat;\n"
      "}\n";
  }

  void printBuildLog() {
    if(!clGetProgramBuildInfo || !program || !device) return;
    size_t logSize = 0;
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, 0, NULL, &logSize);
    if(logSize == 0) return;
    char* log = (char*)malloc(logSize + 1);
    if(!log) return;
    memset(log, 0, logSize + 1);
    clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, logSize, log, NULL);
    printf("OpenCL build log:\n%s\n", log);
    free(log);
  }

  void init() {
    ready = 0;

    hOpenCL = LoadLibraryA("OpenCL.dll");
    if(!hOpenCL) {
      printf("OpenCL: CPU (OpenCL.dll missing)\n");
      return;
    }

    if(!loadSymbol((void**)&clGetPlatformIDs,       "clGetPlatformIDs")) return;
    if(!loadSymbol((void**)&clGetDeviceIDs,         "clGetDeviceIDs")) return;
    if(!loadSymbol((void**)&clCreateContext,        "clCreateContext")) return;
    if(!loadSymbol((void**)&clCreateCommandQueue,   "clCreateCommandQueue")) return;
    if(!loadSymbol((void**)&clCreateProgramWithSource,"clCreateProgramWithSource")) return;
    if(!loadSymbol((void**)&clBuildProgram,         "clBuildProgram")) return;
    if(!loadSymbol((void**)&clGetProgramBuildInfo,  "clGetProgramBuildInfo")) return;
    if(!loadSymbol((void**)&clCreateKernel,         "clCreateKernel")) return;
    if(!loadSymbol((void**)&clSetKernelArg,         "clSetKernelArg")) return;
    if(!loadSymbol((void**)&clCreateBuffer,         "clCreateBuffer")) return;
    if(!loadSymbol((void**)&clEnqueueWriteBuffer,   "clEnqueueWriteBuffer")) return;
    if(!loadSymbol((void**)&clEnqueueReadBuffer,    "clEnqueueReadBuffer")) return;
    if(!loadSymbol((void**)&clEnqueueNDRangeKernel, "clEnqueueNDRangeKernel")) return;
    if(!loadSymbol((void**)&clFinish,               "clFinish")) return;
    if(!loadSymbol((void**)&clReleaseMemObject,     "clReleaseMemObject")) return;
    if(!loadSymbol((void**)&clReleaseKernel,        "clReleaseKernel")) return;
    if(!loadSymbol((void**)&clReleaseProgram,       "clReleaseProgram")) return;
    if(!loadSymbol((void**)&clReleaseCommandQueue,  "clReleaseCommandQueue")) return;
    if(!loadSymbol((void**)&clReleaseContext,       "clReleaseContext")) return;

    cl_uint nPlat = 0;
    if(clGetPlatformIDs(0, NULL, &nPlat) != CL_SUCCESS || nPlat == 0) {
      printf("OpenCL: CPU (no platform)\n");
      return;
    }
    clGetPlatformIDs(1, &platform, NULL);

    cl_uint nDev = 0;
    cl_int ok = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, &nDev);
    if(ok != CL_SUCCESS || nDev == 0) {
      ok = clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, &nDev);
      if(ok != CL_SUCCESS || nDev == 0) {
        printf("OpenCL: CPU (no device)\n");
        return;
      }
    }

    cl_int err = 0;
    context = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    if(err != CL_SUCCESS || !context) {
      printf("OpenCL: CPU (context fail)\n");
      return;
    }

    queue = clCreateCommandQueue(context, device, 0, &err);
    if(err != CL_SUCCESS || !queue) {
      printf("OpenCL: CPU (queue fail)\n");
      return;
    }

    const char* src = kernelSource();
    program = clCreateProgramWithSource(context, 1, &src, NULL, &err);
    if(err != CL_SUCCESS || !program) {
      printf("OpenCL: CPU (program fail)\n");
      return;
    }

    err = clBuildProgram(program, 1, &device, "", NULL, NULL);
    if(err != CL_SUCCESS) {
      printf("OpenCL: CPU (build fail)\n");
      printBuildLog();
      return;
    }

    kCorr = clCreateKernel(program, "corr_pairwise", &err);
    if(err != CL_SUCCESS || !kCorr) {
      printf("OpenCL: CPU (kernel fail)\n");
      printBuildLog();
      return;
    }

    featBytes = FEAT_N * N_ASSETS * FEAT_WINDOW * (int)sizeof(float);
    corrBytes = N_ASSETS * N_ASSETS * (int)sizeof(float);

    bufFeat = clCreateBuffer(context, CL_MEM_READ_ONLY, (size_t)featBytes, NULL, &err);
    if(err != CL_SUCCESS || !bufFeat) {
      printf("OpenCL: CPU (bufFeat fail)\n");
      return;
    }

    bufCorr = clCreateBuffer(context, CL_MEM_WRITE_ONLY, (size_t)corrBytes, NULL, &err);
    if(err != CL_SUCCESS || !bufCorr) {
      printf("OpenCL: CPU (bufCorr fail)\n");
      return;
    }

    ready = 1;
    printf("OpenCL: READY (kernel+buffers)\n");
  }

  void shutdown() {
    if(bufCorr) { clReleaseMemObject(bufCorr); bufCorr = NULL; }
    if(bufFeat) { clReleaseMemObject(bufFeat); bufFeat = NULL; }
    if(kCorr) { clReleaseKernel(kCorr); kCorr = NULL; }
    if(program) { clReleaseProgram(program); program = NULL; }
    if(queue) { clReleaseCommandQueue(queue); queue = NULL; }
    if(context) { clReleaseContext(context); context = NULL; }
    if(hOpenCL) { FreeLibrary(hOpenCL); hOpenCL = NULL; }
    ready = 0;
  }

  int computeCorrelationMatrixCL(const float* featLinear, float* outCorr, int nAssets, int nFeat, int windowSize) {
    if(!ready) return 0;
    if(!featLinear || !outCorr) return 0;

    cl_int err = clEnqueueWriteBuffer(queue, bufFeat, CL_TRUE, 0, (size_t)featBytes, featLinear, 0, NULL, NULL);
    if(err != CL_SUCCESS) return 0;

    float eps = 1e-12f;
    err = CL_SUCCESS;
    err |= clSetKernelArg(kCorr, 0, sizeof(cl_mem), &bufFeat);
    err |= clSetKernelArg(kCorr, 1, sizeof(cl_mem), &bufCorr);
    err |= clSetKernelArg(kCorr, 2, sizeof(int), &nAssets);
    err |= clSetKernelArg(kCorr, 3, sizeof(int), &nFeat);
    err |= clSetKernelArg(kCorr, 4, sizeof(int), &windowSize);
    err |= clSetKernelArg(kCorr, 5, sizeof(float), &eps);
    if(err != CL_SUCCESS) return 0;

    size_t global[2];
    global[0] = (size_t)nAssets;
    global[1] = (size_t)nAssets;

    err = clEnqueueNDRangeKernel(queue, kCorr, 2, NULL, global, NULL, 0, NULL, NULL);
    if(err != CL_SUCCESS) return 0;

    err = clFinish(queue);
    if(err != CL_SUCCESS) return 0;

    err = clEnqueueReadBuffer(queue, bufCorr, CL_TRUE, 0, (size_t)corrBytes, outCorr, 0, NULL, NULL);
    if(err != CL_SUCCESS) return 0;

    return 1;
  }
};

// ---------------------------- Learning Layer ----------------------------

struct LearningSnapshot {
  double meanScore;
  double meanCompactness;
  double meanVol;
  int regime;
  double regimeConfidence;
};

class UnsupervisedModel {
public:
  double centroids[3][3];
  int counts[3];
  int initialized;
  UnsupervisedModel() : initialized(0) { memset(centroids, 0, sizeof(centroids)); memset(counts, 0, sizeof(counts)); }
  void init() { initialized = 0; memset(centroids, 0, sizeof(centroids)); memset(counts, 0, sizeof(counts)); }
  void update(const LearningSnapshot& s, int* regimeOut, double* confOut) {
    double x0=s.meanScore,x1=s.meanCompactness,x2=s.meanVol;
    if(!initialized) {
      for(int k=0;k<3;k++){ centroids[k][0]=x0+0.01*(k-1); centroids[k][1]=x1+0.01*(1-k); centroids[k][2]=x2+0.005*(k-1); counts[k]=1; }
      initialized = 1;
    }
    int best=0; double bestDist=INF, secondDist=INF;
    for(int k=0;k<3;k++) {
      double d0=x0-centroids[k][0], d1=x1-centroids[k][1], d2=x2-centroids[k][2];
      double dist=d0*d0+d1*d1+d2*d2;
      if(dist < bestDist){ secondDist=bestDist; bestDist=dist; best=k; }
      else if(dist < secondDist){ secondDist=dist; }
    }
    counts[best]++;
    double lr = 1.0/(double)counts[best];
    centroids[best][0] += lr*(x0-centroids[best][0]);
    centroids[best][1] += lr*(x1-centroids[best][1]);
    centroids[best][2] += lr*(x2-centroids[best][2]);
    *regimeOut = best;
    *confOut = 1.0/(1.0 + sqrt(fabs(secondDist-bestDist)+EPS));
  }
};

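class RLAgent {
public:
  double q[4];           // running mean reward per action
  int n[4];              // number of rewards recorded per action
  int lastAction;
  double lastMeanScore;

  RLAgent() { init(); }

  void init() {
    lastAction = 0;
    lastMeanScore = 0;
    for(int i=0;i<4;i++){ q[i]=0; n[i]=0; }
  }

  // Every 10th update explores by cycling through all four actions;
  // otherwise returns the action with the highest value estimate.
  // (updateCount%4 on multiples of 10 can only yield 0 or 2.)
  int chooseAction(int updateCount) {
    if((updateCount % 10) == 0) return (updateCount / 10) % 4;
    int b = 0;
    for(int i=1;i<4;i++) if(q[i] > q[b]) b = i;
    return b;
  }

  // Reward is the change in mean score since the previous update;
  // q[lastAction] is updated as an incremental mean.
  void updateReward(double newMeanScore) {
    double r = newMeanScore - lastMeanScore;
    n[lastAction]++;
    q[lastAction] += (r - q[lastAction]) / (double)n[lastAction];
    lastMeanScore = newMeanScore;
  }
};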

class PCAModel {
public:
  double hist[PCA_WINDOW][PCA_DIM];
  double mean[PCA_DIM];
  double stdev[PCA_DIM];
  double latent[PCA_COMP];
  double explainedVar[PCA_COMP];
  int writeIdx;
  int count;
  int rebuildEvery;
  int updates;
  double dom;
  double rot;
  double prevExplained0;

  PCAModel() : writeIdx(0), count(0), rebuildEvery(PCA_REBUILD_EVERY), updates(0), dom(0), rot(0), prevExplained0(0) {
    memset(hist, 0, sizeof(hist));
    memset(mean, 0, sizeof(mean));
    memset(stdev, 0, sizeof(stdev));
    memset(latent, 0, sizeof(latent));
    memset(explainedVar, 0, sizeof(explainedVar));
  }

  void init() {
    writeIdx = 0;
    count = 0;
    updates = 0;
    dom = 0;
    rot = 0;
    prevExplained0 = 0;
    memset(hist, 0, sizeof(hist));
    memset(mean, 0, sizeof(mean));
    memset(stdev, 0, sizeof(stdev));
    memset(latent, 0, sizeof(latent));
    memset(explainedVar, 0, sizeof(explainedVar));
  }

  void pushSnapshot(const double x[PCA_DIM]) {
    for(int d=0; d<PCA_DIM; d++) hist[writeIdx][d] = x[d];
    writeIdx = (writeIdx + 1) % PCA_WINDOW;
    if(count < PCA_WINDOW) count++;
  }

  void rebuildStats() {
    if(count <= 0) return;
    for(int d=0; d<PCA_DIM; d++) {
      double m = 0;
      for(int i=0; i<count; i++) m += hist[i][d];
      m /= (double)count;
      mean[d] = m;

      double v = 0;
      for(int i=0; i<count; i++) {
        double dd = hist[i][d] - m;
        v += dd * dd;
      }
      v /= (double)count;
      stdev[d] = sqrt(v + EPS);
    }
  }

  void update(const LearningSnapshot& snap, int regime, double conf) {
    double x[PCA_DIM];
    x[0] = snap.meanScore;
    x[1] = snap.meanCompactness;
    x[2] = snap.meanVol;
    x[3] = (double)regime / 2.0;
    x[4] = conf;
    x[5] = snap.meanScore - snap.meanCompactness;

    pushSnapshot(x);
    updates++;
    if((updates % rebuildEvery) == 0 || count < 4) rebuildStats();

    double z[PCA_DIM];
    for(int d=0; d<PCA_DIM; d++) z[d] = (x[d] - mean[d]) / (stdev[d] + EPS);

    latent[0] = 0.60*z[0] + 0.30*z[1] + 0.10*z[2];
    latent[1] = 0.25*z[0] - 0.45*z[1] + 0.20*z[2] + 0.10*z[4];
    latent[2] = 0.20*z[2] + 0.50*z[3] - 0.30*z[5];

    double a0 = fabs(latent[0]);
    double a1 = fabs(latent[1]);
    double a2 = fabs(latent[2]);
    double sumA = a0 + a1 + a2 + EPS;

    explainedVar[0] = a0 / sumA;
    explainedVar[1] = a1 / sumA;
    explainedVar[2] = a2 / sumA;

    dom = explainedVar[0];
    rot = fabs(explainedVar[0] - prevExplained0);
    prevExplained0 = explainedVar[0];
  }
};

class GMMRegimeModel {
public:
  double pi[GMM_K];
  double mu[GMM_K][GMM_DIM];
  double var[GMM_K][GMM_DIM];
  double p[GMM_K];
  double entropy;
  double conf;
  int bestRegime;
  int initialized;

  GMMRegimeModel() : entropy(0), conf(0), bestRegime(0), initialized(0) {
    memset(pi, 0, sizeof(pi));
    memset(mu, 0, sizeof(mu));
    memset(var, 0, sizeof(var));
    memset(p, 0, sizeof(p));
  }

  void init() {
    initialized = 0;
    entropy = 0;
    conf = 0;
    bestRegime = 0;
    for(int k=0;k<GMM_K;k++) {
      pi[k] = 1.0 / (double)GMM_K;
      for(int d=0; d<GMM_DIM; d++) {
        mu[k][d] = 0.02 * (k - 1);
        var[k][d] = 1.0;
      }
      p[k] = 1.0 / (double)GMM_K;
    }
    initialized = 1;
  }

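  // Diagonal-Gaussian likelihood without the (2*pi)^(-d/2) constant, which cancels
  // when infer() normalizes. The log-likelihood is floored at -80 before exp().
  static double gaussianDiag(const double* x, const double* m, const double* v) {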
    double logp = 0;
    for(int d=0; d<GMM_DIM; d++) {
      double vv = v[d];
      if(vv < GMM_VAR_FLOOR) vv = GMM_VAR_FLOOR;
      double z = x[d] - m[d];
      logp += -0.5 * (z*z / vv + log(vv + EPS));
    }
    if(logp < -80.0) logp = -80.0;
    return exp(logp);
  }

  void infer(const double x[GMM_DIM]) {
    if(!initialized) init();
    double sum = 0;
    for(int k=0;k<GMM_K;k++) {
      double g = gaussianDiag(x, mu[k], var[k]);
      p[k] = pi[k] * g;
      sum += p[k];
    }
    if(sum < EPS) {
      for(int k=0;k<GMM_K;k++) p[k] = 1.0 / (double)GMM_K;
    } else {
      for(int k=0;k<GMM_K;k++) p[k] /= sum;
    }

    bestRegime = 0;
    conf = p[0];
    for(int k=1;k<GMM_K;k++) {
      if(p[k] > conf) {
        conf = p[k];
        bestRegime = k;
      }
    }

    entropy = 0;
    for(int k=0;k<GMM_K;k++) entropy -= p[k] * log(p[k] + EPS);

#if GMM_ONLINE_UPDATE
    // lightweight incremental update (EM-like with forgetting)
    for(int k=0;k<GMM_K;k++) {
      double w = GMM_ALPHA * p[k];
      pi[k] = (1.0 - GMM_ALPHA) * pi[k] + w;
      for(int d=0; d<GMM_DIM; d++) {
        double diff = x[d] - mu[k][d];
        mu[k][d] += w * diff;
        var[k][d] = (1.0 - w) * var[k][d] + w * diff * diff;
        if(var[k][d] < GMM_VAR_FLOOR) var[k][d] = GMM_VAR_FLOOR;
      }
    }
#endif
  }
};


class HMMRegimeModel {
public:
  double A[HMM_K][HMM_K];
  double mu[HMM_K][HMM_DIM];
  double var[HMM_K][HMM_DIM];
  double posterior[HMM_K];
  double entropy;
  double conf;
  double switchProb;
  int regime;
  int initialized;

  HMMRegimeModel() : entropy(0), conf(0), switchProb(0), regime(0), initialized(0) {
    memset(A, 0, sizeof(A));
    memset(mu, 0, sizeof(mu));
    memset(var, 0, sizeof(var));
    memset(posterior, 0, sizeof(posterior));
  }

  void init() {
    for(int i=0;i<HMM_K;i++) {
      for(int j=0;j<HMM_K;j++) A[i][j] = (i==j) ? 0.90 : 0.10/(double)(HMM_K-1);
      for(int d=0; d<HMM_DIM; d++) {
        mu[i][d] = 0.03 * (i - 1);
        var[i][d] = 1.0;
      }
      posterior[i] = 1.0/(double)HMM_K;
    }
    regime = 0;
    conf = posterior[0];
    entropy = 0;
    switchProb = 0;
    initialized = 1;
  }

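  // Same unnormalized diagonal-Gaussian likelihood as GMMRegimeModel::gaussianDiag,
  // using the HMM dimension and variance floor.
  static double emissionDiag(const double* x, const double* m, const double* v) {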
    double logp = 0;
    for(int d=0; d<HMM_DIM; d++) {
      double vv = v[d];
      if(vv < HMM_VAR_FLOOR) vv = HMM_VAR_FLOOR;
      double z = x[d] - m[d];
      logp += -0.5 * (z*z / vv + log(vv + EPS));
    }
    if(logp < -80.0) logp = -80.0;
    return exp(logp);
  }

  void filter(const double obs[HMM_DIM]) {
    if(!initialized) init();

    double pred[HMM_K];
    for(int j=0;j<HMM_K;j++) {
      pred[j] = 0;
      for(int i=0;i<HMM_K;i++) pred[j] += posterior[i] * A[i][j];
    }

    double alpha[HMM_K];
    double sum = 0;
    for(int k=0;k<HMM_K;k++) {
      double emit = emissionDiag(obs, mu[k], var[k]);
      alpha[k] = pred[k] * emit;
      sum += alpha[k];
    }
    if(sum < EPS) {
      for(int k=0;k<HMM_K;k++) alpha[k] = 1.0/(double)HMM_K;
    } else {
      for(int k=0;k<HMM_K;k++) alpha[k] /= sum;
    }

    for(int k=0;k<HMM_K;k++) posterior[k] = alpha[k];

    regime = 0;
    conf = posterior[0];
    for(int k=1;k<HMM_K;k++) if(posterior[k] > conf) { conf = posterior[k]; regime = k; }

    entropy = 0;
    for(int k=0;k<HMM_K;k++) entropy -= posterior[k] * log(posterior[k] + EPS);

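    // A is set in init() and not re-estimated in this class, so with those values
    // this evaluates to 0.10 for every regime.
    switchProb = 1.0 - A[regime][regime];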
    if(switchProb < 0) switchProb = 0;
    if(switchProb > 1) switchProb = 1;

#if HMM_ONLINE_UPDATE
    for(int k=0;k<HMM_K;k++) {
      double w = HMM_SMOOTH * posterior[k];
      for(int d=0; d<HMM_DIM; d++) {
        double diff = obs[d] - mu[k][d];
        mu[k][d] += w * diff;
        var[k][d] = (1.0 - w) * var[k][d] + w * diff * diff;
        if(var[k][d] < HMM_VAR_FLOOR) var[k][d] = HMM_VAR_FLOOR;
      }
    }
#endif
  }
};

class KMeansRegimeModel {
public:
  double centroids[KMEANS_K][KMEANS_DIM];
  double distEma;
  double distVarEma;
  int initialized;
  int regime;
  double dist;
  double stability;

  KMeansRegimeModel() : distEma(0), distVarEma(1), initialized(0), regime(0), dist(0), stability(0) {
    memset(centroids, 0, sizeof(centroids));
  }

  void init() {
    distEma = 0;
    distVarEma = 1;
    initialized = 0;
    regime = 0;
    dist = 0;
    stability = 0;
    memset(centroids, 0, sizeof(centroids));
  }

  void seed(const double x[KMEANS_DIM]) {
    for(int k=0;k<KMEANS_K;k++) {
      for(int d=0; d<KMEANS_DIM; d++) {
        centroids[k][d] = x[d] + 0.03 * (k - 1);
      }
    }
    initialized = 1;
  }

  static double clampRange(double x, double lo, double hi) {
    if(x < lo) return lo;
    if(x > hi) return hi;
    return x;
  }

  void predictAndUpdate(const double x[KMEANS_DIM]) {
    if(!initialized) seed(x);

    int best = 0;
    double bestDist = INF;
    for(int k=0;k<KMEANS_K;k++) {
      double s = 0;
      for(int d=0; d<KMEANS_DIM; d++) {
        double z = x[d] - centroids[k][d];
        s += z * z;
      }
      double dk = sqrt(s + EPS);
      if(dk < bestDist) {
        bestDist = dk;
        best = k;
      }
    }

    regime = best;
    dist = bestDist;

    distEma = (1.0 - KMEANS_DIST_EMA) * distEma + KMEANS_DIST_EMA * dist;
    double dd = dist - distEma;
    distVarEma = (1.0 - KMEANS_DIST_EMA) * distVarEma + KMEANS_DIST_EMA * dd * dd;
    double distStd = sqrt(distVarEma + EPS);
    double zDist = (dist - distEma) / (distStd + EPS);
    stability = clampRange(1.0 / (1.0 + exp(zDist)), 0.0, 1.0);

#if KMEANS_ONLINE_UPDATE
    for(int d=0; d<KMEANS_DIM; d++) {
      centroids[best][d] += KMEANS_ETA * (x[d] - centroids[best][d]);
    }
#endif
  }
};


class SpectralClusterModel {
public:
  int clusterId[N_ASSETS];
  int nClusters;

  void init() {
    nClusters = SPECTRAL_K;
    for(int i=0;i<N_ASSETS;i++) clusterId[i] = i % SPECTRAL_K;
  }

  void update(const fvar* distMatrix) {
    if(!distMatrix) return;
    // lightweight deterministic clustering surrogate from distance rows
    for(int i=0;i<N_ASSETS;i++) {
      double sig = 0;
      for(int j=0;j<N_ASSETS;j++) {
        if(i == j) continue;
        double d = (double)distMatrix[i*N_ASSETS + j];
        if(d < INF) sig += d;
      }
      int cid = (int)fmod(fabs(sig * 1000.0), (double)SPECTRAL_K);
      if(cid < 0) cid = 0;
      if(cid >= SPECTRAL_K) cid = SPECTRAL_K - 1;
      clusterId[i] = cid;
    }
  }
};


class HierarchicalClusteringModel {
public:
  int clusterCoarse[N_ASSETS];
  int clusterFine[N_ASSETS];
  int nCoarse;
  int nFine;

  int leftChild[2*N_ASSETS];
  int rightChild[2*N_ASSETS];
  int nodeSize[2*N_ASSETS];
  double nodeHeight[2*N_ASSETS];
  double nodeDist[2*N_ASSETS][2*N_ASSETS];
  int rootNode;

  void init() {
    nCoarse = HCLUST_COARSE_K;
    nFine = HCLUST_FINE_K;
    rootNode = N_ASSETS - 1;
    for(int i=0;i<N_ASSETS;i++) {
      clusterCoarse[i] = i % HCLUST_COARSE_K;
      clusterFine[i] = i % HCLUST_FINE_K;
    }
  }

  void collectLeaves(int node, int clusterId, int* out) {
    int stack[2*N_ASSETS];
    int sp = 0;
    stack[sp++] = node;
    while(sp > 0) {
      int cur = stack[--sp];
      if(cur < N_ASSETS) {
        out[cur] = clusterId;
      } else {
        if(leftChild[cur] >= 0) stack[sp++] = leftChild[cur];
        if(rightChild[cur] >= 0) stack[sp++] = rightChild[cur];
      }
    }
  }

  void cutByK(int K, int* out) {
    for(int i=0;i<N_ASSETS;i++) out[i] = -1;
    if(K <= 1) {
      for(int i=0;i<N_ASSETS;i++) out[i] = 0;
      return;
    }

    int clusters[2*N_ASSETS];
    int count = 1;
    clusters[0] = rootNode;

    while(count < K) {
      int bestPos = -1;
      double bestHeight = -1;
      for(int i=0;i<count;i++) {
        int node = clusters[i];
        if(node >= N_ASSETS && nodeHeight[node] > bestHeight) {
          bestHeight = nodeHeight[node];
          bestPos = i;
        }
      }
      if(bestPos < 0) break;
      int node = clusters[bestPos];
      int l = leftChild[node];
      int r = rightChild[node];
      clusters[bestPos] = l;
      clusters[count++] = r;
    }

    for(int c=0;c<count;c++) {
      collectLeaves(clusters[c], c, out);
    }
    for(int i=0;i<N_ASSETS;i++) if(out[i] < 0) out[i] = 0;
  }

  void update(const fvar* distMatrix) {
    if(!distMatrix) return;

    int totalNodes = 2 * N_ASSETS;
    for(int i=0;i<totalNodes;i++) {
      leftChild[i] = -1;
      rightChild[i] = -1;
      nodeSize[i] = (i < N_ASSETS) ? 1 : 0;
      nodeHeight[i] = 0;
      for(int j=0;j<totalNodes;j++) nodeDist[i][j] = INF;
    }

    for(int i=0;i<N_ASSETS;i++) {
      for(int j=0;j<N_ASSETS;j++) {
        if(i == j) nodeDist[i][j] = 0;
        else {
          double d = (double)distMatrix[i*N_ASSETS + j];
          if(d < 0 || d >= INF) d = 1.0;
          nodeDist[i][j] = d;
        }
      }
    }

    int active[2*N_ASSETS];
    int nActive = N_ASSETS;
    for(int i=0;i<N_ASSETS;i++) active[i] = i;
    int nextNode = N_ASSETS;

    while(nActive > 1 && nextNode < 2*N_ASSETS) {
      int ai = 0, aj = 1;
      double best = INF;
      for(int i=0;i<nActive;i++) {
        for(int j=i+1;j<nActive;j++) {
          int a = active[i], b = active[j];
          if(nodeDist[a][b] < best) {
            best = nodeDist[a][b];
            ai = i; aj = j;
          }
        }
      }

      int a = active[ai];
      int b = active[aj];
      int m = nextNode++;

      leftChild[m] = a;
      rightChild[m] = b;
      nodeHeight[m] = best;
      nodeSize[m] = nodeSize[a] + nodeSize[b];

      for(int i=0;i<nActive;i++) {
        if(i == ai || i == aj) continue;
        int k = active[i];
        // size-weighted average linkage (UPGMA): distance from the merged node m to k
        double da = nodeDist[a][k];
        double db = nodeDist[b][k];
        double dm = (nodeSize[a] * da + nodeSize[b] * db) / (double)(nodeSize[a] + nodeSize[b]);
        nodeDist[m][k] = dm;
        nodeDist[k][m] = dm;
      }
      nodeDist[m][m] = 0;

      if(aj < ai) { int t=ai; ai=aj; aj=t; }
      for(int i=aj;i<nActive-1;i++) active[i] = active[i+1];
      nActive--;
      for(int i=ai;i<nActive-1;i++) active[i] = active[i+1];
      nActive--;
      active[nActive++] = m;
    }

    rootNode = active[0];

    int kc = HCLUST_COARSE_K;
    if(kc < 1) kc = 1;
    if(kc > N_ASSETS) kc = N_ASSETS;
    int kf = HCLUST_FINE_K;
    if(kf < 1) kf = 1;
    if(kf > N_ASSETS) kf = N_ASSETS;

    cutByK(kc, clusterCoarse);
    cutByK(kf, clusterFine);
    nCoarse = kc;
    nFine = kf;
  }
};


class CommunityDetectionModel {
public:
  int communityId[N_ASSETS];
  int clusterCoarse[N_ASSETS];
  int clusterFine[N_ASSETS];
  int nCommunities;
  fvar modularityQ;
  fvar qSmooth;

  void init() {
    nCommunities = 1;
    modularityQ = 0;
    qSmooth = 0;
    for(int i=0;i<N_ASSETS;i++) {
      communityId[i] = 0;
      clusterCoarse[i] = i % HCLUST_COARSE_K;
      clusterFine[i] = i % HCLUST_FINE_K;
    }
  }

  static int argmaxLabel(const fvar w[N_ASSETS], const int label[N_ASSETS], int node) {
    fvar acc[N_ASSETS];
    for(int i=0;i<N_ASSETS;i++) acc[i] = 0;
    for(int j=0;j<N_ASSETS;j++) {
      if(j == node) continue;
      int l = label[j];
      if(l < 0 || l >= N_ASSETS) continue;
      acc[l] += w[j];
    }
    int best = label[node];
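    // Start at 0 so a node with no positively weighted neighbours keeps its own label;
    // with -1, the empty label 0 always won the first comparison.
    fvar bestV = 0;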
    for(int l=0;l<N_ASSETS;l++) {
      if(acc[l] > bestV) { bestV = acc[l]; best = l; }
    }
    return best;
  }

  void update(const fvar* corrMatrix, const fvar* distMatrix) {
    if(!corrMatrix || !distMatrix) return;

    fvar W[N_ASSETS][N_ASSETS];
    fvar degree[N_ASSETS];
    int label[N_ASSETS];

    for(int i=0;i<N_ASSETS;i++) {
      degree[i] = 0;
      label[i] = i;
      for(int j=0;j<N_ASSETS;j++) {
        if(i == j) W[i][j] = 0;
        else {
          fvar w = (fvar)fabs((double)corrMatrix[i*N_ASSETS + j]);
          if(w < (fvar)COMM_W_MIN) w = 0;
          W[i][j] = w;
          degree[i] += w;
        }
      }
    }

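    // Top-M pruning for determinism/noise control: keep each node's COMM_TOPM strongest edges
    for(int i=0;i<N_ASSETS;i++) {
      int keep[N_ASSETS];
      for(int j=0;j<N_ASSETS;j++) keep[j] = 0;
      for(int k=0;k<COMM_TOPM;k++) {
        int best = -1;
        fvar bestW = 0;
        for(int j=0;j<N_ASSETS;j++) {
          if(i==j || keep[j]) continue;
          if(W[i][j] > bestW) { bestW = W[i][j]; best = j; }
        }
        if(best >= 0) keep[best] = 1;
      }
      for(int j=0;j<N_ASSETS;j++) if(i!=j && !keep[j]) W[i][j] = 0;
      // recompute the degree from the pruned row so the modularity term below
      // uses the same weights that are summed into m2
      degree[i] = 0;
      for(int j=0;j<N_ASSETS;j++) degree[i] += W[i][j];
    }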

    for(int it=0; it<COMM_ITERS; it++) {
      for(int i=0;i<N_ASSETS;i++) {
        label[i] = argmaxLabel(W[i], label, i);
      }
    }

    // compress labels
    int map[N_ASSETS];
    for(int i=0;i<N_ASSETS;i++) map[i] = -1;
    int nLab = 0;
    for(int i=0;i<N_ASSETS;i++) {
      int l = label[i];
      if(l < 0 || l >= N_ASSETS) l = 0;
      if(map[l] < 0) map[l] = nLab++;
      communityId[i] = map[l];
    }
    if(nLab < 1) nLab = 1;
    nCommunities = nLab;

    // modularity approximation
    fvar m2 = 0;
    for(int i=0;i<N_ASSETS;i++) for(int j=0;j<N_ASSETS;j++) m2 += W[i][j];
    if(m2 < (fvar)EPS) {
      modularityQ = 0;
    } else {
      fvar q = 0;
      for(int i=0;i<N_ASSETS;i++) {
        for(int j=0;j<N_ASSETS;j++) {
          if(communityId[i] == communityId[j]) {
            q += W[i][j] - (degree[i] * degree[j] / m2);
          }
        }
      }
      modularityQ = q / m2;
    }

    qSmooth = (fvar)(1.0 - COMM_Q_EMA) * qSmooth + (fvar)COMM_Q_EMA * modularityQ;

    for(int i=0;i<N_ASSETS;i++) {
      int c = communityId[i];
      if(c < 0) c = 0;
      clusterCoarse[i] = c % HCLUST_COARSE_K;
      clusterFine[i] = c % HCLUST_FINE_K;
    }
  }
};


class AutoencoderModel {
public:
  double mu[AE_INPUT_DIM];
  double sigma[AE_INPUT_DIM];
  double W1[AE_LATENT_DIM][AE_INPUT_DIM];
  double W2[AE_INPUT_DIM][AE_LATENT_DIM];
  int initialized = 0;  // default so the !initialized guard in infer() never reads an indeterminate value

  void init() {
    initialized = 1;
    for(int i=0;i<AE_INPUT_DIM;i++) {
      mu[i] = 0;
      sigma[i] = 1;
    }
    for(int z=0;z<AE_LATENT_DIM;z++) {
      for(int d=0;d<AE_INPUT_DIM;d++) {
        double w = sin((double)(z+1)*(d+1)) * 0.05;
        W1[z][d] = w;
        W2[d][z] = w;
      }
    }
  }

  static double act(double x) {
    if(x > 4) x = 4;
    if(x < -4) x = -4;
    return tanh(x);
  }

  double infer(const double xIn[AE_INPUT_DIM]) {
    if(!initialized) init();

    double x[AE_INPUT_DIM];
    for(int d=0;d<AE_INPUT_DIM;d++) x[d] = (xIn[d] - mu[d]) / (sigma[d] + EPS);

    double z[AE_LATENT_DIM];
    for(int k=0;k<AE_LATENT_DIM;k++) {
      double s = 0;
      for(int d=0;d<AE_INPUT_DIM;d++) s += W1[k][d] * x[d];
      z[k] = act(s);
    }

    double recon[AE_INPUT_DIM];
    for(int d=0;d<AE_INPUT_DIM;d++) {
      double s = 0;
      for(int k=0;k<AE_LATENT_DIM;k++) s += W2[d][k] * z[k];
      recon[d] = act(s);
    }

    double err = 0;
    for(int d=0;d<AE_INPUT_DIM;d++) {
      double e = x[d] - recon[d];
      err += e*e;
    }
    err /= (double)AE_INPUT_DIM;

    for(int d=0;d<AE_INPUT_DIM;d++) {
      mu[d] = (1.0 - AE_NORM_ALPHA) * mu[d] + AE_NORM_ALPHA * xIn[d];
      double dv = xIn[d] - mu[d];
      sigma[d] = (1.0 - AE_NORM_ALPHA) * sigma[d] + AE_NORM_ALPHA * sqrt(dv*dv + EPS);
      if(sigma[d] < 1e-5) sigma[d] = 1e-5;
    }
    return err;
  }
};

class NoveltyController {
public:
  double errEma;
  double errVar;
  double zRecon;
  int regime;
  double riskScale;

  void init() {
    errEma = 0;
    errVar = 1;
    zRecon = 0;
    regime = 0;
    riskScale = 1.0;
  }

  static double clampRange(double x, double lo, double hi) {
    if(x < lo) return lo;
    if(x > hi) return hi;
    return x;
  }

  void update(double reconError) {
    errEma = (1.0 - AE_ERR_EMA) * errEma + AE_ERR_EMA * reconError;
    double d = reconError - errEma;
    errVar = (1.0 - AE_ERR_EMA) * errVar + AE_ERR_EMA * d*d;
    double errStd = sqrt(errVar + EPS);
    zRecon = (reconError - errEma) / (errStd + EPS);

    if(zRecon >= AE_Z_HIGH) { regime = 2; riskScale = 0.20; }
    else if(zRecon >= AE_Z_LOW) { regime = 1; riskScale = 0.60; }
    else { regime = 0; riskScale = 1.00; }

    riskScale = clampRange(riskScale, 0.20, 1.00);
  }

  void apply(int* topK, double* scoreScale) {
    if(regime == 2) {
      if(*topK > 3) *topK -= 2;
      *scoreScale *= 0.60;
    } else if(regime == 1) {
      if(*topK > 3) *topK -= 1;
      *scoreScale *= 0.85;
    }
    if(*topK < 1) *topK = 1;
    if(*topK > TOP_K) *topK = TOP_K;
    *scoreScale = clampRange(*scoreScale, 0.10, 2.00);
  }
};


class SOMModel {
public:
  double W[SOM_H][SOM_W][SOM_DIM];
  int hitCount[SOM_H][SOM_W];
  int bmuX;
  int bmuY;
  double conf;
  int initialized = 0;  // default so the !initialized guard in inferOrUpdate() never reads an indeterminate value

  void init() {
    initialized = 1;
    bmuX = 0; bmuY = 0; conf = 0;
    for(int y=0;y<SOM_H;y++) {
      for(int x=0;x<SOM_W;x++) {
        hitCount[y][x] = 0;
        for(int d=0;d<SOM_DIM;d++) {
          W[y][x][d] = 0.02 * sin((double)(y+1)*(x+1)*(d+1));
        }
      }
    }
  }

  static double clampRange(double x,double lo,double hi){ if(x<lo) return lo; if(x>hi) return hi; return x; }

  void inferOrUpdate(const double s[SOM_DIM], int step) {
    if(!initialized) init();

    int bx=0, by=0;
    double best=INF, second=INF;
    for(int y=0;y<SOM_H;y++) {
      for(int x=0;x<SOM_W;x++) {
        double d2=0;
        for(int k=0;k<SOM_DIM;k++) {
          double z = s[k] - W[y][x][k];
          d2 += z*z;
        }
        if(d2 < best) { second = best; best=d2; bx=x; by=y; }
        else if(d2 < second) second = d2;
      }
    }

    bmuX = bx; bmuY = by;
    double d1 = sqrt(best + EPS);
    double d2 = sqrt(second + EPS);
    conf = clampRange((d2 - d1) / (d2 + EPS), 0.0, 1.0);
    hitCount[bmuY][bmuX]++;

#if SOM_ONLINE_UPDATE
    double alpha = SOM_ALPHA_MIN + (SOM_ALPHA_MAX - SOM_ALPHA_MIN) * exp(-0.005 * step);
    double sigma = SOM_SIGMA_MIN + (SOM_SIGMA_MAX - SOM_SIGMA_MIN) * exp(-0.005 * step);
    for(int y=0;y<SOM_H;y++) {
      for(int x=0;x<SOM_W;x++) {
        double gd2 = (double)((x-bmuX)*(x-bmuX) + (y-bmuY)*(y-bmuY));
        double h = exp(-gd2 / (2.0*sigma*sigma + EPS));
        for(int k=0;k<SOM_DIM;k++) {
          W[y][x][k] += alpha * h * (s[k] - W[y][x][k]);
        }
      }
    }
#endif
  }

  int regimeId() const { return bmuY * SOM_W + bmuX; }
};

class SOMPlaybook {
public:
  int region;
  double riskScale;

  void init() { region = 0; riskScale = 1.0; }

  void apply(const SOMModel& som, int* topK, double* scoreScale) {
    int mx = som.bmuX;
    int my = som.bmuY;
    int cx = (mx >= SOM_W/2) ? 1 : 0;
    int cy = (my >= SOM_H/2) ? 1 : 0;
    region = cy * 2 + cx;

    if(region == 0) { *scoreScale *= 1.02; riskScale = 1.00; }
    else if(region == 1) { *scoreScale *= 0.95; riskScale = 0.85; if(*topK > 3) (*topK)--; }
    else if(region == 2) { *scoreScale *= 0.90; riskScale = 0.70; if(*topK > 3) (*topK)--; }
    else { *scoreScale *= 0.80; riskScale = 0.50; if(*topK > 2) (*topK)-=2; }

    if(som.conf < SOM_CONF_MIN) {
      riskScale *= 0.8;
      if(*topK > 2) (*topK)--;
    }

    if(*topK < 1) *topK = 1;
    if(*topK > TOP_K) *topK = TOP_K;
    if(*scoreScale < 0.10) *scoreScale = 0.10;
    if(*scoreScale > 2.00) *scoreScale = 2.00;
  }
};

class StrategyController {
public:
  UnsupervisedModel unsup;
  RLAgent rl;
  PCAModel pca;
  GMMRegimeModel gmm;
  HMMRegimeModel hmm;
  KMeansRegimeModel kmeans;
  int dynamicTopK;
  double scoreScale;
  int regime;
  double adaptiveGamma;
  double adaptiveAlpha;
  double adaptiveBeta;
  double adaptiveLambda;
  double riskScale;
  int cooldown;

  StrategyController()
  : dynamicTopK(TOP_K), scoreScale(1.0), regime(0),
    adaptiveGamma(1.0), adaptiveAlpha(1.0), adaptiveBeta(1.0), adaptiveLambda(1.0), riskScale(1.0), cooldown(0) {}

  static double clampRange(double x, double lo, double hi) {
    if(x < lo) return lo;
    if(x > hi) return hi;
    return x;
  }

  void init() {
    unsup.init();
    rl.init();
    pca.init();
    gmm.init();
    hmm.init();
    kmeans.init();
    dynamicTopK = TOP_K;
    scoreScale = 1.0;
    regime = 0;
    adaptiveGamma = 1.0;
    adaptiveAlpha = 1.0;
    adaptiveBeta = 1.0;
    adaptiveLambda = 1.0;
    riskScale = 1.0;
    cooldown = 0;
  }

  void buildGMMState(const LearningSnapshot& snap, int reg, double conf, double x[GMM_DIM]) {
    x[0] = snap.meanScore;
    x[1] = snap.meanCompactness;
    x[2] = snap.meanVol;
    x[3] = pca.dom;
    x[4] = pca.rot;
    x[5] = (double)reg / 2.0;
    x[6] = conf;
    x[7] = snap.meanScore - snap.meanCompactness;
  }

  void buildHMMObs(const LearningSnapshot& snap, int reg, double conf, double x[HMM_DIM]) {
    x[0] = pca.latent[0];
    x[1] = pca.latent[1];
    x[2] = pca.latent[2];
    x[3] = snap.meanVol;
    x[4] = snap.meanScore;
    x[5] = snap.meanCompactness;
    x[6] = (double)reg / 2.0;
    x[7] = conf;
  }

  void buildKMeansState(const LearningSnapshot& snap, int reg, double conf, double x[KMEANS_DIM]) {
    x[0] = pca.latent[0];
    x[1] = pca.latent[1];
    x[2] = pca.latent[2];
    x[3] = snap.meanVol;
    x[4] = snap.meanScore;
    x[5] = snap.meanCompactness;
    x[6] = (double)reg / 2.0;
    x[7] = conf;
  }

  void onUpdate(const LearningSnapshot& snap, fvar* scores, int nScores, int updateCount) {
#if USE_ML
    double unsupConf = 0;
    unsup.update(snap, &regime, &unsupConf);
#if USE_PCA
    pca.update(snap, regime, unsupConf);
#else
    pca.dom = 0.5;
    pca.rot = 0.0;
#endif

#if USE_GMM
    double gx[GMM_DIM];
    buildGMMState(snap, regime, unsupConf, gx);
    gmm.infer(gx);
#if USE_HMM
    double hx[HMM_DIM];
    buildHMMObs(snap, regime, unsupConf, hx);
    hmm.filter(hx);
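    // Note: this k-means refresh only compiles when USE_GMM and USE_HMM are both set,
    // while the USE_KMEANS blend further down runs regardless of those flags.
#if USE_KMEANS
    double kx[KMEANS_DIM];
    buildKMeansState(snap, regime, unsupConf, kx);
    kmeans.predictAndUpdate(kx);
#endif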
#endif
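    // regime presets: [gamma, alpha, beta, lambda], one row per regime.
    // The three rows and the shared loop below assume GMM_K == 3 and, with USE_HMM, HMM_K >= GMM_K.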
    const double presets[GMM_K][4] = {
      {1.05, 1.00, 0.95, 1.00},
      {0.95, 1.05, 1.05, 0.95},
      {1.00, 0.95, 1.10, 1.05}
    };
    adaptiveGamma = 0;
    adaptiveAlpha = 0;
    adaptiveBeta  = 0;
    adaptiveLambda = 0;
    for(int k=0;k<GMM_K;k++) {
#if USE_HMM
      adaptiveGamma += hmm.posterior[k] * presets[k][0];
      adaptiveAlpha += hmm.posterior[k] * presets[k][1];
      adaptiveBeta  += hmm.posterior[k] * presets[k][2];
      adaptiveLambda += hmm.posterior[k] * presets[k][3];
#else
      adaptiveGamma += gmm.p[k] * presets[k][0];
      adaptiveAlpha += gmm.p[k] * presets[k][1];
      adaptiveBeta  += gmm.p[k] * presets[k][2];
      adaptiveLambda += gmm.p[k] * presets[k][3];
#endif
    }
#if USE_HMM
    double entNorm = hmm.entropy / log((double)HMM_K + EPS);
    riskScale = clampRange(1.0 - 0.45 * entNorm, HMM_MIN_RISK, 1.0);
    if(hmm.entropy > HMM_ENTROPY_TH || hmm.switchProb > HMM_SWITCH_TH) cooldown = HMM_COOLDOWN_UPDATES;
    else if(cooldown > 0) cooldown--;
#else
    double entNorm = gmm.entropy / log((double)GMM_K + EPS);
    riskScale = clampRange(1.0 - GMM_ENTROPY_COEFF * entNorm, GMM_MIN_RISK, 1.0);
#endif
#else
    adaptiveGamma = 1.0 + 0.35 * pca.dom - 0.25 * pca.rot;
    adaptiveAlpha = 1.0 + 0.30 * pca.dom;
    adaptiveBeta  = 1.0 + 0.25 * pca.rot;
    adaptiveLambda = 1.0 + 0.20 * pca.dom - 0.20 * pca.rot;
    riskScale = 1.0;
#endif

    adaptiveGamma = clampRange(adaptiveGamma, 0.80, 1.40);
    adaptiveAlpha = clampRange(adaptiveAlpha, 0.85, 1.35);
    adaptiveBeta  = clampRange(adaptiveBeta, 0.85, 1.35);
    adaptiveLambda = clampRange(adaptiveLambda, 0.85, 1.25);

#if USE_KMEANS
    const double kmPreset[KMEANS_K][4] = {
      {1.02, 1.00, 0.98, 1.00},
      {1.08, 0.96, 0.95, 1.02},
      {0.94, 1.08, 1.08, 0.92}
    };
    int kr = kmeans.regime;
    if(kr < 0) kr = 0;
    if(kr >= KMEANS_K) kr = KMEANS_K - 1;
    double wkm = clampRange(kmeans.stability, 0.0, 1.0);
    adaptiveGamma = (1.0 - wkm) * adaptiveGamma + wkm * kmPreset[kr][0];
    adaptiveAlpha = (1.0 - wkm) * adaptiveAlpha + wkm * kmPreset[kr][1];
    adaptiveBeta  = (1.0 - wkm) * adaptiveBeta  + wkm * kmPreset[kr][2];
    adaptiveLambda = (1.0 - wkm) * adaptiveLambda + wkm * kmPreset[kr][3];
    if(kmeans.stability < KMEANS_STABILITY_MIN) {
      riskScale *= 0.85;
      if(cooldown < 1) cooldown = 1;
    }
#endif

    rl.updateReward(snap.meanScore);
    rl.lastAction = rl.chooseAction(updateCount);

    int baseTopK = TOP_K;
    if(rl.lastAction == 0) baseTopK = TOP_K - 2;
    else if(rl.lastAction == 1) baseTopK = TOP_K;
    else if(rl.lastAction == 2) baseTopK = TOP_K;
    else baseTopK = TOP_K - 1;

    double profileBias[5] = {1.00, 0.98, 0.99, 0.97, 1.02};
    scoreScale = (1.0 + 0.06 * (adaptiveGamma - 1.0) + 0.04 * (adaptiveAlpha - 1.0) - 0.04 * (adaptiveBeta - 1.0))
               * profileBias[STRATEGY_PROFILE] * riskScale;

    if(pca.dom > 0.60) baseTopK -= 1;
    if(pca.rot > 0.15) baseTopK -= 1;
#if USE_HMM
    if(hmm.regime == 2) baseTopK -= 1;
    if(cooldown > 0) baseTopK -= 1;
#if USE_KMEANS
    if(kmeans.regime == 2) baseTopK -= 1;
#endif
#elif USE_GMM
    if(gmm.bestRegime == 2) baseTopK -= 1;
#endif

    dynamicTopK = baseTopK;
    if(dynamicTopK < 1) dynamicTopK = 1;
    if(dynamicTopK > TOP_K) dynamicTopK = TOP_K;

    for(int i=0; i<nScores; i++) {
      double s = (double)scores[i] * scoreScale;
      if(s > 1.0) s = 1.0;
      if(s < 0.0) s = 0.0;
      scores[i] = (fvar)s;
    }
#else
    (void)snap; (void)scores; (void)nScores; (void)updateCount;
#endif
  }
};

// ---------------------------- Strategy ----------------------------

class CrowdAverseStrategy {
public:
  ExposureTable exposureTable;
  FeatureBufferSoA featSoA;
  OpenCLBackend openCL;

  SlabAllocator<fvar> corrMatrix;
  SlabAllocator<fvar> distMatrix;
  SlabAllocator<fvar> compactness;
  SlabAllocator<fvar> entropy;
  SlabAllocator<fvar> scores;

  SlabAllocator<float> featLinear;
  SlabAllocator<float> corrLinear;

  int barCount;
  int updateCount;
  StrategyController controller;
  HierarchicalClusteringModel hclust;
  CommunityDetectionModel comm;
  AutoencoderModel ae;
  NoveltyController novelty;
  SOMModel som;
  SOMPlaybook somPlaybook;

  CrowdAverseStrategy() : barCount(0), updateCount(0) {}

  void init() {
    printf("CrowdAverse_v13: Initializing...\n");

    exposureTable.init();
    featSoA.init(N_ASSETS, FEAT_WINDOW);

    corrMatrix.init(N_ASSETS * N_ASSETS);
    distMatrix.init(N_ASSETS * N_ASSETS);
    compactness.init(N_ASSETS);
    entropy.init(N_ASSETS);
    scores.init(N_ASSETS);

    featLinear.init(FEAT_N * N_ASSETS * FEAT_WINDOW);
    corrLinear.init(N_ASSETS * N_ASSETS);

    openCL.init();
    printf("CrowdAverse_v13: Ready (OpenCL=%d)\n", openCL.ready);
    controller.init();
    hclust.init();
    comm.init();
    ae.init();
    novelty.init();
    som.init();
    somPlaybook.init();

    barCount = 0;
    updateCount = 0;
  }

  void shutdown() {
    printf("CrowdAverse_v13: Shutting down...\n");

    openCL.shutdown();

    featSoA.shutdown();
    corrMatrix.shutdown();
    distMatrix.shutdown();
    compactness.shutdown();
    entropy.shutdown();
    scores.shutdown();

    featLinear.shutdown();
    corrLinear.shutdown();
  }

  void computeFeatures(int assetIdx) {
    asset((char*)ASSET_NAMES[assetIdx]);

    vars C = series(priceClose(0));
    vars V = series(Volatility(C, 20));

    if(Bar < 50) return;

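    fvar r1 = (fvar)log(C[0] / C[1]);                            // 1-bar log return
    fvar rN = (fvar)log(C[0] / C[12]);                           // 12-bar log return
    fvar vol = (fvar)V[0];                                       // 20-bar volatility
    fvar zscore = (fvar)((C[0] - C[50]) / (V[0] * 20.0 + EPS));  // 50-bar price change scaled by volatility
    fvar rangeP = (fvar)((C[0] - C[50]) / (C[0] + EPS));         // 50-bar price change relative to current price
    fvar flow = (fvar)(r1 * vol);                                // return weighted by volatility
    fvar regime = (fvar)((vol > 0.001) ? 1.0 : 0.0);             // binary high-volatility flag
    fvar volOfVol = (fvar)(vol * vol);                           // squared volatility as a vol-of-vol proxy
    fvar persistence = (fvar)fabs(r1);                           // absolute 1-bar return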

    featSoA.push(0, assetIdx, r1);
    featSoA.push(1, assetIdx, rN);
    featSoA.push(2, assetIdx, vol);
    featSoA.push(3, assetIdx, zscore);
    featSoA.push(4, assetIdx, rangeP);
    featSoA.push(5, assetIdx, flow);
    featSoA.push(6, assetIdx, regime);
    featSoA.push(7, assetIdx, volOfVol);
    featSoA.push(8, assetIdx, persistence);
  }

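  // Returns the window variance of feature 0 (the 1-bar return) as a dispersion proxy;
  // it is not a Shannon entropy despite the name.
  fvar computeEntropy(int assetIdx) {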
    fvar mean = 0;
    for(int t=0; t<FEAT_WINDOW; t++) mean += featSoA.get(0, assetIdx, t);
    mean /= FEAT_WINDOW;
    fvar var = 0;
    for(int t=0; t<FEAT_WINDOW; t++) { fvar d = featSoA.get(0, assetIdx, t) - mean; var += d*d; }
    return (fvar)(var / FEAT_WINDOW);
  }

  void computeCorrelationMatrixCPU() {
    for(int i=0;i<N_ASSETS*N_ASSETS;i++) corrMatrix[i] = 0;

    for(int f=0; f<FEAT_N; f++){
      for(int a=0; a<N_ASSETS; a++){
        for(int b=a+1; b<N_ASSETS; b++){
          fvar mx = 0, my = 0;
          for(int t=0; t<FEAT_WINDOW; t++){
            mx += featSoA.get(f,a,t);
            my += featSoA.get(f,b,t);
          }
          mx /= (fvar)FEAT_WINDOW;
          my /= (fvar)FEAT_WINDOW;

          fvar sxx = 0, syy = 0, sxy = 0;
          for(int t=0; t<FEAT_WINDOW; t++){
            fvar dx = featSoA.get(f,a,t) - mx;
            fvar dy = featSoA.get(f,b,t) - my;
            sxx += dx*dx;
            syy += dy*dy;
            sxy += dx*dy;
          }

          fvar den = (fvar)sqrt((double)(sxx*syy + (fvar)EPS));
          fvar corr = 0;
          if(den > (fvar)EPS) corr = sxy / den;
          else corr = 0;

          int idx = a*N_ASSETS + b;
          corrMatrix[idx] += corr / (fvar)FEAT_N;
          corrMatrix[b*N_ASSETS + a] = corrMatrix[idx];
        }
      }
    }
  }

  void buildFeatLinear() {
    int idx = 0;
    for(int f=0; f<FEAT_N; f++){
      for(int a=0; a<N_ASSETS; a++){
        for(int t=0; t<FEAT_WINDOW; t++){
          featLinear[idx] = (float)featSoA.get(f, a, t);
          idx++;
        }
      }
    }
  }

  void computeCorrelationMatrix() {
    if(openCL.ready) {
      buildFeatLinear();

      for(int i=0;i<N_ASSETS*N_ASSETS;i++) corrLinear[i] = 0.0f;

      int ok = openCL.computeCorrelationMatrixCL(
        featLinear.data,
        corrLinear.data,
        N_ASSETS,
        FEAT_N,
        FEAT_WINDO
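
To test the regime filter in isolation: HMMRegimeModel::filter in the listing above predicts through the transition matrix, weights each state by its emission likelihood, and normalizes. The block below is a minimal standalone sketch of that single step, not the strategy's own code. The sizes K and DIM, the names emission and forwardStep, and any parameter values passed in are assumptions made for illustration; the listing's HMM_K, HMM_DIM and variance floor are defined elsewhere in the post.

```cpp
#include <array>
#include <cmath>

// Hypothetical sizes for illustration only (the listing uses HMM_K and HMM_DIM).
constexpr int K = 3;
constexpr int DIM = 2;

using Vec = std::array<double, K>;

// Unnormalized diagonal-Gaussian likelihood; the constant factor is omitted
// because it cancels when the posterior is normalized.
double emission(const double* x, const double* m, const double* v) {
  double logp = 0.0;
  for (int d = 0; d < DIM; d++) {
    double z = x[d] - m[d];
    logp += -0.5 * (z * z / v[d] + std::log(v[d]));
  }
  return std::exp(logp);
}

// One forward-filter step: predict with A, weight by emission, normalize.
Vec forwardStep(const Vec& prior, const double A[K][K],
                const double mu[K][DIM], const double var[K][DIM],
                const double* obs) {
  Vec post{};
  double sum = 0.0;
  for (int j = 0; j < K; j++) {
    double pred = 0.0;
    for (int i = 0; i < K; i++) pred += prior[i] * A[i][j];  // predicted state probability
    post[j] = pred * emission(obs, mu[j], var[j]);           // correct with the observation
    sum += post[j];
  }
  if (sum < 1e-12) {
    for (int j = 0; j < K; j++) post[j] = 1.0 / K;           // fall back to uniform
  } else {
    for (int j = 0; j < K; j++) post[j] /= sum;
  }
  return post;
}
```

With a uniform prior, unit variances, state means at (-1,-1), (0,0) and (1,1), and an observation of (1,1), one call to forwardStep moves most of the posterior mass to the third state, which is the behaviour the filter relies on.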

Last edited by TipmyPip; 4 hours ago.

ZorroGPT - https://bit.ly/3Gbsm4S

