This program is a small world in a bottle, built to watch a hidden choir of states move through time. At the start, a seed is chosen like a signature ring, so the same story can be replayed or a new one can be born. A private dice maker lives inside the file, separate from the host environment, turning that seed into a steady stream of chance. From that stream come whispers of randomness that later become fog, static, and sudden shocks.

The world has two faces, called regimes, like day and night. A quiet gatekeeper decides when the face changes, using the current stress of the outside wind as a hint. The outside wind is the exogenous input, a trio of slow waves that bend and drift. This wind does not merely push; it also reshapes the rules of motion. The rules, named theta in the code, are like mood settings that alter damping, coupling strength, and stiffness. When the mood shifts, certain lanes of the hidden choir become harder to move.

The core motion is computed by a staged ritual, a stepper that samples the drift field several times before committing to the next position. Each stage prepares a temporary portrait of the current state, passes it through a softsign mask to keep extremes tame, then blends every voice with every other voice through a dense web of sines. This web is the symbolic neural fabric: it is not trained here, but it behaves like a synthetic network that couples many channels into one flowing response.

To walk safely, the traveler uses an adaptive stride. It tries a full stride and also two half strides, then compares the outcomes. If the difference is small enough, the stride is accepted; if not, the stride is trimmed and tried again. Memory pressure is treated like weather. When memory rises beyond a budget, the traveler becomes less picky about perfection, limits the largest stride, and tightens patience for endless retrials.

After each accepted drift step, the story becomes hybrid. A layer of diffusion is painted onto every coordinate, scaled by the wind, by the current face of the world, and by the magnitude of the state itself. Then, with a rare roll of the internal dice, a jump may strike: a heavy tailed shock clipped to protect the narrative from blowing apart. These jumps touch only a random subset of coordinates, making the path jagged in places.

The world is only partly visible. An observer takes a handful of channels, mixes them, smooths them, and adds a small measurement hiss. The simulation watches both the hidden truth and the noisy report, and it will stop early if either grows unrealistically large or if the drift grows quiet for long enough to be called steady.

Finally, three travelers are sent through the same world. One uses the reference ritual. One uses a carefully permuted ritual that should be equivalent. One uses a careless permutation that is subtly wrong. Their destinations are compared, and the printout reveals whether the refactor preserved meaning or merely rearranged the words.

It is a tale of structure, randomness, and bookkeeping.

Code
// demo_neuralODE_complex_382A_adaptive_main.c (Zorro lite-C)
//
// What this program demonstrates
// ------------------------------
// This is a self-contained simulation of a high-dimensional (N=48) nonlinear
// *hybrid* dynamical system:
//
//   - Deterministic drift:     y' = f_theta(t, y, u(t))
//   - Stochastic diffusion:    additive/multiplicative noise (Euler-style increment)
//   - Random jump shocks:      Poisson-triggered heavy-tailed jumps
//   - Regime switching:        2-state Markov regime (0/1) with state-dependent rates
//   - Partial observability:   produce noisy observed channels x_obs from hidden state y
//
// Numerical method
// ----------------
// The drift (ODE part) is integrated by an explicit Runge–Kutta method with
// adaptive step size using step-doubling:
//
//   full step  : y(t+dt)   from one RK step of size dt
//   half steps : y(t+dt)   from two RK steps of size dt/2
//   error      : ||half - full||  -> accept/reject and adapt dt
//
// If accepted, we then apply (in this order):
//   1) update regime state (Markov transition)
//   2) diffusion increment (noise)
//   3) jump event (rare, heavy-tailed)
//   4) produce observations x_hat and noisy x_obs
//
// What "382A" is about in this file
// --------------------------------
// We build one 4-stage explicit RK tableau. Then we create:
//
//   A) Reference tableau (baseline)
//   B) "Safe" stage refactor: a stage permutation mapped into an equivalent tableau
//      (via permutation of the Butcher coefficients). This should produce identical
//      results up to numerical tolerance.
//   C) "Buggy" stage refactor: a naive permutation that changes the evaluation order
//      without correctly transforming the tableau. This generally changes results.
//
// The program integrates the same initial condition with A/B/C and prints the
// RMS differences vs the reference.
//
// Fixed seeding note
// ------------------
// Zorro has its own RNG (seed(), random()). This simulation DOES NOT use it.
// Instead it uses a custom LCG-based RNG (G_Rng via rng_u01/rng_norm).
// Therefore:
//   - We must seed our custom RNG explicitly (rng_seed(seed)).
//   - We generate a time-varying integer seed in main() and pass it into the
//     integrator. This avoids repeated identical runs when started quickly.

#define MAXS 4
#define TRI_A 6

// -------------------- Dimensions --------------------
// N: hidden state dimension (the "neural" state we integrate)
#define N         48
// K_SIZE: flattened storage for stage derivatives, sized for MAXS stages
#define K_SIZE    192   // MUST be literal: MAXS*N = 4*48 = 192

// u(t): exogenous forcing input (small dimension)
#define U_DIM     3
// partial observation vector size (we observe only a subset of hidden channels)
#define NOBS      12

// -------------------- Base integration settings --------------------
// Final time horizon for integration
#define T_END     20.0
// Starting step size guess for adaptive integration
#define DT_INIT   0.20
// Hard floor and ceiling for the step size
#define DT_MIN    1e-6
#define DT_MAX    0.50
// Base error tolerance for step accept/reject
#define TOL       1e-6

// -------------------- Memory shedding (pressure response) --------------------
// If the process memory rises above this budget, loosen accuracy + reduce dtMax.
// This is a defensive mechanism for long runs / limited environments.
#define MEM_BUDGET_MB 2000
static int G_ShedTriggered = 0;

// Runtime-adjustable knobs (can be modified when shedding triggers)
static var G_Tol   = TOL;
static var G_DtMax = DT_MAX;

// Budgets to stop pathological runs (NOT a strict max-step cap)
// - MaxRejects: how many rejected steps we tolerate before aborting
// - DtMinHitMax: how often we hit dt == DT_MIN before aborting
static int G_MaxRejects     = 2000000;
static int G_DtMinHitMax    = 500000;

// Dynamics-driven termination conditions
// - "steady" means drift magnitude stays tiny for many accepted steps
static var G_SteadyEps      = 1e-7;     // RMS(||y_dot||) threshold
static int G_SteadyNeed     = 250;      // consecutive accepted steps below threshold
// - "blow-up" means state or observed channels become numerically huge
static var G_StateLimitRMS  = 1e6;      // stop if RMS(||y||) explodes
static var G_ObsLimitRMS    = 1e6;      // stop if RMS(||x_obs||) explodes

// -------------------- Stochastic / hybrid parameters --------------------
// Baseline diffusion scale
static var G_SigmaBase = 0.02;
// Baseline jump (shock) intensity and magnitude
static var G_JumpLambdaBase = 0.20;
static var G_JumpMagBase = 0.20;
// Regime switching base rates: 0->1 and 1->0
static var G_RegRate01 = 0.15;
static var G_RegRate10 = 0.10;

// -------------------- Optional memory-tight types --------------------
// Use float for large buffers to reduce memory bandwidth and footprint.
#define TIGHT_MEM 1
#ifdef TIGHT_MEM
typedef float fvar;   // 4 bytes
#else
typedef var fvar;     // 8 bytes (Zorro var)
#endif

// -------------------- Global working buffers (no malloc) --------------------
// RK stage derivatives K_i for each stage i (flattened by stage and component)
static fvar G_K[K_SIZE];
// activation buffer for nonlinear transform of stage state
static fvar G_act[N];

// Stage state input to drift f_theta(t, y)
// and drift output (dy/dt) for the current evaluation
static var  G_stage[N];
static var  G_dh[N];

// Integrator work buffers for step-doubling
static var  G_ycur[N];   // current accepted state
static var  G_full[N];   // one full dt step
static var  G_half[N];   // two half steps combined
static var  G_mid[N];    // intermediate state after first half step

// Initial / final state buffers for the three methods compared
static var  Y0[N];
static var  Yref[N];
static var  Ysafe[N];
static var  Ybug[N];

// Partial observability: predicted clean observation + noisy measurement
static var  G_xhat[NOBS];
static var  G_xobs[NOBS];

// Exogenous input u(t)
static var  G_u[U_DIM];

// -------------------- Custom RNG (deterministic, local) --------------------
// We use our own RNG so results do not depend on Zorro's global RNG state.
static unsigned int G_Rng = 1;

// Seed the custom RNG. Seed must be non-zero.
void rng_seed(unsigned int s)
{
  if(s == 0) s = 1;
  G_Rng = s;
}

// Uniform random in (0,1). Uses a classic LCG update.
var rng_u01()
{
  G_Rng = 1664525u * G_Rng + 1013904223u;
  unsigned int x = (G_Rng >> 8) & 0x00FFFFFFu;
  return ((var)x + 1.0) / 16777217.0; // strictly (0,1)
}

// Standard normal N(0,1) using Box–Muller transform
var rng_norm()
{
  var u1 = rng_u01();
  var u2 = rng_u01();
  var r = sqrt(-2.0 * log(u1));
  var th = 6.283185307179586 * u2;
  return r * cos(th);
}

// -------------------- Utility: memory and shedding --------------------
// Return current process memory usage in MB (best-effort).
int memMB()
{
  var m = memory(0);
  if(!(m > 0)) return 0;
  return (int)(m/(1024*1024));
}

// If memory exceeds a budget, reduce work/pressure by:
//   - loosening tolerance (less strict accuracy)
//   - lowering max dt (more stable steps in volatile regions)
//   - tightening reject/dt-min budgets (avoid endless loops)
void shed_if_needed()
{
  if(G_ShedTriggered) return;

  int mb = memMB();
  if(mb >= MEM_BUDGET_MB)
  {
    G_ShedTriggered = 1;

    // accept larger local error to reduce computation
    G_Tol *= 10.0;

    // cap maximum dt to reduce instability after shedding
    G_DtMax *= 0.5;
    if(G_DtMax < DT_MIN) G_DtMax = DT_MIN;

    // reduce budgets to fail faster if the run is unhealthy
    G_MaxRejects  = (int)(G_MaxRejects/2);
    if(G_MaxRejects < 10000) G_MaxRejects = 10000;

    G_DtMinHitMax = (int)(G_DtMinHitMax/2);
    if(G_DtMinHitMax < 10000) G_DtMinHitMax = 10000;
  }
}

// -------------------- RK tableau helpers (explicit lower triangle storage) --------------------
// We store only the strictly lower triangular part of A (since explicit RK has A[i][j]=0 for j>=i).
// For MAXS=4, the number of stored entries is TRI_A = 4*3/2 = 6.

// Map (i,j) with i>j to index into the packed triangular array
int Aidx(int i, int j)
{
  return (i*(i-1))/2 + j;
}

// Read A(i,j) from packed storage (returns 0 for i<=j)
var getA(float* Atri, int i, int j)
{
  if(i <= j) return 0;
  return (var)Atri[Aidx(i,j)];
}

// Write A(i,j) into packed storage (ignored for i<=j)
void setA(float* Atri, int i, int j, var v)
{
  if(i <= j) return;
  Atri[Aidx(i,j)] = (float)v;
}

// Reset a tableau to zero (s, A, b, c)
void zeroTableau(int* s, float* Atri, float* b, float* c)
{
  int i;
  *s = 0;
  for(i=0;i<TRI_A;i++) Atri[i] = 0;
  for(i=0;i<MAXS;i++) { b[i]=0; c[i]=0; }
}

// -------------------- Nonlinearity used throughout --------------------
// softsign bounds values to (-1,1) and prevents runaway growth vs tanh/sigmoid costs.
var softsign(var z)
{
  return z / (1.0 + abs(z));
}

// -------------------- Exogenous input u(t) --------------------
// u(t) is a small external driving signal that affects both drift and stochastic terms.
// Regime changes can also offset u0 to create different behavior.
static int G_Regime = 0; // 0 or 1

void eval_u(var t)
{
  var u0 = sin(0.11*t);
  u0 *= 0.8;
  if(G_Regime) u0 += 0.35;

  var u1 = cos(0.37*t);
  u1 *= 0.6;

  var u2 = sin(2.30*t);
  u2 *= 0.3;

  *(G_u + 0) = u0;
  *(G_u + 1) = u1;
  *(G_u + 2) = u2;
}

// -------------------- Time-varying drift parameters theta(t) --------------------
// These parameters modulate the drift field over time and with u(t):
//   damp  : linear damping coefficient
//   amp   : forcing amplitude
//   scale : coupling scale in the interaction sum
//   stiff : stiffening factor for certain coordinates (multiscale behavior)
void theta_t(var t, var* dampOut, var* ampOut, var* scaleOut, var* stiffOut)
{
  // baseline values
  var damp  = -0.40;
  var amp   =  0.35;
  var scale =  0.06;

  // slow temporal drift in parameters
  var drift = sin(0.03*t);
  drift *= 0.25;

  // use exogenous input channels to modulate parameters
  var u0 = 0 + *(G_u + 0);
  var u1 = 0 + *(G_u + 1);

  // regime-dependent scaling
  if(G_Regime)
  {
    amp   *= 1.15;
    scale *= 1.10;
    damp  *= 1.05;
  }

  // combine time drift + inputs into parameter modulation
  amp   *= (1.0 + 0.30*drift + 0.15*u0);
  scale *= (1.0 + 0.20*drift + 0.10*u1);
  damp  *= (1.0 + 0.15*drift - 0.10*u0);

  // stiff factor applied to selected coordinates (creates fast/slow modes)
  var stiff = 2.8;

  *(dampOut)  = damp;
  *(ampOut)   = amp;
  *(scaleOut) = scale;
  *(stiffOut) = stiff;
}

// -------------------- Regime switching (2-state Markov chain) --------------------
// The regime flips 0<->1 with probabilities derived from rates r01, r10.
// Rates increase under "stress", which depends on |u0| and |u1|.
void update_regime(var t, var dt)
{
  var u0 = 0 + *(G_u + 0);
  var u1 = 0 + *(G_u + 1);

  var stress = abs(u0) + 0.5*abs(u1);

  var r01 = G_RegRate01 * (1.0 + 0.8*stress);
  var r10 = G_RegRate10 * (1.0 + 0.5*stress);

  var p;
  if(G_Regime == 0)
  {
    p = 1.0 - exp(-r01*dt);
    if(rng_u01() < p) G_Regime = 1;
  }
  else
  {
    p = 1.0 - exp(-r10*dt);
    if(rng_u01() < p) G_Regime = 0;
  }
}

// -------------------- Jump shocks (Poisson events) --------------------
// With probability p ~ 1-exp(-lambda*dt), we apply a rare jump.
// Jump size uses tan(pi*(u-0.5)) which is heavy-tailed; we clamp extremes.
void apply_jump_if_needed(var t, var dt, var* y)
{
  var u0 = 0 + *(G_u + 0);
  var lam = G_JumpLambdaBase * (1.0 + 0.7*abs(u0));
  if(G_Regime) lam *= 1.25;

  var p = 1.0 - exp(-lam*dt);
  if(rng_u01() >= p) return;

  // heavy-tailed variate (Cauchy-like); clamp to keep numerics finite
  var uu = rng_u01() - 0.5;
  var j = tan(3.141592653589793 * uu);

  if(j >  6.0) j =  6.0;
  if(j < -6.0) j = -6.0;

  // jump magnitude depends on regime and |u0|
  var mag = G_JumpMagBase;
  mag *= (1.0 + 0.5*abs(u0));
  if(G_Regime) mag *= 1.2;

  // apply jump to a random subset of coordinates
  int k;
  for(k=0;k<N;k++)
  {
    if(rng_u01() < 0.25)
    {
      var delta = mag * j;
      var kk = (k+1);
      delta *= (1.0 + 0.01*kk);
      *(y + k) += delta;
    }
  }
}

// -------------------- Partial observability model --------------------
// Map hidden state h (N-dim) to a smaller observed vector (NOBS-dim),
// then add small Gaussian measurement noise.
// Produces:
//   G_xhat: "clean" observation prediction
//   G_xobs: noisy measurement (what an observer would see)
void observe_from_hidden(var t, var* h)
{
  int k;
  for(k=0;k<NOBS;k++)
  {
    // pick two hidden channels per observation
    int j = k;
    int j2 = k + NOBS;

    var a = 0 + *(h + j);
    var b = 0 + *(h + j2);

    // simple linear mix + a small deterministic wobble
    var z = a + 0.30*b;
    z += 0.05*sin(0.17*t + 0.3*(k+1));

    // bound the observation through softsign
    var xhat = z / (1.0 + abs(z));
    *(G_xhat + k) = xhat;

    // measurement noise
    var eta = 0.01 * rng_norm();
    *(G_xobs + k) = xhat + eta;
  }
}

// -------------------- Drift vector field f_theta(t, y, u) --------------------
// This computes G_dh = dy/dt for the current "stage state" stored in G_stage.
// Internally it:
//   - updates u(t)
//   - computes time-varying parameters theta(t)
//   - computes activation = softsign(stage + bias + injected input)
//   - computes a dense nonlinear coupling sum across all neurons (i over j)
//   - adds external forcing, damping, and a weak self-nonlinearity
void f_theta_vec(var t)
{
  int i,j;

  // refresh exogenous inputs at this time
  eval_u(t);

  // compute time-varying parameters
  var damp, amp, scale, stiff;
  theta_t(t, &damp, &amp, &scale, &stiff);

  // activation pass (store to float buffer for speed/memory)
  for(j=0;j<N;j++)
  {
    var jj = (j+1);

    var bj = sin(0.17*jj);
    bj *= 0.03;

    var u0 = 0 + *(G_u + 0);
    var inj = u0;
    inj *= 0.02;
    inj *= sin(0.07*jj + 0.19*t);

    var hj = 0 + *(G_stage + j);
    var in = hj + bj + inj;

    *(G_act + j) = (fvar)softsign(in);
  }

  // drift assembly per coordinate i
  for(i=0;i<N;i++)
  {
    var sum = 0;
    var ii = (i+1);

    // dense coupling term (nonlinear "neural" interaction)
    for(j=0;j<N;j++)
    {
      var jj = (j+1);

      var term1 = 0.013;
      term1 *= ii;
      term1 *= jj;

      var term2 = 0.11;
      term2 *= ii;

      var term3 = 0.07;
      term3 *= jj;

      var ang = term1 + term2 - term3;

      var w = sin(ang);

      var ww = w;
      ww *= scale;

      var aj = (var)(*(G_act + j));
      aj = 0 + aj;

      var prod = ww;
      prod *= aj;

      sum += prod;
    }

    // smooth forcing term + fast oscillation + exogenous mix
    var phi = 0.05;
    phi *= ii;

    var forcing = sin(t + phi);
    forcing *= amp;

    var fast = sin(20.0*t + 0.10*ii);
    fast *= 0.08;

    var u1 = 0 + *(G_u + 1);
    var u2 = 0 + *(G_u + 2);

    var exf = u1;
    exf *= 0.10;
    exf += 0.05*u2*sin(0.9*t);

    forcing += fast + exf;

    // damping and weak self-nonlinearity
    var hi = 0 + *(G_stage + i);

    var tmp = hi;
    tmp *= 1.8;
    var self = softsign(tmp);
    self *= 0.15;

    // stiffen every 8th coordinate to create multiscale dynamics
    var di = damp;
    if((i & 7) == 0)
      di *= stiff;

    var ghi = hi;
    ghi *= di;

    *(G_dh + i) = sum + ghi + forcing + self;
  }
}

// -------------------- Explicit RK step for drift only --------------------
// Given an explicit RK tableau (s, A, b, c), advance the ODE drift part:
//   yout = y + h * sum_i b_i * k_i
// where k_i = f(t + c_i*h, y + h * sum_{j<i} A_{i,j} k_j)
void rk_step_vec(int s, float* Atri, float* b, float* c,
                 var t, var* y, var h, var* yout)
{
  int i,j,k;

  // trivial fallback (should not happen in normal use)
  if(s < 1)
  {
    for(k=0;k<N;k++) *(yout + k) = *(y + k);
    return;
  }
  if(s > MAXS) s = MAXS;

  // clear stage derivative storage
  for(i=0;i<s;i++)
    for(k=0;k<N;k++)
      G_K[i*N + k] = (fvar)0;

  // stage loop
  for(i=0;i<s;i++)
  {
    // compute stage state: y_stage = y + h * sum_{j<i} A(i,j)*k_j
    for(k=0;k<N;k++)
    {
      var acc = 0 + *(y + k);

      for(j=0;j<i;j++)
      {
        var kij = (var)G_K[j*N + k];
        kij = 0 + kij;

        var aij = getA(Atri,i,j);

        var term = kij;
        term *= aij;
        term *= h;

        acc += term;
      }

      *(G_stage + k) = acc;
    }

    // evaluate drift at t + c_i*h
    var ci = (var)c[i];
    var dt = ci;
    dt *= h;

    f_theta_vec(t + dt);

    // store k_i = f(...)
    for(k=0;k<N;k++)
      G_K[i*N + k] = (fvar)(*(G_dh + k));
  }

  // combine stages into final yout
  for(k=0;k<N;k++)
  {
    var acc = 0 + *(y + k);

    for(i=0;i<s;i++)
    {
      var ki = (var)G_K[i*N + k];
      ki = 0 + ki;

      var bi = (var)b[i];

      var term = ki;
      term *= bi;
      term *= h;

      acc += term;
    }

    *(yout + k) = acc;
  }
}

// -------------------- Norm utilities --------------------
// RMS difference between two N-dim vectors
var err_norm(var* a, var* b)
{
  int k;
  var s = 0;
  for(k=0;k<N;k++)
  {
    var d = *(a + k) - *(b + k);
    s += d*d;
  }
  return sqrt(s / (var)N);
}

// RMS norm of a vector of length n
var rms_norm(var* a, int n)
{
  int k;
  var s = 0;
  for(k=0;k<n;k++)
  {
    var x = *(a + k);
    s += x*x;
  }
  return sqrt(s / (var)n);
}

// -------------------- Stage permutation -> equivalent tableau --------------------
// Given a method with (A,b,c) and a permutation p[] of stages,
// construct a new method (Aout,bout,cout) that corresponds to that permutation.
//
// Intuition:
//   - Reordering stages changes the evaluation sequence.
//   - If you correctly permute the Butcher coefficients, you can obtain an
//     equivalent method (same one-step map under exact arithmetic).
//
// This is the "safe refactor" path used for method B in main().
void permute_method_into(int s, float* A, float* b, float* c,
                         int* p,
                         int* sOut, float* Aout, float* bout, float* cout)
{
  int i,j;

  *sOut = s;
  if(*sOut < 0) *sOut = 0;
  if(*sOut > MAXS) *sOut = MAXS;

  for(i=0;i<TRI_A;i++) Aout[i] = 0;
  for(i=0;i<MAXS;i++)  { bout[i]=0; cout[i]=0; }

  // permute b and c
  for(i=0;i<*sOut;i++)
  {
    int pi = p[i];
    if(pi < 0) pi = 0;
    if(pi >= s) pi = s-1;
    bout[i] = b[pi];
    cout[i] = c[pi];
  }

  // permute A lower-triangular entries consistently
  for(i=0;i<*sOut;i++)
    for(j=0;j<i;j++)
    {
      int pi = p[i];
      int pj = p[j];
      if(pi < 0) pi = 0;
      if(pj < 0) pj = 0;
      if(pi >= s) pi = s-1;
      if(pj >= s) pj = s-1;
      setA(Aout, i, j, getA(A, pi, pj));
    }
}

// -------------------- Base 4-stage explicit RK with "swappable" middle stages --------------------
// This is a custom 4-stage explicit RK tableau (not classical RK4).
// The middle stages (1 and 2) can be swapped; the file then tests whether
// the swap is done correctly (B) or incorrectly (C).
void make_swappable_rk4(int* s, float* Atri, float* b, float* c)
{
  zeroTableau(s, Atri, b, c);
  *s = 4;

  c[0] = 0.0;
  c[1] = 0.30;
  c[2] = 0.80;
  c[3] = 1.00;

  setA(Atri, 1, 0, 0.30);
  setA(Atri, 2, 0, 0.80);

  setA(Atri, 3, 1, 0.50);
  setA(Atri, 3, 2, 0.50);

  b[0] = 0.10;
  b[1] = 0.20;
  b[2] = 0.30;
  b[3] = 0.40;
}

// -------------------- Diffusion increment (SDE part) --------------------
// After accepting a drift step, apply a stochastic increment:
//   y <- y + sigma(t,y) * sqrt(dt) * N(0,1)
// Here sigma depends on u(t), regime, and |y_k| (mild state-dependent scaling).
void apply_diffusion(var t, var dt, var* y)
{
  eval_u(t);

  var u0 = 0 + *(G_u + 0);
  var u2 = 0 + *(G_u + 2);

  // diffusion scale depends on inputs and regime
  var sigma = G_SigmaBase;
  sigma *= (1.0 + 0.6*abs(u0) + 0.3*abs(u2));
  if(G_Regime) sigma *= 1.25;

  var sdt = sqrt(dt);

  int k;
  for(k=0;k<N;k++)
  {
    var hk = 0 + *(y + k);

    // mild multiplicative factor increasing with |y_k|
    var ahk = abs(hk);
    ahk *= 0.15;
    var gmul = 1.0 + ahk;
    gmul = 0 + gmul;

    var xi = rng_norm();

    var inc = sigma;
    inc = inc * gmul;
    inc = inc * sdt;
    inc = inc * xi;

    *(y + k) += inc;
  }
}

// -------------------- Adaptive integrator for hybrid SDE/ODE --------------------
// Integrates from t0 to tEnd starting at y0.
//
// Core loop:
//   - Choose dt (bounded by DT_MIN and G_DtMax, clipped to end time)
//   - Compute drift step with RK: full step and two half steps
//   - Estimate error e = RMS(half - full)
//   - If accepted:
//       * commit y <- half
//       * update regime
//       * apply diffusion noise
//       * apply jumps
//       * produce observations
//       * check blow-up and steady-state termination
//       * update dt via PI-like scaling (here simple power law)
//     else:
//       * reject and halve dt
//
// Outputs:
//   yOut: final state
//   stepsOut / rejectsOut: accepted/rejected counts
//   dtMinOut: minimum dt observed during the run
//   stopReasonOut: why we terminated (see printed table in main)
void integrate_adaptive_sde(int s, float* A, float* b, float* c,
                            var t0, var* y0, var tEnd,
                            unsigned int seed,
                            var* yOut,
                            int* stepsOut, int* rejectsOut,
                            var* dtMinOut,
                            int* stopReasonOut)
{
  // seed the custom RNG and reset regime
  rng_seed(seed);
  G_Regime = 0;

  int k;
  var t = t0;
  var dt = DT_INIT;
  var dtMinSeen = 1e9;

  // initialize current state
  for(k=0;k<N;k++)
    *(G_ycur + k) = *(y0 + k);

  int steps = 0;
  int rejects = 0;
  int iters = 0;
  int dtMinHits = 0;
  int steadyCount = 0;

  int stopReason = 0;

  while(t < tEnd)
  {
    // occasionally check memory and relax constraints if needed
    if((iters & 255) == 0) shed_if_needed();
    iters++;

    // enforce dt bounds and do not step past tEnd
    if(dt < DT_MIN) dt = DT_MIN;
    if(dt > G_DtMax) dt = G_DtMax;
    if(t + dt > tEnd) dt = tEnd - t;

    // track smallest dt seen and how often we hit dt_min
    if(dt < dtMinSeen) dtMinSeen = dt;
    if(dt <= DT_MIN) dtMinHits++;

    // if we are stuck at dt_min too long, abort
    if(dtMinHits > G_DtMinHitMax)
    {
      stopReason = 4;
      break;
    }

    // drift integration: one full step
    rk_step_vec(s, A, b, c, t, G_ycur, dt, G_full);

    // drift integration: two half steps (step-doubling)
    var halfdt = dt;
    halfdt *= 0.5;

    rk_step_vec(s, A, b, c, t,        G_ycur, halfdt, G_mid);
    rk_step_vec(s, A, b, c, t+halfdt, G_mid,  halfdt, G_half);

    // error estimate from step-doubling
    var e = err_norm(G_half, G_full);

    // accept if error is within tolerance or we cannot reduce dt further
    if(e <= G_Tol || dt <= DT_MIN)
    {
      // commit state using the more accurate half-step result
      for(k=0;k<N;k++)
        *(G_ycur + k) = *(G_half + k);

      // hybrid / stochastic effects applied after acceptance
      eval_u(t);
      update_regime(t, dt);

      apply_diffusion(t, dt, G_ycur);
      apply_jump_if_needed(t, dt, G_ycur);

      // update observations at the end of this accepted step
      observe_from_hidden(t + dt, G_ycur);

      // blow-up guards (state or observations)
      var yr = rms_norm(G_ycur, N);
      var xr = rms_norm(G_xobs, NOBS);
      if(yr > G_StateLimitRMS || xr > G_ObsLimitRMS)
      {
        stopReason = 2;
        t += dt;
        steps++;
        break;
      }

      // steady-state detection using drift magnitude at new state
      for(k=0;k<N;k++) *(G_stage + k) = *(G_ycur + k);
      f_theta_vec(t + dt);
      var dr = rms_norm(G_dh, N);

      if(dr < G_SteadyEps) steadyCount++;
      else steadyCount = 0;

      // advance time and increment accepted step count
      t += dt;
      steps++;

      // terminate if we've stayed near-steady for long enough
      if(steadyCount >= G_SteadyNeed)
      {
        stopReason = 1;
        break;
      }

      // adapt dt upward/downward based on error (bounded multiplier)
      if(e < 1e-16)
      {
        dt *= 2.0;
      }
      else
      {
        var fac = pow(G_Tol / e, 0.2); // exponent ~ 1/(order+1)
        fac *= 0.9;                   // safety factor
        if(fac < 0.5) fac = 0.5;
        if(fac > 2.0) fac = 2.0;
        dt *= fac;
      }
    }
    else
    {
      // reject step and reduce dt
      rejects++;
      if(rejects > G_MaxRejects)
      {
        stopReason = 3;
        break;
      }
      dt *= 0.5;
    }
  }

  // return final state + run statistics
  for(k=0;k<N;k++)
    *(yOut + k) = *(G_ycur + k);

  *stepsOut = steps;
  *rejectsOut = rejects;
  *dtMinOut = dtMinSeen;
  *stopReasonOut = stopReason;
}

// -------------------- Entry point / experiment harness --------------------
// Builds three RK methods (reference, safe-permuted, buggy-permuted),
// integrates each with the same initial state and RNG seed, then reports:
//
//   - accepted/rejected steps
//   - minimum dt reached
//   - stop reason code
//   - RMS difference from reference output
//
// This lets you validate that a stage-refactor is mathematically equivalent (B)
// or detect a wrong refactor (C).
function main()
{
  // reset runtime knobs and budgets
  G_ShedTriggered = 0;
  G_Tol   = TOL;
  G_DtMax = DT_MAX;

  G_MaxRejects  = 2000000;
  G_DtMinHitMax = 500000;

  G_SteadyEps     = 1e-7;
  G_SteadyNeed    = 250;
  G_StateLimitRMS = 1e6;
  G_ObsLimitRMS   = 1e6;

  // reset stochastic/hybrid baselines
  G_SigmaBase       = 0.02;
  G_JumpLambdaBase  = 0.20;
  G_JumpMagBase     = 0.20;
  G_RegRate01       = 0.15;
  G_RegRate10       = 0.10;

  // A) reference RK tableau
  int   S = 0;
  float A[TRI_A];
  float b[MAXS];
  float c[MAXS];
  make_swappable_rk4(&S, A, b, c);

  // B) safe permutation: properly permute (A,b,c) into an equivalent method
  int   SB = 0;
  float AB[TRI_A];
  float bB[MAXS];
  float cB[MAXS];

  int p[MAXS];
  p[0]=0; p[1]=2; p[2]=1; p[3]=3;
  permute_method_into(S, A, b, c, p, &SB, AB, bB, cB);

  // C) buggy permutation: permute b/c but keep A unchanged (not equivalent in general)
  int   SW = S;
  float AW[TRI_A];
  float bW[MAXS];
  float cW[MAXS];

  int i;
  for(i=0;i<TRI_A;i++) AW[i] = A[i];
  for(i=0;i<MAXS;i++)  { bW[i]=0; cW[i]=0; }
  for(i=0;i<SW;i++)
  {
    int pi = p[i];
    bW[i] = b[pi];
    cW[i] = c[pi];
  }

  // initial condition for the hidden state y(0)
  int k;
  for(k=0;k<N;k++)
    *(Y0 + k) = 0.2*sin(0.31*(k+1)) + 0.15*cos(0.17*(k+1))
             + 0.05*sin(0.07*(k+1)*(k+1));

  var t0 = 0.0;

  // ----------- Seed generation for the custom RNG -----------
  // We build a seed from current wall-clock time with millisecond mixing so that
  // multiple launches close together still get different seeds.
  var now = wdate(NOW);                      // days since epoch with fractional day
  unsigned int MySeed = (unsigned int)utm(now); // seconds since 1970

  // mix in milliseconds extracted from fractional day
  var frac = now - floor(now);
  var ms = frac * 86400000.0; // 24*60*60*1000
  MySeed = 1664525u * MySeed + (unsigned int)ms + 1013904223u;

  if(MySeed == 0) MySeed = 1;
  printf("\nMySeed=%u", MySeed);
  // ---------------------------------------------------------

  // run stats for the three methods
  int stepsRef=0,  rejRef=0;   var dtMinRef=0;  int stopRef=0;
  int stepsSafe=0, rejSafe=0;  var dtMinSafe=0; int stopSafe=0;
  int stepsBug=0,  rejBug=0;   var dtMinBug=0;  int stopBug=0;

  // integrate using identical initial state + identical RNG seed
  integrate_adaptive_sde(S,  A,  b,  c,  t0, Y0, T_END, MySeed, Yref,
                         &stepsRef,  &rejRef,  &dtMinRef,  &stopRef);

  integrate_adaptive_sde(SB, AB, bB, cB, t0, Y0, T_END, MySeed, Ysafe,
                         &stepsSafe, &rejSafe, &dtMinSafe, &stopSafe);

  integrate_adaptive_sde(SW, AW, bW, cW, t0, Y0, T_END, MySeed, Ybug,
                         &stepsBug,  &rejBug,  &dtMinBug,  &stopBug);

  // compare final states to reference
  var diffSafe = err_norm(Ysafe, Yref);
  var diffBug  = err_norm(Ybug,  Yref);

  // summary prints
  printf("\nNeural SDE / Hybrid ODE integration (N=%d) to T=%g  baseTol=%g", N, T_END, TOL);
  printf("\nAdds: u(t), theta(t), diffusion noise, regime switching, jumps, partial observation, multiscale/stiff modes.");
  printf("\nRuntime knobs end-state: G_Tol=%g  G_DtMax=%g  shed=%d  mem=%d MB",
         G_Tol, G_DtMax, G_ShedTriggered, memMB());

  printf("\n\nA) Reference tableau:");
  printf("\n   accepted=%d rejected=%d minDt=%g stop=%d", stepsRef, rejRef, dtMinRef, stopRef);

  printf("\nB) Theorem-382A safe stage-refactor:");
  printf("\n   accepted=%d rejected=%d minDt=%g stop=%d", stepsSafe, rejSafe, dtMinSafe, stopSafe);
  printf("\n   diff to reference (RMS) = %.3e", diffSafe);

  printf("\nC) Buggy stage-refactor:");
  printf("\n   accepted=%d rejected=%d minDt=%g stop=%d", stepsBug, rejBug, dtMinBug, stopBug);
  printf("\n   diff to reference (RMS) = %.3e", diffBug);

  printf("\n\nStop codes: 0=tEnd, 1=steady-state, 2=event(blow-up), 3=rejectBudget, 4=dtMinBudget\n");

  quit("done");
}