Security

Sandbox, Merkle verification, and authentication model

Process Sandbox

When yule serve starts (without --no-sandbox), the server process is placed in a Windows Job Object with:

  • Memory limit: 32 GB by default, to stop runaway allocations
  • No child spawning: ActiveProcessLimit = 1, so the process cannot create child processes
  • Kill-on-close: when the last handle to the Job Object is closed (crash, parent exit), the process is terminated
  • UI restrictions: clipboard reads and writes, desktop switching, display settings, global atoms, user handles, and system parameters are all blocked

The sandbox is applied to the current process before the model is loaded, so even the parsing and weight loading phases run inside the sandbox.
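
To make the list of restrictions concrete, the sketch below writes them out as the Win32 flag values they most plausibly correspond to. The numeric constants are the JOB_OBJECT_LIMIT_* and JOB_OBJECT_UILIMIT_* values from the Windows SDK. Everything else is an assumption for illustration: the class and variable names are hypothetical, the memory cap is assumed to be a per-process limit measured in binary gigabytes, and Yule's actual implementation may choose its flags differently.

```python
from enum import IntFlag


class JobLimit(IntFlag):
    """Subset of the JOB_OBJECT_LIMIT_* flags from the Windows SDK."""
    ACTIVE_PROCESS = 0x0008      # enforce ActiveProcessLimit
    PROCESS_MEMORY = 0x0100      # enforce a per-process committed-memory cap
    KILL_ON_JOB_CLOSE = 0x2000   # terminate processes when the last job handle closes


class UiLimit(IntFlag):
    """Subset of the JOB_OBJECT_UILIMIT_* flags from the Windows SDK."""
    HANDLES = 0x01               # no USER handles owned by processes outside the job
    READCLIPBOARD = 0x02
    WRITECLIPBOARD = 0x04
    SYSTEMPARAMETERS = 0x08
    DISPLAYSETTINGS = 0x10
    GLOBALATOMS = 0x20
    DESKTOP = 0x40               # no desktop creation or switching


# Assumption: "32GB" means 32 GiB and is applied per process.
MEMORY_LIMIT_BYTES = 32 * 1024**3
ACTIVE_PROCESS_LIMIT = 1

LIMIT_FLAGS = (
    JobLimit.PROCESS_MEMORY
    | JobLimit.ACTIVE_PROCESS
    | JobLimit.KILL_ON_JOB_CLOSE
)

UI_FLAGS = (
    UiLimit.HANDLES
    | UiLimit.READCLIPBOARD
    | UiLimit.WRITECLIPBOARD
    | UiLimit.SYSTEMPARAMETERS
    | UiLimit.DISPLAYSETTINGS
    | UiLimit.GLOBALATOMS
    | UiLimit.DESKTOP
)

if __name__ == "__main__":
    print(f"limit flags: {int(LIMIT_FLAGS):#x}")
    print(f"ui flags:    {int(UI_FLAGS):#x}")
```

On Windows, values like these would be passed to SetInformationJobObject before the process is assigned to the job. The sketch only models the configuration and makes no system calls.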

Future Work

The current sandbox is in-process (Phase A). Phase B will implement a broker-target architecture:

  • Broker (main process) — parses CLI args, validates model, spawns target
  • Target (child process) — receives model file descriptor via IPC, runs inference, returns tokens
  • Full isolation — seccomp-bpf on Linux, AppContainer on Windows, seatbelt on macOS

Merkle Verification

At model load time, Yule builds a blake3 Merkle tree over all tensor data:

  1. The tensor payload (everything after the GGUF header) is split into 1MB chunks
  2. Each chunk is hashed with blake3
  3. Leaf hashes are combined into a binary Merkle tree
  4. The 256-bit root hash is stored in memory
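
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Yule's implementation. blake3 is not in Python's standard library, so the sketch substitutes 256-bit BLAKE2b from hashlib, which means its roots will not match the ones Yule prints. The handling of an unpaired node (promoted unchanged to the next level) and of an empty payload are assumptions, since the text does not specify them, and the function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MB leaves, as in step 1


def _hash(data: bytes) -> bytes:
    # Stand-in for blake3: 256-bit BLAKE2b from the standard library.
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root(payload: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return the 32-byte Merkle root over fixed-size chunks of payload."""
    # Steps 1 and 2: split into chunks and hash each chunk into a leaf.
    level = [
        _hash(payload[i:i + chunk_size])
        for i in range(0, len(payload), chunk_size)
    ]
    if not level:
        # Assumption: an empty payload hashes to a single empty leaf.
        level = [_hash(b"")]

    # Step 3: combine pairs of hashes until one root remains.
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(_hash(level[i] + level[i + 1]))
            else:
                # Assumption: an unpaired node is promoted unchanged.
                parents.append(level[i])
        level = parents

    # Step 4: the single remaining hash is the root.
    return level[0]


if __name__ == "__main__":
    print(merkle_root(b"example tensor payload").hex())
```

Because every leaf feeds into the root, changing a single byte in any chunk changes the root.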

This root hash appears in every /yule/chat response under integrity.model_merkle_root, and the /yule/model endpoint reports it as merkle_root. You can check that it matches the hash from yule verify:

# on disk
yule verify ./model.gguf
# → merkle root: ffc7e1fd6016a6f9...

# from the API
curl -H "Authorization: Bearer $TOKEN" http://localhost:11434/yule/model
# → "merkle_root": "ffc7e1fd6016a6f9..."

If someone swaps a tensor in the model file, the Merkle root changes. If the API returns a different root from the one you verified, the server has loaded a file that differs from the one you checked, whether through tampering, corruption, or substitution.
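
A client can automate this comparison on each response. The sketch below assumes the /yule/chat response body is JSON with a nested integrity object holding model_merkle_root as a hex string, which is what the field path above suggests. The function names are hypothetical, and the expected root must be the full value printed by yule verify.

```python
import json


def reported_root(chat_response_body: str) -> str:
    """Extract integrity.model_merkle_root from a /yule/chat response body."""
    doc = json.loads(chat_response_body)
    return doc["integrity"]["model_merkle_root"]


def root_matches(chat_response_body: str, verified_root_hex: str) -> bool:
    """Compare the API-reported root with the root from `yule verify`."""
    # Hex is case-insensitive, so normalise both sides before comparing.
    reported = reported_root(chat_response_body).strip().lower()
    expected = verified_root_hex.strip().lower()
    return reported == expected
```

A client that gets False here should stop trusting the responses until the mismatch is explained.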

Authentication

The API uses blake3-derived capability tokens:

  1. On startup, 32 bytes of OS entropy are collected via getrandom and used as the master secret
  2. Token derivation: blake3(master_secret || counter || timestamp), truncated to 24 bytes, hex-encoded with yule_ prefix
  3. Only the blake3 hash of the token is stored; the server never keeps plaintext tokens in memory after generation
  4. Verification: hash the provided token, compare against stored hashes

Tokens look like: yule_b49913e2c05162951af4f87d62c2c9a6555eb91299c7fdcc

You can also provide your own token with --token, in which case its hash is stored the same way.
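
The derivation and verification steps above can be sketched as follows. This is an illustration under stated assumptions, not Yule's code. 256-bit BLAKE2b from hashlib stands in for blake3, secrets.token_bytes stands in for getrandom, and the 8-byte little-endian encoding of the counter and timestamp is a guess, since the text does not give the byte layout. TokenStore and its methods are hypothetical names.

```python
import hashlib
import hmac
import secrets
import time

PREFIX = "yule_"


def _hash(data: bytes) -> bytes:
    # Stand-in for blake3: 256-bit BLAKE2b from the standard library.
    return hashlib.blake2b(data, digest_size=32).digest()


class TokenStore:
    """Issues capability tokens and keeps only their hashes."""

    def __init__(self) -> None:
        # Step 1: 32 bytes of OS entropy as the master secret.
        self._master_secret = secrets.token_bytes(32)
        self._counter = 0
        self._hashes = set()

    def issue(self) -> str:
        # Step 2: hash(master_secret || counter || timestamp), truncated to
        # 24 bytes and hex-encoded. The integer encoding is an assumption.
        self._counter += 1
        material = (
            self._master_secret
            + self._counter.to_bytes(8, "little")
            + time.time_ns().to_bytes(8, "little")
        )
        token = PREFIX + _hash(material)[:24].hex()
        self.register(token)
        return token

    def register(self, token: str) -> None:
        # Step 3: store only the hash, never the plaintext token.
        # A user-supplied token (as with --token) goes through the same path.
        self._hashes.add(_hash(token.encode()))

    def verify(self, token: str) -> bool:
        # Step 4: hash the presented token and compare against stored hashes.
        candidate = _hash(token.encode())
        return any(hmac.compare_digest(candidate, h) for h in self._hashes)
```

The comparison uses hmac.compare_digest so that it runs in constant time per stored hash. That is a design choice of this sketch, and the text does not say how Yule compares hashes.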
