API Overview

Authentication

All endpoints require a Bearer token in the Authorization header:

Authorization: Bearer yule_b49913e2c05162951af4f87d62c2c9a6555eb91299c7fdcc

The token is printed to stderr when the server starts. You can also provide your own token with --token.

Requests without a valid token receive 401 Unauthorized.

Two API Surfaces

Yule exposes two sets of endpoints:

Yule-Native (`/yule/*`)

Every response includes integrity proof: the model's Merkle root, verification status, and sandbox state. This is Yule's primary API. No other local inference server does this.

Method	Path	Description
GET	`/yule/health`	Server status, uptime, model info
GET	`/yule/model`	Full model metadata and Merkle root
POST	`/yule/chat`	Chat with integrity proof and timing
POST	`/yule/tokenize`	Tokenize text, return token IDs

OpenAI-Compatible (`/v1/*`)

Standard OpenAI format for drop-in compatibility with existing tools, SDKs, and UIs that speak the OpenAI protocol.

Method	Path	Description
POST	`/v1/chat/completions`	Chat completions (streaming + non-streaming)
GET	`/v1/models`	List loaded models

Why Both?

OpenAI-compatible endpoints exist so you can point any tool that supports a custom base URL at Yule without code changes. But they lose what makes Yule different: the integrity proof.

If you're building something that talks to Yule directly, use the Yule-native endpoints. You get Merkle roots, sandbox status, and timing data in every response.

API Overview

Authentication

Two API Surfaces

Yule-Native (/yule/*)

OpenAI-Compatible (/v1/*)

Why Both?

On this page

Yule-Native (`/yule/*`)

OpenAI-Compatible (`/v1/*`)