YuleYule
API

API Overview

Authentication, API surfaces, and design philosophy

Authentication

All endpoints require a Bearer token in the Authorization header:

Authorization: Bearer yule_b49913e2c05162951af4f87d62c2c9a6555eb91299c7fdcc

The token is printed to stderr when the server starts. You can also provide your own token with --token.

Requests without a valid token receive 401 Unauthorized.

Two API Surfaces

Yule exposes two sets of endpoints:

Yule-Native (/yule/*)

Every response includes integrity proof: the model's Merkle root, verification status, and sandbox state. This is Yule's primary API. No other local inference server does this.

MethodPathDescription
GET/yule/healthServer status, uptime, model info
GET/yule/modelFull model metadata and Merkle root
POST/yule/chatChat with integrity proof and timing
POST/yule/tokenizeTokenize text, return token IDs

OpenAI-Compatible (/v1/*)

Standard OpenAI format for drop-in compatibility with existing tools, SDKs, and UIs that speak the OpenAI protocol.

MethodPathDescription
POST/v1/chat/completionsChat completions (streaming + non-streaming)
GET/v1/modelsList loaded models

Why Both?

OpenAI-compatible endpoints exist so you can point any tool that supports a custom base URL at Yule without code changes. But they lose what makes Yule different: the integrity proof.

If you're building something that talks to Yule directly, use the Yule-native endpoints. You get Merkle roots, sandbox status, and timing data in every response.

On this page