Yule Endpoints
Native Yule API with integrity proof in every response
GET /yule/health
Server health check.
```shell
curl -H "Authorization: Bearer $TOKEN" http://localhost:11434/yule/health
```

Response

```json
{
  "status": "healthy",
  "version": "0.1.0",
  "uptime_seconds": 42,
  "model": "tinyllama",
  "architecture": "Llama",
  "sandbox": true
}
```

GET /yule/model
Full model metadata.
```shell
curl -H "Authorization: Bearer $TOKEN" http://localhost:11434/yule/model
```

Response

```json
{
  "name": "tinyllama",
  "architecture": "Llama",
  "parameters": "1.1B",
  "context_length": 2048,
  "embedding_dim": 2048,
  "layers": 22,
  "vocab_size": 32000,
  "tensor_count": 201,
  "merkle_root": "ffc7e1fd6016a6f9ba2ca390a43681453a46ec6054f431aeb6244487932b0e65"
}
```

POST /yule/chat
Chat completion with integrity proof.
Request
```json
{
  "messages": [
    { "role": "system", "content": "You are helpful." },
    { "role": "user", "content": "Hello" }
  ],
  "max_tokens": 100,
  "temperature": 0.7,
  "top_p": 0.9,
  "stream": false
}
```

All fields except `messages` are optional. Defaults: `max_tokens: 512`, `temperature: 0.7`, `top_p: 0.9`, `stream: false`.
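Because every field except `messages` has a documented default, a client only needs to send what it overrides. A minimal sketch of a request builder that applies those defaults (the helper is illustrative, not part of any official Yule client):

```python
def build_chat_request(messages, **overrides):
    """Return a /yule/chat request body with the documented defaults applied.

    Any keyword argument (max_tokens, temperature, top_p, stream)
    overrides the corresponding default.
    """
    body = {
        "messages": messages,
        "max_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.9,
        "stream": False,
    }
    body.update(overrides)
    return body

# Override only max_tokens; the other fields keep their defaults.
req = build_chat_request(
    [{"role": "user", "content": "Hello"}],
    max_tokens=100,
)
```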
Response
```json
{
  "id": "yule-18947c080149320c",
  "text": "Hello! How can I help you today?",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 8,
    "total_tokens": 23
  },
  "integrity": {
    "model_merkle_root": "ffc7e1fd6016a6f9ba2ca390a43681453a46ec6054f431aeb6244487932b0e65",
    "model_verified": true,
    "sandbox_active": true
  },
  "timing": {
    "prefill_ms": 1200.5,
    "decode_ms": 2400.3,
    "tokens_per_second": 3.33
  }
}
```

The `integrity` and `timing` fields are what distinguish this endpoint from the OpenAI-compatible one. The Merkle root lets you verify that the loaded model matches the one you verified with `yule verify`.
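A client can pin the expected Merkle root (obtained out of band, e.g. from `yule verify` or GET /yule/model) and reject any response that does not prove the pinned, verified model. A minimal sketch; the helper and its policy of also requiring the sandbox are illustrative choices, not a Yule library API:

```python
# Merkle root of the model you verified out of band (sample value from above).
EXPECTED_ROOT = "ffc7e1fd6016a6f9ba2ca390a43681453a46ec6054f431aeb6244487932b0e65"

def check_integrity(response, expected_root=EXPECTED_ROOT):
    """Accept a /yule/chat response only if its integrity proof matches.

    Requires model_verified, an active sandbox (a policy choice), and an
    exact match on the pinned Merkle root.
    """
    integrity = response.get("integrity", {})
    return (
        integrity.get("model_verified") is True
        and integrity.get("sandbox_active") is True
        and integrity.get("model_merkle_root") == expected_root
    )
```

A caller would parse the JSON response body and drop (or alert on) any reply for which `check_integrity` returns `False`.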
Streaming
Set "stream": true to get SSE events. See Streaming.
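SSE bodies arrive as `data:` lines separated by blank lines. A sketch of a parser for a fully buffered stream, assuming one JSON object per `data:` line (the usual shape for token-streaming endpoints); the exact Yule event schema is on the Streaming page, so treat any field names as placeholders:

```python
import json

def parse_sse_events(raw):
    """Yield the JSON payload of each `data:` line in an SSE response body.

    Assumes every event carries a single JSON object on one `data:` line;
    consult the Streaming page for the actual Yule event fields.
    """
    for line in raw.splitlines():
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])
```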
POST /yule/tokenize
Tokenize text without running inference.
Request
```json
{
  "text": "Hello world"
}
```

Response

```json
{
  "tokens": [15043, 3186],
  "count": 2
}
```
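One common use of this endpoint is budget checking: tokenize the prompt first, then confirm that prompt plus requested completion fits the model's context (2048 tokens for tinyllama, per GET /yule/model). The helper below is an illustrative sketch, not a Yule API:

```python
def fits_in_context(prompt_tokens, max_tokens, context_length=2048):
    """True if the prompt plus the requested completion fits the context.

    `context_length` defaults to the tinyllama value reported by
    GET /yule/model; query that endpoint for other models.
    """
    return prompt_tokens + max_tokens <= context_length

# With the sample above: "Hello world" tokenizes to 2 tokens.
fits_in_context(2, 100)     # 102 <= 2048 -> True
fits_in_context(2000, 100)  # 2100 > 2048 -> False
```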