Origin Docs

Create a chat completion

POST /v1/chat/completions
Authorization: Bearer <token>

API key as a bearer token: Authorization: Bearer sk-ollm-<public_id>-<secret>. Required on the dev gateway; the prod gateway is open and ignores this header.

In: header
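As a minimal sketch of the header format described above, the helper below assembles request headers in Python. The function name `build_headers` and its parameters are hypothetical; only the `Bearer sk-ollm-<public_id>-<secret>` layout comes from this page.

```python
def build_headers(public_id=None, secret=None):
    """Return HTTP headers for a chat completion request.

    Hypothetical helper: the key layout sk-ollm-<public_id>-<secret> is taken
    from the text above; everything else is illustrative.
    """
    headers = {"Content-Type": "application/json"}
    # Only attach Authorization when both key parts are present; per the
    # text above, the prod gateway ignores the header anyway.
    if public_id and secret:
        headers["Authorization"] = f"Bearer sk-ollm-{public_id}-{secret}"
    return headers
```

Calling `build_headers()` with no arguments yields only the `Content-Type` header, which matches the unauthenticated request shown in the curl example on this page.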

model string

Model or function selector. A plain name routes to that model. ollm::function_name::<name> routes to a configured function; ollm::model_name::<name> forces an explicit model.
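The three selector forms above can be built and told apart with a few string helpers. This is a sketch only; the function names and the returned labels (`"function"`, `"explicit_model"`, `"model"`) are hypothetical and are not part of the API.

```python
FUNCTION_PREFIX = "ollm::function_name::"
MODEL_PREFIX = "ollm::model_name::"


def function_selector(name):
    """Build a selector that routes to a configured function."""
    return FUNCTION_PREFIX + name


def model_selector(name):
    """Build a selector that forces an explicit model."""
    return MODEL_PREFIX + name


def parse_selector(model):
    """Classify a `model` value as (kind, name). Labels are illustrative."""
    if model.startswith(FUNCTION_PREFIX):
        return ("function", model[len(FUNCTION_PREFIX):])
    if model.startswith(MODEL_PREFIX):
        return ("explicit_model", model[len(MODEL_PREFIX):])
    # Anything without a recognized prefix is treated as a plain model name.
    return ("model", model)
```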

messages

The conversation so far.

Items: 1 <= items
temperature? number

Sampling temperature.

Format: float
top_p? number

Nucleus sampling probability mass.

Format: float
max_tokens? integer

Maximum tokens to generate. If both this and max_completion_tokens are set, the smaller is used.

Format: int32
Range: 0 <= value
max_completion_tokens? integer

Maximum completion tokens. If both this and max_tokens are set, the smaller is used.

Format: int32
Range: 0 <= value
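The "smaller is used" rule for max_tokens and max_completion_tokens can be mirrored on the client to predict the effective limit. The sketch below only restates the rule from this page; the function name is hypothetical, and how the gateway behaves when neither field is set is not documented here, so the helper returns `None` in that case.

```python
from typing import Optional


def effective_token_limit(max_tokens: Optional[int] = None,
                          max_completion_tokens: Optional[int] = None) -> Optional[int]:
    """Return the limit implied by the two fields, or None if neither is set."""
    limits = [v for v in (max_tokens, max_completion_tokens) if v is not None]
    for value in limits:
        # Both fields are documented with the range 0 <= value.
        if value < 0:
            raise ValueError("token limits must be >= 0")
    # When both are present, the smaller one applies.
    return min(limits) if limits else None
```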
n? integer

Number of completions. Must be 1 (or omitted); other values are rejected with 400.

Default: 1
Format: int32
stop? array<string>

Up to N stop sequences. Must be an array; a bare string is not accepted.
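A client can catch several of the constraints listed so far before sending a request: messages needs at least one item, n must be 1 or omitted, and stop must be an array of strings. The following is a rough pre-flight check under those stated rules; `check_request` is a hypothetical name, and the gateway's own validation may differ in detail.

```python
def check_request(payload):
    """Return a list of human-readable problems found in a request payload."""
    problems = []

    messages = payload.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        problems.append("messages must be an array with at least one item")

    # n defaults to 1 when omitted; any other value is documented as a 400.
    if payload.get("n", 1) != 1:
        problems.append("n must be 1 or omitted")

    stop = payload.get("stop")
    if stop is not None:
        is_string_list = isinstance(stop, list) and all(isinstance(s, str) for s in stop)
        if not is_string_list:
            problems.append("stop must be an array of strings")

    return problems
```

For example, a payload with `"n": 2` and `"stop": "END"` yields two problems, while wrapping the stop value as `["END"]` and dropping `n` yields none.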

presence_penalty? number
Format: float
frequency_penalty? number
Format: float
seed? integer

Deterministic sampling seed.

Format: int32
reasoning_effort? string

Reasoning effort hint (free-form; provider-dependent).

verbosity? string

Verbosity hint (free-form; provider-dependent).

service_tier? string
Default: "auto"
Value in: "auto" | "default" | "priority" | "flex"
stream? boolean

Stream partial deltas as SSE.

Default: false
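This page does not show the shape of a streamed chunk. The sketch below therefore assumes the common OpenAI-style convention: each SSE event carries a `data:` line with a JSON chunk whose text lives at `choices[].delta.content`, and the stream ends with `data: [DONE]`. Treat both the chunk shape and the sentinel as assumptions to verify against the gateway.

```python
import json


def iter_sse_chunks(lines):
    """Yield decoded JSON chunks from an iterable of SSE text lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


def collect_text(lines):
    """Concatenate the text deltas of a stream (assumed chunk shape)."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Working on an iterable of lines keeps the parser independent of any HTTP client; any library that can iterate over response lines can feed it.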
stream_options?
tools?

Tools the model may call.

tool_choice? string |

Controls tool selection.

parallel_tool_calls? boolean

Allow multiple tool calls in one turn.

response_format?||

Output format constraint.

ollm::tags?

Tags written to spend logs. source=playground normalizes to the playground tag; otherwise passed through verbatim. Falls back to the User-Agent header when empty.

ollm::cache_options?

Inference cache control.

ollm::dryrun? boolean

Skip billing and persistence.

ollm::variant_name? string

Pin a specific provider variant.

ollm::episode_id? string

Group related inferences into an episode.

Format: uuid
ollm::deny_unknown_fields? boolean

When true, any unknown top-level field returns 400.

Default: false
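Before enabling ollm::deny_unknown_fields, it can help to find misspelled or stray keys on the client side. The sketch below compares a payload against the top-level fields listed on this page. `KNOWN_FIELDS` and `unknown_fields` are hypothetical names, and whether the gateway's own field list matches this one exactly is an assumption.

```python
# Top-level request fields as listed on this page.
KNOWN_FIELDS = frozenset({
    "model", "messages", "temperature", "top_p", "max_tokens",
    "max_completion_tokens", "n", "stop", "presence_penalty",
    "frequency_penalty", "seed", "reasoning_effort", "verbosity",
    "service_tier", "stream", "stream_options", "tools", "tool_choice",
    "parallel_tool_calls", "response_format", "ollm::tags",
    "ollm::cache_options", "ollm::dryrun", "ollm::variant_name",
    "ollm::episode_id", "ollm::deny_unknown_fields",
    "ollm::include_raw_usage", "ollm::include_raw_response",
})


def unknown_fields(payload):
    """Return the sorted top-level keys that are not in KNOWN_FIELDS."""
    return sorted(key for key in payload if key not in KNOWN_FIELDS)
```

A typo such as `temprature` would be reported here, and, per the description above, would produce a 400 once ollm::deny_unknown_fields is true.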
ollm::include_raw_usage? boolean

Echo raw provider usage in the response. When streaming, requires stream_options.include_usage.

Default: false
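Because raw usage on a streamed request depends on stream_options.include_usage, a small helper can keep the two settings in step. This is a sketch; `with_raw_usage` is a hypothetical name, and it assumes stream_options is a JSON object with a boolean `include_usage` key, as the field path above suggests.

```python
def with_raw_usage(payload):
    """Return a copy of the payload that requests raw provider usage."""
    out = dict(payload)  # shallow copy so the caller's dict is not modified
    out["ollm::include_raw_usage"] = True
    if out.get("stream"):
        # Streaming requests also need stream_options.include_usage.
        options = dict(out.get("stream_options") or {})
        options["include_usage"] = True
        out["stream_options"] = options
    return out
```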
ollm::include_raw_response? boolean

Echo the raw upstream provider response.

Default: false
[key: string]? any

Response Body

application/json

curl -X POST "https://api.gateway.orgn.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "near_qwen3_30b",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ]
  }'
{
  "id": "string",
  "object": "chat.completion",
  "created": 0,
  "model": "string",
  "episode_id": "string",
  "system_fingerprint": "string",
  "service_tier": "string",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "string",
        "tool_calls": [
          {
            "id": "string",
            "type": "function",
            "function": {
              "name": "string",
              "arguments": "string"
            }
          }
        ]
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
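A response with the shape shown above can be reduced to the parts most callers need. The sketch below reads the first choice and its tool calls. It assumes that `function.arguments` is a JSON-encoded string, which the sample does not confirm, so it falls back to the raw value when decoding fails. `summarize_completion` is a hypothetical name.

```python
import json


def summarize_completion(body):
    """Extract text, finish reason, tool calls, and token total from a response."""
    choice = body["choices"][0]
    message = choice["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        raw_args = function["arguments"]
        try:
            args = json.loads(raw_args)  # assumed: arguments is JSON text
        except (TypeError, ValueError):
            args = raw_args  # keep the raw value if it is not valid JSON
        calls.append({"id": call["id"], "name": function["name"], "arguments": args})
    return {
        "content": message.get("content"),
        "finish_reason": choice.get("finish_reason"),
        "tool_calls": calls,
        "total_tokens": (body.get("usage") or {}).get("total_tokens"),
    }
```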
{
  "error": {
    "message": "string",
    "type": "string",
    "code": "string",
    "param": "string"
  }
}
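The error envelope above maps naturally onto an exception type. The following is a minimal sketch; `GatewayError` and `error_from_body` are hypothetical names, and the HTTP status is passed in by the caller because it is not part of the JSON body.

```python
class GatewayError(Exception):
    """Error built from the `error` object of a failed response."""

    def __init__(self, status, message, type_=None, code=None, param=None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.code = code
        self.param = param


def error_from_body(status, body):
    """Build a GatewayError from an HTTP status and a decoded JSON body."""
    err = body.get("error") or {}
    return GatewayError(
        status,
        err.get("message", "unknown error"),
        err.get("type"),
        err.get("code"),
        err.get("param"),
    )
```

Keeping `param` on the exception makes it easier to point a user at the offending field, for example `n` when a value other than 1 is rejected with 400.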