Language Models

Text generation and chat models on ORGN Gateway — TEE and ZDR execution types and the AI SDK chatModel() method.

Language models handle text generation and chat: single-prompt completions, multi-turn conversations, reasoning, and tool use. They are the most common model type on Gateway.

When to Use

Use a language model when you need to generate or transform text:

Chat assistants and multi-turn conversations
Single-prompt generation, summarization, and rewriting
Reasoning tasks (use a reasoning-capable model with reasoningEffort)
Code generation and analysis
Tool calling and structured output

For image input, see Vision. For turning speech into text, see Audio.

AI SDK Method

Language models are accessed with chatModel() and used with the AI SDK's generateText and streamText:

language-model.ts

import { createOLLM } from '@orgn/gateway';
import { generateText } from 'ai';

const ollm = createOLLM({ apiKey: process.env.OLLM_API_KEY });

const { text } = await generateText({
  model: ollm.chatModel('near_glm_5_1'),
  prompt: 'What is ORGN Gateway?',
});

For streaming, system messages, multi-turn conversations, and reasoning options, see the Vercel AI SDK integration.
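
As a rough illustration of the multi-turn case, the sketch below keeps a conversation as an array of role/content messages. The ChatMessage type and the appendMessage helper are hypothetical names introduced here, and the commented-out call assumes generateText accepts a messages array as in the AI SDK; the integration page remains the reference for supported options.

```typescript
// Text-only subset of the message shape used by the AI SDK's `messages` option.
type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: returns a new array with one more message appended,
// leaving the previous conversation array untouched.
function appendMessage(
  conversation: readonly ChatMessage[],
  role: Role,
  content: string,
): ChatMessage[] {
  return [...conversation, { role, content }];
}

let conversation: ChatMessage[] = [];
conversation = appendMessage(conversation, 'user', 'What is a TEE?');
conversation = appendMessage(conversation, 'assistant', 'A Trusted Execution Environment is ...');
conversation = appendMessage(conversation, 'user', 'How does attestation relate to it?');

// Pass the accumulated messages instead of `prompt` (not executed here):
// const { text } = await generateText({
//   model: ollm.chatModel('near_glm_5_1'),
//   messages: conversation,
// });
```

After each reply, append the returned text as an assistant message so the next call sees the full exchange.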

The legacy /v1/completions endpoint is not supported. Every completion task can be expressed as a chat call with chatModel().
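
One way to migrate a legacy completion request is to map its fields onto chat-call options. The sketch below is an assumption-laden illustration, not part of the Gateway SDK: LegacyCompletionRequest, ChatCallOptions, and completionToChatOptions are hypothetical names, the legacy fields follow the OpenAI-style completions body, and the output uses the AI SDK's prompt, temperature, and stopSequences settings.

```typescript
// Subset of an OpenAI-style /v1/completions request body.
interface LegacyCompletionRequest {
  prompt: string;
  temperature?: number;
  stop?: string | string[];
}

// Subset of the options passed to generateText (assumed AI SDK setting names).
interface ChatCallOptions {
  prompt: string;
  temperature?: number;
  stopSequences?: string[];
}

// Hypothetical helper: converts a legacy completion body into chat-call options.
// Only fields that were set on the input are copied to the output.
function completionToChatOptions(req: LegacyCompletionRequest): ChatCallOptions {
  const options: ChatCallOptions = { prompt: req.prompt };
  if (req.temperature !== undefined) {
    options.temperature = req.temperature;
  }
  if (req.stop !== undefined) {
    // The legacy API allows a single string or a list; normalize to a list.
    options.stopSequences = Array.isArray(req.stop) ? req.stop : [req.stop];
  }
  return options;
}

// Usage with the client from the example above (not executed here):
// const { text } = await generateText({
//   model: ollm.chatModel('near_glm_5_1'),
//   ...completionToChatOptions({ prompt: 'Summarize: ...', stop: '\n\n' }),
// });
```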

TEE Catalog

These language models run in Trusted Execution Environments: on NEAR and Phala infrastructure with Intel TDX + NVIDIA H100 confidential compute, or on Tinfoil infrastructure with AMD SEV-SNP + NVIDIA confidential compute. Every request produces a cryptographic attestation receipt.

Model	Provider	Infrastructure	Context
DeepSeek V3.1	DeepSeek	near	128K
DeepSeek V3.1	DeepSeek	phala	164K
GLM 4.7	ZAI	near	205K
GLM 4.7	ZAI	phala	203K
GLM 4.7 Flash	ZAI	phala	203K
GLM 5	ZAI	near	203K
GLM 5.1	ZAI	near	203K
Kimi K2.5	Moonshot	phala	262K
GPT-OSS 120B	OpenAI	near	131K
GPT-OSS 120B	OpenAI	phala	131K
GPT-OSS 20B	OpenAI	phala	131K
Qwen3 30B	Alibaba	near	262K
Qwen3 30B	Alibaba	phala	262K
Qwen 2.5 7B	Alibaba	phala	32K
Qwen2.5 7B Instruct	Alibaba	phala	33K
Qwen3.5 122B	Alibaba	near	131K
Qwen3.5 27B	Alibaba	phala	262K
Venice Uncensored 24B	Venice	phala	33K
Gemma 3 27B	Google	phala	53K
Llama 3.3 70B	Meta	phala	131K
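
Because the same model can have a different context window on each infrastructure, it can help to filter the table programmatically. The sketch below is only an illustration: TeeModel, teeModels, and modelsWithContext are hypothetical names, and the array holds a handful of rows copied from the table above rather than the full catalog or any official model IDs.

```typescript
type Infrastructure = 'near' | 'phala';

interface TeeModel {
  model: string;
  provider: string;
  infrastructure: Infrastructure;
  contextK: number; // context window in thousands of tokens, as listed in the table
}

// A few rows copied from the TEE table; not the complete catalog.
const teeModels: TeeModel[] = [
  { model: 'DeepSeek V3.1', provider: 'DeepSeek', infrastructure: 'near', contextK: 128 },
  { model: 'DeepSeek V3.1', provider: 'DeepSeek', infrastructure: 'phala', contextK: 164 },
  { model: 'GLM 5.1', provider: 'ZAI', infrastructure: 'near', contextK: 203 },
  { model: 'Kimi K2.5', provider: 'Moonshot', infrastructure: 'phala', contextK: 262 },
  { model: 'Qwen 2.5 7B', provider: 'Alibaba', infrastructure: 'phala', contextK: 32 },
  { model: 'Llama 3.3 70B', provider: 'Meta', infrastructure: 'phala', contextK: 131 },
];

// Hypothetical helper: keeps rows whose context window is at least `minContextK`,
// optionally restricted to one infrastructure.
function modelsWithContext(
  models: readonly TeeModel[],
  minContextK: number,
  infrastructure?: Infrastructure,
): TeeModel[] {
  return models.filter(
    (m) =>
      m.contextK >= minContextK &&
      (infrastructure === undefined || m.infrastructure === infrastructure),
  );
}

// Phala-hosted rows from the sample with at least a 150K context window.
const candidates = modelsWithContext(teeModels, 150, 'phala');
console.log(candidates.map((m) => m.model).join(', ')); // DeepSeek V3.1, Kimi K2.5
```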

ZDR Catalog

These language models run on Vercel's AI infrastructure under zero data retention (ZDR) agreements with the providers. No attestation receipts are generated.

Anthropic

Model	Context
Claude 3 Haiku	200K
Claude 3.5 Haiku	200K
Claude 3.7 Sonnet	200K
Claude Haiku 4.5	200K
Claude Sonnet 4	1M
Claude Sonnet 4.5	1M
Claude Sonnet 4.6	1M
Claude Opus 4	200K
Claude Opus 4.1	200K
Claude Opus 4.5	200K
Claude Opus 4.6	1M
Claude Opus 4.7	1M

OpenAI

Model	Context
GPT-4o	8K
GPT-4o mini	8K
GPT-4.1	8K
GPT-4.1 mini	8K
GPT-4.1 nano	1M
GPT-5	400K
GPT-5 mini	400K
GPT-5 nano	400K
GPT-5 Codex	400K
GPT-5.1 Instant	128K
GPT-5.1-Codex	400K
GPT 5 Chat	128K
GPT 5.1 Codex Max	400K
GPT 5.1 Codex Mini	400K
GPT 5.1 Thinking	400K
GPT 5.2	400K
GPT 5.2 Chat	128K
GPT 5.2 Codex	400K
GPT 5.3 Codex	400K
GPT 5.4	1.1M
GPT 5.4 Mini	400K
GPT 5.4 Nano	400K
GPT 5.4 Pro	1.1M
GPT-OSS 20B	131K
GPT-OSS 120B	131K
GPT OSS Safeguard 20B	131K
o1	200K
o3-mini	—
o4-mini	—

Google

Model	Context
Gemini 2.0 Flash	1M
Gemini 2.0 Flash-Lite	1M
Gemini 2.5 Flash-Lite	1M
Gemini 2.5 Flash	1M
Gemini 2.5 Pro	1M
Gemini 3 Flash	1M
Gemini 3 Pro Preview	1M
Gemini 3.1 Flash Lite Preview	1M
Gemini 3.1 Pro Preview	1M
Gemma 4 26B A4B IT	262K
Gemma 4 31B IT	262K

Meta

Model	Context
Llama 3.1 8B	131K
Llama 3.1 70B	131K
Llama 3.2 1B	128K
Llama 3.2 3B	128K
Llama 3.3 70B	128K
Llama 4 Scout	131K
Llama 4 Maverick	524K

Mistral

Model	Context
Mistral Small	32K
Mistral Medium	128K
Mistral Large 3	256K
Mistral Nemo	131K
Ministral 3B	128K
Ministral 8B	128K
Ministral 14B	256K
Mixtral MoE 8x22B Instruct	66K
Magistral Small	128K
Magistral Medium	128K
Codestral	128K
Devstral 2	256K
Devstral Small	128K
Devstral Small 2	256K

Alibaba (Qwen)

Model	Context
Qwen 3 14B	41K
Qwen 3 30B	41K
Qwen 3 32B	131K
Qwen 3 235B	131K
Qwen3 235B Thinking	262K
Qwen3 Coder	262K
Qwen3 Coder 30B	262K
Qwen3 Coder Next	256K
Qwen3 Next 80B	262K
Qwen 3.6 Plus	1M

DeepSeek

Model	Context
DeepSeek R1	164K
DeepSeek V3	164K
DeepSeek V3.1	164K
DeepSeek V3.2	164K

Moonshot

Model	Context
Kimi K2	131K
Kimi K2 Turbo	256K
Kimi K2 0905	256K
Kimi K2 Thinking	262K
Kimi K2 Thinking Turbo	262K
Kimi K2.5	262K

ZAI

Model	Context
GLM 4.6	205K
GLM 4.7	205K
GLM 4.7 Flash	200K
GLM 5	203K
GLM 5.1	203K

Other Language Models

Model	Provider	Context
MiniMax M2.1	MiniMax	205K
MiniMax M2.5	MiniMax	205K
MiniMax M2.7	MiniMax	205K
Morph V3 Fast	Morph	82K
Morph V3 Large	Morph	82K
INTELLECT 3	PrimeIntellect	131K
Nemotron 3 Nano 30B	NVIDIA	262K
Nemotron Nano 9B v2	NVIDIA	131K
NVIDIA Nemotron 3 Super 120B A12B	NVIDIA	256K
Nova 2 Lite	Amazon	1M
Nova Lite	Amazon	300K
Nova Micro	Amazon	128K
Nova Pro	Amazon	300K

Several models in this catalog also accept image input. Models with vision capability are listed on the Vision page.