Language Models
Text generation and chat models on ORGN Gateway: TEE and ZDR execution types and the AI SDK chatModel() method.
Language models handle text generation and chat: single-prompt completions, multi-turn conversations, reasoning, and tool use. They are the most common model type on Gateway.
Use a language model when you need to generate or transform text:
Chat assistants and multi-turn conversations
Single-prompt generation, summarization, and rewriting
Reasoning tasks (use a reasoning-capable model with reasoningEffort)
Code generation and analysis
Tool calling and structured output
For image input, see Vision. For turning speech into text, see Audio.
Language models are accessed with chatModel() and used with the AI SDK's generateText and streamText:
import { createOLLM } from '@orgn/gateway';
import { generateText } from 'ai';
// Create a Gateway client; the API key is read from the environment.
const ollm = createOLLM({ apiKey: process.env.OLLM_API_KEY });
// Single-prompt generation with a chat model.
const { text } = await generateText({
  model: ollm.chatModel('near_glm_5_1'),
  prompt: 'What is ORGN Gateway?',
});
For streaming, system messages, multi-turn conversations, and reasoning options, see the Vercel AI SDK integration.
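As a rough illustration of the multi-turn case, the sketch below keeps a conversation as an array of role/content messages, which is the shape the AI SDK accepts through its `messages` option. The `appendTurn` helper and the placeholder assistant reply are hypothetical and exist only for this example; the integration guide remains the reference for the full option set.

```typescript
// Text-only subset of the message shape used by the AI SDK's `messages` option.
type ChatRole = 'user' | 'assistant';

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Hypothetical helper: returns a new array with one more turn appended,
// leaving the input array untouched.
function appendTurn(
  turns: readonly ChatMessage[],
  role: ChatRole,
  content: string,
): ChatMessage[] {
  return [...turns, { role, content }];
}

// Build up a conversation turn by turn.
let conversation: ChatMessage[] = [];
conversation = appendTurn(conversation, 'user', 'What is ORGN Gateway?');
// In a real application this would be the `text` returned by generateText.
conversation = appendTurn(conversation, 'assistant', '(model reply goes here)');
conversation = appendTurn(conversation, 'user', 'Which execution types does it offer?');

// The array can then be sent in place of `prompt`, for example:
//   generateText({ model: ollm.chatModel('near_glm_5_1'), messages: conversation })
```

Returning a new array on each turn keeps earlier snapshots of the conversation intact, which can help when retrying a failed request.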
The legacy /v1/completions endpoint is not supported. Every completion task can be expressed as a chat call with chatModel().
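One way to migrate is to translate the fields of a legacy completion request into chat-call settings. The sketch below is an assumption-laden example, not part of the Gateway SDK: `LegacyCompletionRequest` and `toChatCallSettings` are hypothetical names, only `prompt`, `temperature`, and `stop` are handled, and other legacy fields are left out.

```typescript
// Common fields of a legacy completion request body (illustrative subset).
interface LegacyCompletionRequest {
  prompt: string;
  temperature?: number;
  stop?: string | string[];
}

// Subset of AI SDK call settings that this sketch fills in.
interface ChatCallSettings {
  prompt: string;
  temperature?: number;
  stopSequences?: string[];
}

// Hypothetical helper: map legacy fields onto chat-call settings.
function toChatCallSettings(req: LegacyCompletionRequest): ChatCallSettings {
  const settings: ChatCallSettings = { prompt: req.prompt };
  if (req.temperature !== undefined) {
    settings.temperature = req.temperature;
  }
  if (req.stop !== undefined) {
    // Legacy `stop` may be a single string; the AI SDK expects an array.
    settings.stopSequences = Array.isArray(req.stop) ? req.stop : [req.stop];
  }
  return settings;
}

const chatSettings = toChatCallSettings({
  prompt: 'Summarize: ORGN Gateway routes requests to language models.',
  temperature: 0.2,
  stop: '\n\n',
});

// The result can be spread into a chat call, for example:
//   generateText({ model: ollm.chatModel('near_glm_5_1'), ...chatSettings })
```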
TEE language models run in Trusted Execution Environments on NEAR and Phala infrastructure, using Intel TDX and NVIDIA H100 confidential compute. Every request produces a cryptographic attestation receipt.

| Model | Provider | Infrastructure | Context |
| --- | --- | --- | --- |
| DeepSeek V3.1 | DeepSeek | near | 128K |
| DeepSeek V3.1 | DeepSeek | phala | 164K |
| GLM 4.7 | ZAI | near | 205K |
| GLM 4.7 | ZAI | phala | 203K |
| GLM 4.7 Flash | ZAI | phala | 203K |
| GLM 5 | ZAI | near | 203K |
| GLM 5.1 | ZAI | near | 203K |
| Kimi K2.5 | Moonshot | phala | 262K |
| GPT-OSS 120B | OpenAI | near | 131K |
| GPT-OSS 120B | OpenAI | phala | 131K |
| GPT-OSS 20B | OpenAI | phala | 131K |
| Qwen3 30B | Alibaba | near | 262K |
| Qwen3 30B | Alibaba | phala | 262K |
| Qwen 2.5 7B | Alibaba | phala | 32K |
| Qwen2.5 7B Instruct | Alibaba | phala | 33K |
| Qwen3.5 122B | Alibaba | near | 131K |
| Qwen3.5 27B | Alibaba | phala | 262K |
| Venice Uncensored 24B | Venice | phala | 33K |
| Gemma 3 27B | Google | phala | 53K |
| Llama 3.3 70B | Meta | phala | 131K |

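Because the same model can appear on both infrastructures with different context windows, it may be worth comparing entries before choosing one. The following is a minimal sketch over a few rows copied from the TEE model list above; the `TeeEntry` type and `widestContext` function are hypothetical, and display names are used because this page does not list model IDs for every entry.

```typescript
type Infrastructure = 'near' | 'phala';

interface TeeEntry {
  model: string; // display name as listed, not a model ID
  infrastructure: Infrastructure;
  contextK: number; // context window in thousands of tokens, as listed
}

// A few rows copied from the TEE model list (not the full catalog).
const teeCatalog: TeeEntry[] = [
  { model: 'DeepSeek V3.1', infrastructure: 'near', contextK: 128 },
  { model: 'DeepSeek V3.1', infrastructure: 'phala', contextK: 164 },
  { model: 'GLM 4.7', infrastructure: 'near', contextK: 205 },
  { model: 'GLM 4.7', infrastructure: 'phala', contextK: 203 },
  { model: 'GPT-OSS 120B', infrastructure: 'near', contextK: 131 },
  { model: 'GPT-OSS 120B', infrastructure: 'phala', contextK: 131 },
];

// Return the entry with the largest context window for a model,
// or undefined if the model is not in the catalog.
// On a tie, the first listed entry is kept.
function widestContext(model: string): TeeEntry | undefined {
  let best: TeeEntry | undefined;
  for (const entry of teeCatalog) {
    if (entry.model !== model) continue;
    if (best === undefined || entry.contextK > best.contextK) {
      best = entry;
    }
  }
  return best;
}

const deepseekChoice = widestContext('DeepSeek V3.1');
const glmChoice = widestContext('GLM 4.7');
```

Here `deepseekChoice` resolves to the phala entry and `glmChoice` to the near entry, reflecting the listed context sizes.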
ZDR language models run on Vercel's AI infrastructure under zero-data-retention provider agreements. No attestation receipts are generated.

| Model | Context |
| --- | --- |
| Claude 3 Haiku | 200K |
| Claude 3.5 Haiku | 200K |
| Claude 3.7 Sonnet | 200K |
| Claude Haiku 4.5 | 200K |
| Claude Sonnet 4 | 1M |
| Claude Sonnet 4.5 | 1M |
| Claude Sonnet 4.6 | 1M |
| Claude Opus 4 | 200K |
| Claude Opus 4.1 | 200K |
| Claude Opus 4.5 | 200K |
| Claude Opus 4.6 | 1M |
| Claude Opus 4.7 | 1M |


| Model | Context |
| --- | --- |
| GPT-4o | 8K |
| GPT-4o mini | 8K |
| GPT-4.1 | 8K |
| GPT-4.1 mini | 8K |
| GPT-4.1 nano | 1M |
| GPT-5 | 400K |
| GPT-5 mini | 400K |
| GPT-5 nano | 400K |
| GPT-5 Codex | 400K |
| GPT-5.1 Instant | 128K |
| GPT-5.1-Codex | 400K |
| GPT 5 Chat | 128K |
| GPT 5.1 Codex Max | 400K |
| GPT 5.1 Codex Mini | 400K |
| GPT 5.1 Thinking | 400K |
| GPT 5.2 | 400K |
| GPT 5.2 Chat | 128K |
| GPT 5.2 Codex | 400K |
| GPT 5.3 Codex | 400K |
| GPT 5.4 | 1.1M |
| GPT 5.4 Mini | 400K |
| GPT 5.4 Nano | 400K |
| GPT 5.4 Pro | 1.1M |
| GPT-OSS 20B | 131K |
| GPT-OSS 120B | 131K |
| GPT OSS Safeguard 20B | 131K |
| o1 | 200K |
| o3-mini | - |
| o4-mini | - |


| Model | Context |
| --- | --- |
| Gemini 2.0 Flash | 1M |
| Gemini 2.0 Flash-Lite | 1M |
| Gemini 2.5 Flash-Lite | 1M |
| Gemini 2.5 Flash | 1M |
| Gemini 2.5 Pro | 1M |
| Gemini 3 Flash | 1M |
| Gemini 3 Pro Preview | 1M |
| Gemini 3.1 Flash Lite Preview | 1M |
| Gemini 3.1 Pro Preview | 1M |
| Gemma 4 26B A4B IT | 262K |
| Gemma 4 31B IT | 262K |


| Model | Context |
| --- | --- |
| Llama 3.1 8B | 131K |
| Llama 3.1 70B | 131K |
| Llama 3.2 1B | 128K |
| Llama 3.2 3B | 128K |
| Llama 3.3 70B | 128K |
| Llama 4 Scout | 131K |
| Llama 4 Maverick | 524K |


| Model | Context |
| --- | --- |
| Mistral Small | 32K |
| Mistral Medium | 128K |
| Mistral Large 3 | 256K |
| Mistral Nemo | 131K |
| Ministral 3B | 128K |
| Ministral 8B | 128K |
| Ministral 14B | 256K |
| Mixtral MoE 8x22B Instruct | 66K |
| Magistral Small | 128K |
| Magistral Medium | 128K |
| Codestral | 128K |
| Devstral 2 | 256K |
| Devstral Small | 128K |
| Devstral Small 2 | 256K |


| Model | Context |
| --- | --- |
| Qwen 3 14B | 41K |
| Qwen 3 30B | 41K |
| Qwen 3 32B | 131K |
| Qwen 3 235B | 131K |
| Qwen3 235B Thinking | 262K |
| Qwen3 Coder | 262K |
| Qwen3 Coder 30B | 262K |
| Qwen3 Coder Next | 256K |
| Qwen3 Next 80B | 262K |
| Qwen 3.6 Plus | 1M |


| Model | Context |
| --- | --- |
| DeepSeek R1 | 164K |
| DeepSeek V3 | 164K |
| DeepSeek V3.1 | 164K |
| DeepSeek V3.2 | 164K |


| Model | Context |
| --- | --- |
| Kimi K2 | 131K |
| Kimi K2 Turbo | 256K |
| Kimi K2 0905 | 256K |
| Kimi K2 Thinking | 262K |
| Kimi K2 Thinking Turbo | 262K |
| Kimi K2.5 | 262K |


| Model | Context |
| --- | --- |
| GLM 4.6 | 205K |
| GLM 4.7 | 205K |
| GLM 4.7 Flash | 200K |
| GLM 5 | 203K |
| GLM 5.1 | 203K |


| Model | Provider | Context |
| --- | --- | --- |
| MiniMax M2.1 | MiniMax | 205K |
| MiniMax M2.5 | MiniMax | 205K |
| MiniMax M2.7 | MiniMax | 205K |
| Morph V3 Fast | Morph | 82K |
| Morph V3 Large | Morph | 82K |
| INTELLECT 3 | PrimeIntellect | 131K |
| Nemotron 3 Nano 30B | NVIDIA | 262K |
| Nemotron Nano 9B v2 | NVIDIA | 131K |
| NVIDIA Nemotron 3 Super 120B A12B | NVIDIA | 256K |
| Nova 2 Lite | Amazon | 1M |
| Nova Lite | Amazon | 300K |
| Nova Micro | Amazon | 128K |
| Nova Pro | Amazon | 300K |

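The context values above use shorthand such as 8K, 400K, and 1.1M. If you want to compare them against an estimated prompt size in code, a small parser can help. This sketch assumes K means 1,000 tokens and M means 1,000,000 tokens, which is an approximation since the listed values appear to be rounded; `parseContextLabel` and `fitsContext` are hypothetical helpers, not Gateway APIs.

```typescript
// Convert a label such as "128K" or "1.1M" into an approximate token count.
// Assumption: K = 1,000 and M = 1,000,000; listed values are rounded.
function parseContextLabel(label: string): number | undefined {
  const match = /^(\d+(?:\.\d+)?)([KM])$/.exec(label.trim());
  if (!match) {
    return undefined; // entries with no listed context size
  }
  const multiplier = match[2] === 'K' ? 1000 : 1000000;
  return Math.round(Number(match[1]) * multiplier);
}

// Rough check: does an estimated prompt fit, leaving room for the reply?
// Returns false when the label cannot be parsed.
function fitsContext(
  label: string,
  promptTokens: number,
  reservedForOutput: number = 0,
): boolean {
  const limit = parseContextLabel(label);
  return limit !== undefined && promptTokens + reservedForOutput <= limit;
}

const smallWindow = fitsContext('8K', 7000, 2000); // 9,000 tokens needed
const largeWindow = fitsContext('400K', 7000, 2000);
```

With these inputs `smallWindow` is false and `largeWindow` is true. Token counts depend on each model's tokenizer, so treat the result as a coarse pre-check rather than a guarantee.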
Several models in this catalog also accept image input. Models with vision capability are listed on the Vision page.