Gateway Architecture and Inference Flow
How ORGN Gateway routes OpenAI-compatible requests through TEE or ZDR execution environments — control plane, data plane, and attestation for verifiable confidential inference.
ORGN Gateway's architecture provides verifiable confidential inference when you choose TEE models, and policy zero retention when you choose ZDR models — while keeping model selection firmly with you.
Gateway separates request orchestration, secure execution, and verification. It does not perform automatic model selection or dynamic routing. The model specified in your request is executed.
High-level components
Client application
Your application sends requests using an OpenAI-compatible API, explicitly specifying the model to use.
The client:
- Selects the model in code or request parameters
- Sends prompts and inference parameters
- Receives model responses
- Receives attestation metadata on TEE requests (not on ZDR)
Gateway does not modify, override, or substitute the requested model.
Gateway router (control plane)
The Gateway router is a secure orchestration layer, responsible for:
- Authenticating requests (
sk-ollm-*API keys) - Validating model availability and permissions
- Enforcing security and execution constraints
- Coordinating attestation data for TEE routes
The router does not choose models, does not inspect prompt or response content, and does not perform inference.
Execution environments (data plane)
Gateway routes requests to one of two execution environments depending on the model ID you send.
TEE models (near_*, phala_*) run on NEAR and Phala infrastructure inside hardware-backed Trust Domains:
- Hardware-enforced memory isolation from host OS, hypervisor, and infrastructure
- Encryption in use via Intel TDX confidential VMs and NVIDIA H100 GPU attestation
- Cryptographic attestation receipt generated per request
ZDR models (vercel_*) run on Vercel AI Gateway infrastructure under zero data retention provider agreements:
- No storage or logging of prompts and responses by Vercel or the underlying model provider
- Broad access to frontier closed-weight and multimodal models
- No hardware isolation or attestation receipt
The model identifier you specify determines which environment is used. Gateway does not select or override the execution path.
Attestation and verification layer
For TEE model requests, the execution environment produces attestation artifacts that prove:
- The specified model ran inside a valid Trust Domain
- The execution environment matched expected measurements
- The response was generated within the trusted boundary
These artifacts are inspectable in ORGN Scanner, enabling independent verification of secure execution.
ZDR model requests do not produce attestation artifacts. Privacy is enforced through Vercel's zero data retention agreements with model providers.
Request lifecycle
The client sends a request to https://api.gateway.orgn.com/v1, explicitly specifying the model ID (for example near_glm_4_7 or vercel_claude_sonnet_4_6).
Gateway authenticates the request and verifies that the specified model is available and supported.
The request is forwarded to the selected model's execution environment — TEE (NEAR/Phala) or ZDR (Vercel).
For TEE models, hardware attestation data is generated as part of execution.
The model output is returned. TEE responses include verification metadata for Scanner.
Gateway does not alter your model choice. For TEE routes, prompts and responses remain inside the Trust Domain during inference — Gateway does not access plaintext inference content outside that boundary.
Trust boundaries and guarantees
| Guarantee | Detail |
|---|---|
| Model choice is user-controlled | No automatic routing or model substitution |
| Gateway does not retain inference content | Zero retention for prompts and outputs |
| Security depends on model tier | TEE: hardware + attestation. ZDR: policy retention via Vercel |
This architecture lets teams run sensitive LLM workloads with full control over model selection, choosing cryptographic proof (TEE) or frontier catalog access (ZDR) per request.
Related
- Security model — threat boundaries by execution type
- Models overview — TEE vs ZDR selection
- Attestation data reference — TEE receipt structure