Gateway Architecture and Inference Flow

How ORGN Gateway routes OpenAI-compatible requests through TEE or ZDR execution environments — control plane, data plane, and attestation for verifiable confidential inference.

ORGN Gateway's architecture provides verifiable confidential inference when you choose TEE models, and policy zero retention when you choose ZDR models — while keeping model selection firmly with you.

Gateway separates request orchestration, secure execution, and verification. It does not perform automatic model selection or dynamic routing. The model specified in your request is executed.

High-level components

Client application

Your application sends requests using an OpenAI-compatible API, explicitly specifying the model to use.

The client:

Selects the model in code or request parameters
Sends prompts and inference parameters
Receives model responses
Receives attestation metadata on TEE requests (not on ZDR)

Gateway does not modify, override, or substitute the requested model.

Gateway router (control plane)

The Gateway router is a secure orchestration layer, responsible for:

Authenticating requests (sk-ollm-* API keys)
Validating model availability and permissions
Enforcing security and execution constraints
Coordinating attestation data for TEE routes

The router does not choose models, does not inspect prompt or response content, and does not perform inference.

Execution environments (data plane)

Gateway routes requests to one of two execution environments depending on the model ID you send.

TEE models (near_*, phala_*, tinfoil_*) run on NEAR, Phala, and Tinfoil infrastructure inside hardware-backed Trust Domains:

Hardware-enforced memory isolation from host OS, hypervisor, and infrastructure
Encryption in use via Intel TDX confidential VMs and NVIDIA H100 GPU attestation (NEAR, Phala), or AMD SEV-SNP and NVIDIA confidential-compute GPU attestation (Tinfoil)
Cryptographic attestation receipt generated per request

ZDR models (vercel_*) run on Vercel AI Gateway infrastructure under zero data retention provider agreements:

No storage or logging of prompts and responses by Vercel or the underlying model provider
Broad access to frontier closed-weight and multimodal models
No hardware isolation or attestation receipt

The model identifier you specify determines which environment is used. Gateway does not select or override the execution path.

Attestation and verification layer

For TEE model requests, the execution environment produces attestation artifacts that prove:

The specified model ran inside a valid Trust Domain
The execution environment matched expected measurements
The response was generated within the trusted boundary

These artifacts are inspectable in ORGN Scanner, enabling independent verification of secure execution.

ZDR model requests do not produce attestation artifacts. Privacy is enforced through Vercel's zero data retention agreements with model providers.

Request lifecycle

Request submission

The client sends a request to https://api.gateway.orgn.com/v1, explicitly specifying the model ID (for example near_glm_4_7 or vercel_claude_sonnet_4_6).

Request validation

Gateway authenticates the request and verifies that the specified model is available and supported.

Inference execution

The request is forwarded to the selected model's execution environment — TEE (NEAR/Phala/Tinfoil) or ZDR (Vercel).

Attestation generation (TEE only)

For TEE models, hardware attestation data is generated as part of execution.

Response delivery

The model output is returned. TEE responses include verification metadata for Scanner.

Gateway does not alter your model choice. For TEE routes, prompts and responses remain inside the Trust Domain during inference — Gateway does not access plaintext inference content outside that boundary.

Trust boundaries and guarantees

Guarantee	Detail
Model choice is user-controlled	No automatic routing or model substitution
Gateway does not retain inference content	Zero retention for prompts and outputs
Security depends on model tier	TEE: hardware + attestation. ZDR: policy retention via Vercel

This architecture lets teams run sensitive LLM workloads with full control over model selection, choosing cryptographic proof (TEE) or frontier catalog access (ZDR) per request.

Security model — threat boundaries by execution type
Models overview — TEE vs ZDR selection
Attestation data reference — TEE receipt structure

Gateway Architecture and Inference Flow

On this page