Generate Themes from a Website with ORGN Gateway
Example workflow: scrape a company website and generate business themes using ORGN Gateway inference with the OpenAI SDK.
- Read sitemap.xml
- Filter relevant pages (/services, /product, /platform)
- Scrape text content
- Send combined content to a Gateway model
- Return structured themes
Only the theme-generation step uses Gateway. Scraping is standard HTTP — respect robots.txt and site terms.
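One way to honor robots.txt before scraping is Python's standard-library urllib.robotparser. The sketch below is an illustration rather than part of the original workflow: the helper name is_allowed, the theme-bot user agent, and the sample rules are all assumptions.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "theme-bot") -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network call happens here
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/services"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/admin/login"))
```

In a real run, the robots.txt text would be fetched once per site with requests.get, and each URL checked before it is scraped.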
Step 1: Read the sitemap
import requests
import xml.etree.ElementTree as ET

def get_relevant_urls(sitemap_url):
    # Fetch the sitemap; fail fast on timeouts and HTTP errors
    response = requests.get(sitemap_url, timeout=10)
    response.raise_for_status()
    root = ET.fromstring(response.content)
    urls = []
    # "{*}" matches <loc> in any XML namespace (Python 3.8+)
    for url in root.findall(".//{*}loc"):
        link = (url.text or "").strip()
        if any(path in link for path in ["/services", "/product", "/platform"]):
            urls.append(link)
    return urls

Step 2: Scrape page content
from bs4 import BeautifulSoup

def scrape_page(url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Extract the page text, then collapse runs of whitespace
    text = soup.get_text(separator=" ", strip=True)
    return " ".join(text.split())

Step 3: Generate themes via Gateway
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateway.orgn.com/v1",
    api_key="sk-ollm-YOUR_API_KEY"
)

def generate_themes(content):
    response = client.chat.completions.create(
        model="near_glm_4_7",
        messages=[
            {
                "role": "system",
                "content": "Extract high-level business themes from website content."
            },
            {
                "role": "user",
                "content": f"Analyze and extract themes:\n\n{content}"
            }
        ]
    )
    return response.choices[0].message.content

near_glm_4_7 is a TEE model: inference runs in a Trust Domain with optional Scanner attestation. For frontier models without hardware receipts, use a vercel_* ID instead.
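generate_themes returns free text. If the final step needs structured themes, one option is to ask the model for a JSON array in the prompt and parse the reply defensively. The sketch below assumes that prompt convention. parse_themes is a hypothetical helper, and the source does not say whether a given Gateway model reliably returns valid JSON.

```python
import json


def parse_themes(raw: str) -> list[str]:
    """Parse a model reply expected to contain a JSON array of theme strings."""
    text = raw.strip()
    # Replies may wrap the array in extra prose; keep only the bracketed span
    start, end = text.find("["), text.rfind("]")
    if start != -1 and end > start:
        try:
            data = json.loads(text[start:end + 1])
            if isinstance(data, list):
                items = (str(item).strip() for item in data)
                return [item for item in items if item]
        except json.JSONDecodeError:
            pass  # fall through to the line-based fallback
    # Fallback: treat each non-empty line as one theme, dropping bullet markers
    themes = (line.lstrip("-*• ").strip() for line in text.splitlines())
    return [theme for theme in themes if theme]
```

The line-based fallback keeps the workflow usable when the reply is a plain bulleted list rather than JSON.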
Step 4: Full workflow
def run_analysis():
    urls = get_relevant_urls("https://example.com/sitemap.xml")
    # Concatenate the scraped text of every matching page
    combined_content = ""
    for url in urls:
        combined_content += "\n\n" + scrape_page(url)
    themes = generate_themes(combined_content)
    print(themes)

Production considerations
- Chunk or truncate large content (MAX_CHARS) to stay within context limits
- Record response.usage.total_tokens for cost tracking
- Handle HTTP and API errors gracefully
- For end-to-end confidential pipelines, run scraping and analysis inside CDE cloud worktrees
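The chunking bullet above could be sketched as follows. MAX_CHARS is named in the list, but the value used here is an arbitrary placeholder, and chunk_text is a hypothetical helper. Real context limits depend on the chosen model and are measured in tokens, not characters.

```python
MAX_CHARS = 12_000  # placeholder value; tune to the model's context window


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on a space when possible."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        # Prefer the last space inside the window so words stay intact
        cut = remaining.rfind(" ", 0, max_chars + 1)
        if cut <= 0:
            cut = max_chars  # no space found: fall back to a hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


print(chunk_text("alpha beta gamma delta", max_chars=11))
```

Each chunk could then be sent to generate_themes separately, with response.usage.total_tokens summed across calls for cost tracking. Simple truncation is the cheaper alternative when only the first part of the content matters.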
Related
Gateway API Requests and Response Structure
How ORGN Gateway structures API requests and responses — success envelopes, error handling, usage metadata, and TEE attestation receipts.
Troubleshoot Website Theme Generation
Fix sitemap, scraping, and Gateway API issues in the website theme generation workflow.