Private Beta

AI Gateway & Wallet Infrastructure

Route AI traffic, enforce prepaid usage, and monetize AI apps with one gateway.

App / Platform
Your SaaS
multi-tenant
API KeysTenants
Wallet
prepaid balance
Quotas
runtime limits
Relay
Relay Gateway
routing · metering · billing
FeesMeteringRouting
Providers
OpenAI
gpt-5, o-series
Anthropic
Claude
Google
Gemini
+ Custom
self-hosted
1.2M
tokens routed/min
$18.42
wallet balance
3.1%
platform fee
12ms
gateway p50
OpenAI compatible
Anthropic · Gemini
Stripe-ready billing
Multi-tenant runtime
The problem

AI usage is hard to monetize safely

Most teams ship AI features fast — then spend a year building the billing, quotas and routing they didn't plan for.

Unpredictable AI costs

Token spikes blow through margins overnight.

Flat-rate pricing risk

Power users consume 100× more than they pay.

Provider lock-in

Hard-coded SDKs make switching painful.

No runtime quotas

Nothing stops abuse between billing cycles.

No revenue sharing

Platforms can't take a margin on AI usage.

Billing complexity

Metering, top-ups and invoices become a roadmap.

The platform

Relay turns AI usage into infrastructure

Wallets, quotas, routing and fees behind a single OpenAI-compatible endpoint.

AI Gateway

One endpoint in front of every LLM provider.

Wallets & Top-Ups

Prepaid balances per tenant, user or workspace.

Runtime Quotas

Throttle by tokens, requests, cost or model class.

Provider Routing

Failover, fallbacks and routing rules at the edge.

Platform Fees

Add a margin or revenue share on every call.

OpenAI-Compatible API

Drop-in replacement — no SDK changes.

Usage Metering

Per-request cost attribution, exportable anywhere.

Multi-Tenant Keys

Issue scoped runtime keys per customer.

Billing modes

Two ways to monetize

Run Relay as a quota engine behind your existing billing, or hand it the wallet entirely.

Embedded Mode

Your app handles billing.
Relay enforces compute limits.

Plug Relay between your app and the LLMs. You keep Stripe, plans and seats — Relay enforces per-tenant token budgets and routes provider traffic.

  • SaaS AI agents
  • Internal copilots
  • Enterprise assistants
Direct Wallet Mode

Relay handles wallets,
top-ups and AI usage billing.

Full prepaid pipeline: hosted top-ups, balance enforcement, invoicing and fee splits. You ship product — Relay ships the cash register.

  • Shared AI assistants
  • AI developer platforms
  • Embedded AI products
Revenue sharing

Monetize AI usage without building billing infrastructure

Take a margin on every top-up — Relay handles the splits, refunds and accounting.

Customer top-up
$20.00
AI compute wallet
passed to provider
$18.00
Relay fee
infrastructure
$1.00
Platform fee
your revenue
$1.00
Built for AI-native SaaS platforms.
Developer experience

A drop-in OpenAI-compatible endpoint

Change the baseURL, issue a tenant key, ship. Quotas, routing and metering happen at the edge.

app/server.ts
TypeScript
// 1. Point any OpenAI SDK at Relay
const client = new OpenAI({
baseURL: "https://api.relaygateway.ai/v1",
apiKey:  "relay_tenant_xxx",
})

// 2. Relay enforces wallet + quotas, routes the call,
//    bills the tenant and splits platform fees.
await client.chat.completions.create({
model: "gpt-5",
messages: [{ role: "user", content: "Hello" }],
})
POST/v1/chat/completions200 OK · 412ms · $0.0042
Built for

Teams shipping AI as a product

AI SaaS
Agent platforms
Shared assistants
Internal copilots
AI developer tools
Multi-tenant AI apps
Documentation

Everything you need to ship

Add AI monetization infrastructure to your product.

Stop building billing pipelines. Ship AI features and let Relay handle the wallet, the quotas and the routing.