02AI Infrastructure

OpenBili Relay

OpenAI-compatible aggregator + billing platform

Single endpoint for Claude / GPT / Gemini / Grok / 6 domestic models, with distribution, billing, and quota controls. Powers production traffic for dozens of downstream teams.

Engineering brief
Category
AI Infrastructure
Year
2026
State
Live in production
Stack
Python · FastAPIPostgreSQLRedisCloudflare Edge

OpenAI-compatible API surface unifying frontier and domestic LLMs behind one endpoint. Three-layer control plane: distribution (reseller hierarchy), billing (per-token cost accounting), quota (rate-limits per key). Production-grade — handles the model traffic for multiple downstream products including Lumira Atelier.

Powers our own production

Lumira Atelier itself runs every LLM call through OpenBili. The relay is dogfooded — its uptime is our uptime.

Outcome

Dozens of downstream teams in production. Pricing tier model copied by 2+ derivative aggregators.

Working on something similar?

Start with a 30-minute call. We pick one project together that fits a pilot.