About CheapTokenRouter
What we are
- A real-time router that hunts the lowest-cost GPU seats we can rent, then relays your request to whichever provider is cheapest at that moment.
- Up-front about the trade-offs: FP8 by default, zero certifications, zero SLAs, and some upstream providers may log or train on your data.
- One-click OpenAI-compatible endpoint that works as soon as you top up your account with USD.
- Quick to add new frontier open-source models, usually the same day.
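To make the "cheapest at that moment" idea concrete, here is a minimal sketch of how a price-based pick could work. It is an illustration only, not our actual routing code: the `ProviderQuote` type, the `pick_cheapest` function, the provider names, and the prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProviderQuote:
    """A hypothetical price quote from one upstream provider."""
    name: str                       # made-up provider label
    usd_per_million_tokens: float   # made-up price
    available: bool = True          # whether the provider can take traffic now


def pick_cheapest(quotes: List[ProviderQuote]) -> Optional[ProviderQuote]:
    """Return the cheapest available quote, or None if nothing is available."""
    candidates = [q for q in quotes if q.available]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q.usd_per_million_tokens)


# Illustrative numbers only.
quotes = [
    ProviderQuote("provider-a", 0.40),
    ProviderQuote("provider-b", 0.25, available=False),  # cheapest, but down
    ProviderQuote("provider-c", 0.30),
]
choice = pick_cheapest(quotes)
print(choice.name if choice else "no provider available")  # prints "provider-c"
```

Because quotes are re-evaluated per request in this sketch, the same model can land on a different provider from one call to the next, which is why prices and routes can shift without notice.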
What we are not
- SOC 2, GDPR, HIPAA, or ISO-anything.
- A walled garden that logs everything you send. (We log only your raw token usage count, but we can't speak for every provider we route to.)
- A forever-stable service—prices, routes, and even upstream providers can change overnight.
Tips for using us
Best for
- Hobby projects
- General vibe coding
- MVPs
- Internal tools
- Workloads that can tolerate occasional hiccups
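If your workload fits the "can tolerate occasional hiccups" bucket, a small client-side retry wrapper goes a long way. The sketch below assumes you wrap your own request in a zero-argument function; `call_with_retries` and its parameters are hypothetical names for illustration, not part of any SDK we ship.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    send: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying with exponential backoff plus jitter on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send()
        except Exception:
            # In real code, catch only timeouts and 5xx-style errors.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")


# Demo with a stub that fails twice, then succeeds (no real network call).
state = {"calls": 0}


def flaky_request() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("simulated upstream hiccup")
    return "ok"


result = call_with_retries(flaky_request, sleep=lambda seconds: None)
print(result, state["calls"])  # prints "ok 3"
```

The jitter keeps many clients from retrying in lockstep after a route change. Retries cost tokens if the failed attempt was billed, so keep `attempts` small.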
Not recommended for
- Regulated data
- Customer-facing production where uptime guarantees matter
- Workloads that need a locked-down provider, higher-precision FP16, caching, or any other special customization
Roadmap (rough order)
- Reserved-capacity lanes for heavy users (no SLA, just better odds).
- Batch/async endpoints for even deeper discounts.
- Maybe—big maybe—an opt-in "enterprise tier" once we can afford audits.