Skip to main content

About CheapTokenRouter

What we are

  • A real-time router that hunts the lowest-cost GPU seats we can rent, then relays your request to whichever provider is cheapest at that moment.
  • Up-front about the trade-offs: FP8 by default, zero certifications, zero SLAs, and some upstream providers may log or train on your data.
  • One-click OpenAI-compatible endpoint that works as soon as you top up your account with USD.
  • Extremely fast about adding new frontier open source models. Usually same day.

What we are not

  • SOC 2, GDPR, HIPAA, or ISO-anything.
  • A walled garden that logs everything you send. (We ourselves log only your raw tokens usage count, but we can't speak for every provider we route to.)
  • A forever-stable service—prices, routes, and even upstream providers can change overnight.

Tips for using us

Best for
  • Hobby projects
  • General vibe coding
  • MVPs
  • Internal tools
  • Workloads that can tolerate occasional hiccups
Not recommended for
  • Regulated data
  • Customer-facing production where uptime guarantees matter
  • If you need a locked-down provider or full-precision FP16, caching, or any other special customizations

Roadmap (rough order)

  • Reserved-capacity lanes for heavy users (no SLA, just better odds).
  • Batch/async endpoints for even deeper discounts.
  • Maybe—big maybe—an opt-in "enterprise tier" once we can afford audits.

Talk to the humans who run the boxes

Contact us