Pricing

Pay for routing. Save on inference.

A flat platform fee on top of provider-passthrough costs. The savings from routing typically exceed the fee by 10–20×.

For teams shipping their first AI product.

$99/month

For teams running AI in production.

$2,500/month

For regulated, high-volume deployments.

Custom

Compare

All plans, side-by-side.

Feature	Starter	Growth	Enterprise
Routed requests / mo	100k	5M	Unlimited
Models supported	100+	100+	100+ & custom
Semantic caching	✓	✓	✓
Continuous evals	—	✓	✓
SSO / SCIM / RBAC	—	✓	✓
Trace retention	7 days	90 days	365+ days
Data residency	US	EU + US	Any region
VPC / on-prem	—	—	✓
Compliance packs (HIPAA, FedRAMP)	—	—	✓
Support	Community	Slack, 4h	24×7, 30min
Uptime SLA	99.9%	99.95%	99.99%

Questions

Yes — provider token costs pass through at-cost. Inferly's fee is the platform charge. In practice the routing savings exceed our fee by 10–20×.

Yes. 1,000 routed requests per day, free, forever — no credit card. Plenty for prototyping and small side projects.

Overages are billed at $0.0001 per routed request beyond plan caps. You can set hard caps to block instead.

15% off Growth on annual, 20–25% off Enterprise on multi-year. Contact sales.

Yes — 60-day paid pilots at 50% list, convertible to annual.

No procurement cycle. Drop in. See the routing savings on your actual workload.