OpenAI-Compatible Providers
AI Cost Firewall supports OpenAI-compatible chat and embedding endpoints while keeping its flat configuration model.
Supported patterns
| Provider | Typical role |
|---|---|
| OpenAI | Chat and embeddings |
| Ollama | Local chat and embeddings |
| LM Studio | Local desktop inference |
| vLLM | Self-hosted GPU inference |
| LiteLLM | Aggregation/proxy layer |
| OpenRouter | Multi-provider upstream |
The chat upstream and the embedding provider are each configured with the same three directives:

upstream_provider openai_compatible;
upstream_base_url <base-url>;
upstream_api_key <key-or-placeholder>;
embedding_provider openai_compatible;
embedding_base_url <base-url>;
embedding_api_key <key-or-placeholder>;
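
As a concrete illustration, the template might be filled in for OpenAI as follows. This is a minimal sketch: `<your-openai-key>` is a stand-in for a real key, and using the same key for chat and embeddings is an assumption, not a requirement stated here.

```text
# Sketch: OpenAI for both chat and embeddings.
# <your-openai-key> is a placeholder, not a real value.
upstream_provider openai_compatible;
upstream_base_url https://api.openai.com/v1;
upstream_api_key <your-openai-key>;
embedding_provider openai_compatible;
embedding_base_url https://api.openai.com/v1;
embedding_api_key <your-openai-key>;
```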
The base URL may be either the provider root URL or its /v1 base path:
https://api.openai.com
https://api.openai.com/v1
http://ollama:11434
http://ollama:11434/v1
http://lmstudio:1234/v1
http://vllm:8000/v1
http://litellm:4000/v1
Do not include the endpoint path (for example /chat/completions) in the base URL:
# Wrong
upstream_base_url http://ollama:11434/v1/chat/completions;
# Correct
upstream_base_url http://ollama:11434/v1;
For local providers without authentication, use a placeholder key:
upstream_api_key dummy;
embedding_api_key dummy;
Accepted placeholder values are dummy, none, null, and -.
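
Putting the base URL and placeholder rules together, a fully local Ollama setup could look roughly like this. The hostname `ollama` is taken from the URLs listed earlier and assumes that name resolves in your environment (for example as a container service name).

```text
# Sketch: local Ollama for chat and embeddings, no authentication.
# "dummy" is one of the accepted placeholder key values.
upstream_provider openai_compatible;
upstream_base_url http://ollama:11434/v1;
upstream_api_key dummy;
embedding_provider openai_compatible;
embedding_base_url http://ollama:11434/v1;
embedding_api_key dummy;
```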
Deployment examples
Runnable examples are available under:
deploy/examples/
Recommended examples:
openai-cloud/
local-ollama/
hybrid-openai-local-embeddings/
openrouter/
local-full-stack/
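
For orientation, a hybrid configuration might resemble the sketch below. It assumes that "hybrid" means OpenAI for chat and a local Ollama instance for embeddings, which is inferred from the directory name only; the files under deploy/examples/ are authoritative and may differ. `<your-openai-key>` is a placeholder.

```text
# Sketch (assumed layout): cloud chat upstream, local embeddings.
upstream_provider openai_compatible;
upstream_base_url https://api.openai.com/v1;
upstream_api_key <your-openai-key>;
embedding_provider openai_compatible;
embedding_base_url http://ollama:11434/v1;
embedding_api_key dummy;
```

Because the chat and embedding settings are independent directives, either side can be pointed at a different provider without touching the other.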