Configuration Directives

Core

listen_addr 0.0.0.0:8080;
redis_url redis://redis:6379;

Upstream

upstream_base_url https://api.openai.com;
upstream_api_key sk-xxxx;

Placeholder values that mean "no authentication":

dummy
none
null
-
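
As an illustration of how these placeholders might be interpreted, here is a minimal sketch. The helper names (effective_api_key, auth_headers), the case-insensitive matching, the treatment of an empty value as no-auth, and the Bearer header are all assumptions made for this example, not documented behavior.

```python
from typing import Optional

# Placeholder values taken from the list above.
NO_AUTH_PLACEHOLDERS = {"dummy", "none", "null", "-"}


def effective_api_key(raw: Optional[str]) -> Optional[str]:
    """Return the key to use, or None when the value means "no auth".

    Assumption: matching ignores case and surrounding whitespace, and an
    empty value is also treated as no-auth.
    """
    if raw is None:
        return None
    value = raw.strip()
    if not value or value.lower() in NO_AUTH_PLACEHOLDERS:
        return None
    return value


def auth_headers(raw: Optional[str]) -> dict:
    """Build request headers; omit Authorization for placeholder keys."""
    key = effective_api_key(raw)
    return {"Authorization": f"Bearer {key}"} if key else {}
```

With this sketch, auth_headers("sk-xxxx") yields an Authorization header, while auth_headers("none") yields no headers at all.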

Embeddings

embedding_base_url https://api.openai.com;
embedding_api_key sk-xxxx;
embedding_model text-embedding-3-small;
embedding_price 0.020;

Qdrant

qdrant_url http://qdrant:6334;
qdrant_api_key your-qdrant-key;
qdrant_collection aif_semantic_cache;
qdrant_vector_size 1536;

qdrant_vector_size must match the output dimension of the embedding model (1536 for text-embedding-3-small by default). Existing collections are validated at startup.
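
The kind of consistency check described above can be sketched as follows. This is only an illustration: check_vector_size and the KNOWN_DIMENSIONS table are hypothetical, and the sketch takes the existing collection's size as a plain argument instead of querying Qdrant.

```python
from typing import List, Optional

# Default output dimensions of two OpenAI embedding models.
KNOWN_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_vector_size(
    embedding_model: str,
    configured_size: int,
    existing_collection_size: Optional[int] = None,
) -> List[str]:
    """Return a list of mismatch messages; an empty list means consistent."""
    errors = []
    expected = KNOWN_DIMENSIONS.get(embedding_model)
    if expected is not None and expected != configured_size:
        errors.append(
            f"qdrant_vector_size is {configured_size}, but "
            f"{embedding_model} produces {expected}-dimensional vectors"
        )
    if (
        existing_collection_size is not None
        and existing_collection_size != configured_size
    ):
        errors.append(
            f"existing collection uses size {existing_collection_size}, "
            f"but qdrant_vector_size is {configured_size}"
        )
    return errors
```

A startup routine could refuse to start whenever the returned list is non-empty.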

Cache lifecycle

cache_ttl_seconds 86400;
exact_cache_ttl_seconds 86400;
semantic_cache_retention_seconds 604800;

Request behavior

request_timeout_seconds 120;
max_request_body_bytes 1M;

Semantic cache

semantic_cache_enabled true;
semantic_cache_fail_open true;
semantic_similarity_threshold 0.92;

semantic_cache_fail_open applies to runtime lookup failures only, not startup initialization.
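
A minimal sketch of what fail-open could look like at lookup time. It assumes that "fail open" means a failed lookup is treated as a cache miss so the request can continue upstream, and that a hit requires a score greater than or equal to the threshold. SemanticCacheError and lookup_with_policy are hypothetical names.

```python
from typing import Callable, Optional, Tuple


class SemanticCacheError(Exception):
    """Hypothetical error for a semantic lookup that fails at runtime."""


def lookup_with_policy(
    lookup: Callable[[], Optional[Tuple[float, str]]],
    threshold: float = 0.92,
    fail_open: bool = True,
) -> Optional[str]:
    """Return a cached response on a hit, or None on a miss.

    `lookup` returns (similarity_score, cached_response) for the nearest
    stored entry, or None when nothing is stored.
    """
    try:
        result = lookup()
    except SemanticCacheError:
        if fail_open:
            return None  # treat the failure as a miss; caller goes upstream
        raise  # fail closed: surface the error to the caller
    if result is None:
        return None
    score, response = result
    # Assumption: the threshold comparison is inclusive.
    return response if score >= threshold else None
```

Startup initialization is outside this function, which mirrors the note above: a failure there is not softened by fail_open.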

Model pricing

model_price gpt-4o-mini-2024-07-18 0.15 0.60;
allow_unknown_models_pass_through false;
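
To show how the two price fields might be applied, here is a rough sketch. It assumes the first number is the input price and the second the output price, both in USD per 1M tokens (this matches the public list price of gpt-4o-mini, but the unit is not stated here). It also assumes that unknown models are rejected unless pass-through is allowed. estimate_cost and MODEL_PRICES are hypothetical names.

```python
from typing import Optional

TOKENS_PER_PRICE_UNIT = 1_000_000  # assumption: prices are per 1M tokens

# model name -> (input price, output price), mirroring the model_price line
MODEL_PRICES = {
    "gpt-4o-mini-2024-07-18": (0.15, 0.60),
}


def estimate_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    allow_unknown: bool = False,
) -> Optional[float]:
    """Estimate the request cost in USD, or None if the price is unknown."""
    price = MODEL_PRICES.get(model)
    if price is None:
        if allow_unknown:
            return None  # request passes through with no cost estimate
        raise ValueError(f"no model_price configured for {model!r}")
    input_price, output_price = price
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT
```

Under these assumptions, a request with 1,000,000 input tokens and 1,000,000 output tokens on gpt-4o-mini-2024-07-18 is estimated at 0.15 + 0.60 = 0.75 USD.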