Configuration Directives
Core
listen_addr 0.0.0.0:8080;
redis_url redis://redis:6379;
Upstream
upstream_base_url https://api.openai.com;
upstream_api_key sk-xxxx;
Placeholder values that indicate no authentication:
dummy
none
null
-
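A minimal sketch of how a client-side check for these placeholders might look. The helper name is_no_auth_placeholder is hypothetical, and the whitespace-trimmed, case-insensitive comparison is an assumption; the list above does not say how the values are matched.

```python
# Hypothetical helper for illustration only.
# Assumption: matching ignores surrounding whitespace and letter case.
NO_AUTH_PLACEHOLDERS = {"dummy", "none", "null", "-"}


def is_no_auth_placeholder(api_key: str) -> bool:
    """Return True if api_key is one of the documented placeholder values."""
    return api_key.strip().lower() in NO_AUTH_PLACEHOLDERS
```

A real key such as sk-xxxx is not in the set, so it would be treated as a credential.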
Embeddings
embedding_base_url https://api.openai.com;
embedding_api_key sk-xxxx;
embedding_model text-embedding-3-small;
embedding_price 0.020;
Qdrant
qdrant_url http://qdrant:6334;
qdrant_api_key your-qdrant-key;
qdrant_collection aif_semantic_cache;
qdrant_vector_size 1536;
qdrant_vector_size must match the output dimension of the embedding model (1536 for text-embedding-3-small). Existing collections are validated at startup.
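The constraint can be illustrated with a small pre-flight check run before deploying a config. This is a sketch, not the startup validation itself: the function name check_vector_size and the error message are hypothetical, and the table only lists the default output dimensions of three OpenAI embedding models.

```python
# Default output dimensions of some OpenAI embedding models.
DEFAULT_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_vector_size(embedding_model: str, qdrant_vector_size: int) -> None:
    """Raise ValueError if the configured size contradicts a known model default."""
    expected = DEFAULT_DIMENSIONS.get(embedding_model)
    if expected is None:
        # Model not in the table: this sketch has nothing to compare against.
        return
    if expected != qdrant_vector_size:
        raise ValueError(
            f"{embedding_model} produces {expected}-dimensional vectors, "
            f"but qdrant_vector_size is {qdrant_vector_size}"
        )
```

With the values shown above (text-embedding-3-small and 1536) the check passes silently.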
Cache lifecycle
cache_ttl_seconds 86400;
exact_cache_ttl_seconds 86400;
semantic_cache_retention_seconds 604800;
Request behavior
request_timeout_seconds 120;
max_request_body_bytes 1M;
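The value 1M uses a size suffix. The sketch below shows one plausible reading, assuming nginx-style binary multiples (K = 1024, M = 1024 * 1024, G = 1024 * 1024 * 1024) and case-insensitive suffixes. Both the function name parse_size and the suffix rules are assumptions; the actual parser may differ.

```python
import re

# Assumption: suffixes are binary multiples, as in nginx size values.
_MULTIPLIERS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as '1M' or '512' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"invalid size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[suffix.lower()]
```

Under these assumptions, 1M corresponds to 1048576 bytes.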
Semantic cache
semantic_cache_enabled true;
semantic_cache_fail_open true;
semantic_similarity_threshold 0.92;
semantic_cache_fail_open applies only to lookup failures at runtime. It does not apply to failures during startup initialization.
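The interaction between the threshold and fail-open can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the function semantic_lookup and its search callback are hypothetical, "fail open" is interpreted as treating a failed lookup as a cache miss, and a score equal to the threshold is assumed to count as a hit.

```python
from typing import Callable, Optional, Tuple


def semantic_lookup(
    search: Callable[[], Optional[Tuple[str, float]]],
    threshold: float = 0.92,
    fail_open: bool = True,
) -> Optional[str]:
    """Return a cached response, or None to signal a cache miss."""
    try:
        hit = search()  # hypothetical vector search: (response, score) or None
    except Exception:
        if fail_open:
            # Assumption: fail-open degrades to a miss so the request
            # can still be forwarded upstream.
            return None
        raise
    if hit is None:
        return None
    response, score = hit
    # Assumption: scores at or above the threshold are accepted.
    return response if score >= threshold else None
```

With fail_open set to False, the same lookup error would propagate to the caller instead of being treated as a miss.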
Model pricing
model_price gpt-4o-mini-2024-07-18 0.15 0.60;
allow_unknown_models_pass_through false;
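A rough sketch of how the two model_price numbers might feed a cost estimate. It assumes the first number is the input price and the second the output price, both in USD per 1M tokens, which is consistent with the published pricing for gpt-4o-mini but is not stated in the directive itself. The function estimate_cost and its handling of unknown models are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Assumption: (input price, output price) in USD per 1M tokens.
MODEL_PRICES: Dict[str, Tuple[float, float]] = {
    "gpt-4o-mini-2024-07-18": (0.15, 0.60),
}


def estimate_cost(
    model: str,
    prompt_tokens: int,
    completion_tokens: int,
    allow_unknown: bool = False,
) -> Optional[float]:
    """Return the estimated cost in USD, or None for an allowed unknown model."""
    prices = MODEL_PRICES.get(model)
    if prices is None:
        if allow_unknown:
            return None  # passed through without a price
        raise KeyError(f"no model_price configured for {model!r}")
    input_price, output_price = prices
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / 1_000_000
```

Under these assumptions, a request with 1,000 prompt tokens and 500 completion tokens would cost roughly 0.00045 USD.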