AI Cost Firewall v0.2.2
VCAL Privacy Guard Orchestration Preview
AI Cost Firewall v0.2.2 introduces the first orchestration hooks for VCAL Privacy Guard.
This release extends the gateway beyond cost control and cache optimization by allowing sensitive text to be anonymized before it is sent to an upstream LLM provider, then restored before the response is returned to the client.
The goal of v0.2.2 is to provide a practical integration path for privacy-aware LLM gateway deployments.
The main improvements in v0.2.2 are:
- optional VCAL Privacy Guard integration
- pre-upstream anonymization flow
- post-upstream restoration flow
- mapping-based placeholder restoration
- configurable fail-open or fail-closed guard behavior
- API-key protected guard calls
- stream rejection when privacy restoration is enabled
- safer handling of non-string message content
- configuration and deployment examples for integrated evaluation
Release Positioning
v0.2.2 positions AI Cost Firewall as the orchestrator for the VCAL guard layer.
The gateway remains OpenAI-compatible for client applications, while Privacy Guard can be placed behind AI Cost Firewall as an internal service.
Typical use cases include:
- anonymizing emails, IP addresses, phone numbers, API keys, bearer tokens, JWTs, private keys, and similar sensitive values
- sending placeholder-based prompts to upstream LLM providers
- restoring placeholders in assistant responses before returning them to the client
- evaluating privacy-aware LLM traffic patterns
- preparing the foundation for future VCAL Security Guard, Compliance, Audit, and advanced Policy Guard integrations
Privacy Guard Flow
v0.2.2 supports the following high-level flow:
Client
-> AI Cost Firewall
-> VCAL Privacy Guard scan/anonymize
-> Upstream OpenAI-compatible LLM
-> VCAL Privacy Guard restore
-> Client
Example original user content:
Analyze login from 185.23.10.5 by john@example.com
Example content sent upstream:
Analyze login from [IP_1] by [EMAIL_1]
Example upstream response:
[EMAIL_1] logged in from [IP_1]
Example restored response:
john@example.com logged in from 185.23.10.5
This mapping-based approach helps reduce sensitive-data exposure to the upstream provider while preserving useful context for the model.
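The example above can be reproduced with a minimal sketch of mapping-based anonymization and restoration. The regular expressions and the `anonymize` and `restore` helpers are illustrative assumptions, not the Privacy Guard implementation; a real guard would cover many more kinds of sensitive values and store mappings server-side.

```python
import re

# Hypothetical detectors for illustration only; a real guard covers far more kinds.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def anonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected values with placeholders and return the mapping."""
    mapping: dict[str, str] = {}    # placeholder -> original value
    seen: dict[str, str] = {}       # original value -> placeholder
    counters: dict[str, int] = {}   # per-kind placeholder counter

    for kind, pattern in PATTERNS.items():
        def substitute(match: re.Match) -> str:
            value = match.group(0)
            if value not in seen:
                counters[kind] = counters.get(kind, 0) + 1
                placeholder = f"[{kind}_{counters[kind]}]"
                seen[value] = placeholder
                mapping[placeholder] = value
            # Repeated values reuse the same placeholder.
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Substitute every known placeholder back to its original value."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


sent_upstream, mapping = anonymize("Analyze login from 185.23.10.5 by john@example.com")
restored = restore("[EMAIL_1] logged in from [IP_1]", mapping)
```

Because each placeholder ends with a closing bracket, `[IP_1]` can never be confused with `[IP_10]` during restoration.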
New Privacy Guard Configuration
v0.2.2 adds optional Privacy Guard directives.
privacy_guard_enabled true;
privacy_guard_url http://vcal-privacy-guard:8090;
privacy_guard_api_key your-shared-api-key;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
privacy_guard_tenant_id default;
privacy_guard_policy_id default;
privacy_guard_timeout_seconds 10;
guard_fail_open false;
Directive Summary
| Directive | Purpose |
|---|---|
| privacy_guard_enabled | Enables or disables Privacy Guard orchestration. |
| privacy_guard_url | Internal URL of the VCAL Privacy Guard service. |
| privacy_guard_api_key | Shared API key used by AI Cost Firewall when calling Privacy Guard. |
| privacy_guard_mode | Guard mode, typically detect_only, redact, or anonymize. |
| privacy_guard_restore_enabled | Restores placeholders in assistant responses before returning them to the client. |
| privacy_guard_tenant_id | Optional tenant identifier passed to Privacy Guard. |
| privacy_guard_policy_id | Optional policy identifier passed to Privacy Guard. |
| privacy_guard_timeout_seconds | Timeout for Privacy Guard scan and restore calls. |
| guard_fail_open | Controls whether guard failures are skipped or returned as request failures. |
Recommended Privacy Defaults
For privacy-sensitive deployments, the recommended evaluation defaults are:
privacy_guard_enabled true;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
guard_fail_open false;
This means:
- requests are anonymized before upstream forwarding
- assistant responses are restored before returning to the client
- guard failure blocks the request instead of silently forwarding sensitive data
For development-only environments, guard_fail_open true may be useful during early integration, but it is not recommended for privacy-sensitive production use.
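The difference between the two failure modes can be sketched as follows. The `scan_or_fail` function, the `GuardUnavailable` exception, and the injected `scan` callable are hypothetical names used for illustration; they are not part of the AI Cost Firewall API.

```python
from typing import Callable


class GuardUnavailable(Exception):
    """Raised when the guard call fails and the gateway is configured fail-closed."""


def scan_or_fail(text: str, scan: Callable[[str], str], fail_open: bool) -> str:
    """Return the scanned text, applying the guard_fail_open policy on failure."""
    try:
        return scan(text)
    except Exception as exc:
        if fail_open:
            # Fail-open: the original text is forwarded unchanged, so sensitive
            # values may reach the upstream provider.
            return text
        # Fail-closed: block the request instead of forwarding unscanned text.
        raise GuardUnavailable("privacy guard scan failed") from exc
```

With `fail_open=False`, a guard timeout surfaces to the client as a request failure; with `fail_open=True`, the same timeout is silently skipped.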
Docker Compose Environment Example
environment:
  AIF_PRIVACY_GUARD_ENABLED: "true"
  AIF_PRIVACY_GUARD_URL: "http://vcal-privacy-guard:8090"
  AIF_PRIVACY_GUARD_MODE: "anonymize"
  AIF_PRIVACY_GUARD_RESTORE_ENABLED: "true"
  AIF_PRIVACY_GUARD_API_KEY: "change-me"
  AIF_GUARD_FAIL_OPEN: "false"
Both AI Cost Firewall and VCAL Privacy Guard should run on a shared private Docker network.
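Before starting the containers, it may help to review the environment for risky combinations. The sketch below is a hypothetical pre-deployment check, not a shipped tool: the `review_guard_env` function is invented for illustration, and treating only the literal string "true" as enabled is an assumption about how the gateway parses booleans.

```python
def review_guard_env(env: dict[str, str]) -> list[str]:
    """Return human-readable warnings for a Privacy Guard environment block."""

    def flag(name: str) -> bool:
        # Assumption: only the string "true" (case-insensitive) enables a flag.
        return env.get(name, "").strip().lower() == "true"

    if not flag("AIF_PRIVACY_GUARD_ENABLED"):
        return ["AIF_PRIVACY_GUARD_ENABLED is not true; the guard path is inactive"]

    warnings: list[str] = []
    if not env.get("AIF_PRIVACY_GUARD_URL"):
        warnings.append("AIF_PRIVACY_GUARD_URL is not set")
    if env.get("AIF_PRIVACY_GUARD_API_KEY", "") in ("", "change-me"):
        warnings.append("AIF_PRIVACY_GUARD_API_KEY is empty or still a placeholder")
    if (flag("AIF_PRIVACY_GUARD_RESTORE_ENABLED")
            and env.get("AIF_PRIVACY_GUARD_MODE") != "anonymize"):
        warnings.append("restoration is enabled but AIF_PRIVACY_GUARD_MODE is not anonymize")
    if flag("AIF_GUARD_FAIL_OPEN"):
        warnings.append("AIF_GUARD_FAIL_OPEN is true; guard failures will be skipped")
    return warnings
```

Run against the example block above, this check would flag only the placeholder API key.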
Non-String Message Content
v0.2.2 preserves non-string message content instead of attempting unsafe transformations.
When Privacy Guard is enabled, AI Cost Firewall collects string message content for scanning and restoration. Non-string content is preserved unchanged.
This behavior is important for compatibility with OpenAI-compatible message payloads that may include structured or multimodal content.
Recommended operational behavior:
- preserve non-string content unchanged
- emit a warning or metric when non-string content is skipped by the guard path
- avoid converting structured content to strings implicitly
- keep request ordering and roles stable
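One way to follow these recommendations is to record the index of every string message, scan only those, and write results back by index. The helpers below are a hypothetical sketch of that idea; the function names and the skipped-message counter are assumptions, not the gateway's internal API.

```python
from typing import Any


def collect_string_content(
    messages: list[dict[str, Any]],
) -> tuple[list[int], list[str], int]:
    """Return (indexes, texts, skipped) for messages whose content is a plain string."""
    indexes: list[int] = []
    texts: list[str] = []
    skipped = 0
    for index, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            indexes.append(index)
            texts.append(content)
        else:
            # Structured or multimodal content is left alone; count it so the
            # caller can emit a warning or metric.
            skipped += 1
    return indexes, texts, skipped


def apply_scanned(
    messages: list[dict[str, Any]],
    indexes: list[int],
    scanned: list[str],
) -> list[dict[str, Any]]:
    """Write scanned texts back by index, keeping order, roles, and other content."""
    if len(indexes) != len(scanned):
        raise ValueError("scanned message count does not match collected count")
    result = [dict(message) for message in messages]  # shallow copies
    for index, text in zip(indexes, scanned):
        result[index]["content"] = text
    return result
```

Writing back by index rather than rebuilding the list keeps message ordering and roles stable even when some messages are skipped.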
Streaming Behavior
When Privacy Guard restoration is enabled, streaming responses should be rejected.
Recommended behavior:
stream: true -> HTTP 422
Reason:
- placeholder restoration requires the complete assistant message
- partial streaming chunks may contain incomplete placeholders
- restoring placeholders safely requires full-message context
Use non-streaming chat completions for Privacy Guard evaluation.
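The sketch below illustrates both the rejection rule and the reason behind it. The `check_stream_allowed` helper and its error message are hypothetical; the chunk boundaries are invented to show a placeholder split across two streaming chunks.

```python
def check_stream_allowed(payload: dict, restore_enabled: bool) -> tuple[int, str]:
    """Return (status, message); reject streaming when restoration is enabled."""
    if restore_enabled and payload.get("stream") is True:
        return 422, "streaming is not supported when privacy restoration is enabled"
    return 200, "ok"


def restore(text: str, mapping: dict[str, str]) -> str:
    """Substitute every known placeholder back to its original value."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


mapping = {"[EMAIL_1]": "john@example.com"}
# A placeholder that happens to be split across two streamed chunks.
chunks = ["Hello [EMA", "IL_1], welcome"]

# Restoring chunk by chunk misses the split placeholder entirely.
per_chunk = "".join(restore(chunk, mapping) for chunk in chunks)
# Restoring the complete message works as intended.
whole = restore("".join(chunks), mapping)
```

Here `per_chunk` still contains the raw `[EMAIL_1]` placeholder, while `whole` contains the restored address, which is why the complete assistant message is needed.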
Contract Validation
v0.2.2 introduces stricter expectations for the AI Cost Firewall ↔ Privacy Guard contract.
Recommended validation checks:
- scanned message count matches collected string message count
- restored message count matches collected assistant message count
- message roles remain stable
- message order remains unchanged
- mapping IDs are preserved across scan and restore
- failed guard responses respect guard_fail_open
These checks help avoid accidental data leakage or incorrect placeholder restoration.
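Several of these checks can be expressed as a small validation function. This is a sketch under stated assumptions: the message shape follows OpenAI-compatible chat payloads, while `validate_guard_response` and the idea of passing mapping IDs as plain strings are hypothetical and do not describe the actual Privacy Guard response format.

```python
def validate_guard_response(
    sent: list[dict],
    returned: list[dict],
    scan_mapping_id: str,
    restore_mapping_id: str,
) -> list[str]:
    """Return a list of contract violations; an empty list means no problem was found."""
    problems: list[str] = []

    if len(sent) != len(returned):
        problems.append(
            f"message count changed: sent {len(sent)}, got {len(returned)}"
        )

    # Comparing roles position by position also catches most reordering.
    for index, (before, after) in enumerate(zip(sent, returned)):
        if before.get("role") != after.get("role"):
            problems.append(
                f"message {index}: role changed from "
                f"{before.get('role')!r} to {after.get('role')!r}"
            )

    if scan_mapping_id != restore_mapping_id:
        problems.append("mapping ID differs between scan and restore")

    return problems
```

A non-empty result would then be handled according to guard_fail_open, in the same way as any other guard failure.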
Observability
v0.2.2 deployments should monitor both AI Cost Firewall and VCAL Privacy Guard metrics.
Recommended AI Cost Firewall checks:
curl http://localhost:8080/healthz
curl http://localhost:8080/readyz
curl http://localhost:8080/version
curl http://localhost:8080/metrics
Recommended Privacy Guard checks:
curl http://localhost:8090/healthz
curl http://localhost:8090/readyz
curl http://localhost:8090/version
curl http://localhost:8090/metrics
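The same checks can be scripted for repeated use. The sketch below probes the endpoints listed above for both services; the `probe` and `http_status` helpers are hypothetical, and the assumption that a healthy endpoint answers with HTTP 200 is not confirmed by the release notes.

```python
import urllib.request
from typing import Callable

# Base URLs taken from the curl examples; adjust for your deployment.
SERVICES = {
    "ai-cost-firewall": "http://localhost:8080",
    "vcal-privacy-guard": "http://localhost:8090",
}
ENDPOINTS = ("/healthz", "/readyz", "/version", "/metrics")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(fetch: Callable[[str], int] = http_status) -> dict[str, bool]:
    """Return a map of URL -> True when the endpoint answered with HTTP 200."""
    results: dict[str, bool] = {}
    for base in SERVICES.values():
        for path in ENDPOINTS:
            url = base + path
            try:
                results[url] = fetch(url) == 200
            except OSError:
                # Covers connection errors, timeouts, and urllib HTTP errors.
                results[url] = False
    return results
```

Passing a custom `fetch` callable makes the probe easy to exercise without running services.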
Recommended Privacy Guard metrics include:
- scan request rate
- restore request rate
- findings by kind
- actions by mode
- active mappings
- mapping creation and expiration
- scan latency
- restore latency
- authentication failures
Deployment Checklist
Use this checklist for an integrated AI Cost Firewall + VCAL Privacy Guard deployment:
- Start the AI Cost Firewall stack and confirm the default services are healthy.
- Start VCAL Privacy Guard on the shared AI Cost Firewall Docker network.
- Configure VCAL Privacy Guard with an API key.
- Configure AI Cost Firewall with matching Privacy Guard settings.
- Recreate containers after configuration or environment changes.
- Send a non-streaming request containing an email and IP address.
- Confirm the upstream receives placeholders rather than original sensitive values.
- Confirm the final client response is restored.
- Send a stream: true request and confirm it is rejected when restoration is enabled.
- Check AI Cost Firewall and Privacy Guard metrics.
- Review Grafana dashboards for scan, restore, finding, mapping, and auth-failure visibility.
Upgrade Notes
When upgrading from v0.2.1 to v0.2.2:
- Keep existing cache and gateway settings.
- Add Privacy Guard directives only if you are evaluating the privacy orchestration path.
- Ensure AI Cost Firewall and VCAL Privacy Guard share a private network.
- Configure matching API keys on both services.
- Prefer privacy_guard_mode anonymize for placeholder-based restoration.
- Prefer guard_fail_open false for privacy-sensitive tests.
- Recreate containers after changing config files or environment variables.
- Use non-streaming requests when response restoration is enabled.
- Validate metrics from both services after sending test traffic.
Known Non-Goals for v0.2.2
The following items remain outside the v0.2.2 scope:
- full enterprise policy orchestration
- full VCAL Security Guard integration
- compliance-report generation
- audit-export pipeline
- administrative UI
- multi-tenant management UI
- native provider-specific configuration blocks
- safe placeholder restoration for streaming chunks
Summary
AI Cost Firewall v0.2.2 adds the first practical orchestration path for VCAL Privacy Guard.
It keeps the OpenAI-compatible gateway model from v0.2.0 and the operational controls from v0.2.1, while adding privacy-aware request and response handling.
This release is recommended for controlled evaluations where teams want to test:
- anonymization before upstream LLM calls
- placeholder-based response restoration
- fail-closed guard behavior
- Privacy Guard metrics and dashboards
- the foundation for a broader VCAL guard ecosystem