AI Cost Firewall v0.2.2

VCAL Privacy Guard Orchestration Preview

AI Cost Firewall v0.2.2 introduces the first orchestration hooks for VCAL Privacy Guard.

This release extends the gateway beyond cost control and cache optimization by allowing sensitive text to be anonymized before it is sent to an upstream LLM provider, then restored before the response is returned to the client.

The goal of v0.2.2 is to provide a practical integration path for privacy-aware LLM gateway deployments.

The main improvements in v0.2.2 are:

optional VCAL Privacy Guard integration
pre-upstream anonymization flow
post-upstream restoration flow
mapping-based placeholder restoration
configurable fail-open or fail-closed guard behavior
API-key protected guard calls
stream rejection when privacy restoration is enabled
safer handling of non-string message content
configuration and deployment examples for integrated evaluation

Release Positioning

v0.2.2 positions AI Cost Firewall as the orchestrator for the VCAL guard layer.

The gateway remains OpenAI-compatible for client applications, while Privacy Guard can be placed behind AI Cost Firewall as an internal service.

Typical use cases include:

anonymizing emails, IP addresses, phone numbers, API keys, bearer tokens, JWTs, private keys, and similar sensitive values
sending placeholder-based prompts to upstream LLM providers
restoring placeholders in assistant responses before returning them to the client
evaluating privacy-aware LLM traffic patterns
preparing the foundation for future VCAL Security Guard, Compliance, Audit, and advanced Policy Guard integrations

Privacy Guard Flow

v0.2.2 supports the following high-level flow:

Client
  -> AI Cost Firewall
  -> VCAL Privacy Guard scan/anonymize
  -> Upstream OpenAI-compatible LLM
  -> VCAL Privacy Guard restore
  -> Client

Example original user content:

Analyze login from 185.23.10.5 by john@example.com

Example content sent upstream:

Analyze login from [IP_1] by [EMAIL_1]

Example upstream response:

[EMAIL_1] logged in from [IP_1]

Example restored response:

john@example.com logged in from 185.23.10.5

This mapping-based approach helps reduce sensitive-data exposure to the upstream provider while preserving useful context for the model.
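The example above can be sketched in a few lines of Python. This is only an illustration of mapping-based placeholder substitution, not the Privacy Guard implementation: the two regular expressions and the function names `anonymize` and `restore` are assumptions made for this sketch, and a real detector set would be far broader.

```python
import re

# Hypothetical detectors for illustration only; the real Privacy Guard
# rule set is not described in these release notes.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def anonymize(text):
    """Replace detected values with numbered placeholders; return (text, mapping)."""
    mapping = {}  # placeholder -> original value
    reverse = {}  # original value -> placeholder, so repeated values reuse one placeholder
    for kind, pattern in PATTERNS.items():
        counter = 0

        def substitute(match):
            nonlocal counter
            value = match.group(0)
            if value not in reverse:
                counter += 1
                placeholder = f"[{kind}_{counter}]"
                reverse[value] = placeholder
                mapping[placeholder] = value
            return reverse[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text, mapping):
    """Substitute every known placeholder back to its original value."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text
```

Calling `anonymize` on the original user content yields the placeholder text plus a mapping, and passing that mapping to `restore` together with the upstream response reproduces the restored response shown above.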

New Privacy Guard Configuration

v0.2.2 adds optional Privacy Guard directives.

privacy_guard_enabled true;
privacy_guard_url http://vcal-privacy-guard:8090;
privacy_guard_api_key your-shared-api-key;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
privacy_guard_tenant_id default;
privacy_guard_policy_id default;
privacy_guard_timeout_seconds 10;
guard_fail_open false;

Directive Summary

Directive	Purpose
`privacy_guard_enabled`	Enables or disables Privacy Guard orchestration.
`privacy_guard_url`	Internal URL of the VCAL Privacy Guard service.
`privacy_guard_api_key`	Shared API key used by AI Cost Firewall when calling Privacy Guard.
`privacy_guard_mode`	Guard mode, typically `detect_only`, `redact`, or `anonymize`.
`privacy_guard_restore_enabled`	Restores placeholders in assistant responses before returning them to the client.
`privacy_guard_tenant_id`	Optional tenant identifier passed to Privacy Guard.
`privacy_guard_policy_id`	Optional policy identifier passed to Privacy Guard.
`privacy_guard_timeout_seconds`	Timeout for Privacy Guard scan and restore calls.
`guard_fail_open`	Controls guard failure handling: `true` skips the failed guard step and forwards the request, `false` returns the failure as a request error.

Recommended Privacy Defaults

For privacy-sensitive deployments, the recommended evaluation defaults are:

privacy_guard_enabled true;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
guard_fail_open false;

This means:

requests are anonymized before upstream forwarding
assistant responses are restored before returning to the client
guard failure blocks the request instead of silently forwarding sensitive data

For development-only environments, `guard_fail_open true` may be useful during early integration, but it is not recommended for privacy-sensitive production use.
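The difference between the two settings can be illustrated with a small sketch. The names `GuardError` and `apply_guard`, and the idea of passing the scan step in as a callable, are assumptions made for this example and do not describe the gateway's internal code.

```python
class GuardError(Exception):
    """Hypothetical error for a failed guard call (timeout, bad status, invalid reply)."""


def apply_guard(messages, scan, fail_open):
    """Run `scan` over the messages and apply the configured failure policy."""
    try:
        return scan(messages)
    except GuardError:
        if fail_open:
            # Fail-open: the original, un-anonymized messages continue upstream.
            return messages
        # Fail-closed: re-raise so the caller blocks the request.
        raise
```

With `fail_open=True` a guard outage is invisible to the client but sensitive values reach the upstream provider unchanged, which is why the fail-closed setting is the recommended default for privacy-sensitive tests.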

Docker Compose Environment Example

environment:
  AIF_PRIVACY_GUARD_ENABLED: "true"
  AIF_PRIVACY_GUARD_URL: "http://vcal-privacy-guard:8090"
  AIF_PRIVACY_GUARD_MODE: "anonymize"
  AIF_PRIVACY_GUARD_RESTORE_ENABLED: "true"
  AIF_PRIVACY_GUARD_API_KEY: "change-me"
  AIF_GUARD_FAIL_OPEN: "false"

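Both AI Cost Firewall and VCAL Privacy Guard should run on a shared private Docker network so that the firewall container can resolve the hostname used in the guard URL (vcal-privacy-guard in this example).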
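Before recreating containers, it can help to sanity-check the environment block. The following is a hypothetical pre-flight helper, not part of AI Cost Firewall: the function name `check_guard_env` and the specific warnings are assumptions, while the variable names are taken from the example above.

```python
def check_guard_env(env):
    """Return a list of warnings for a mapping of AIF_* environment variables."""
    warnings = []
    enabled = env.get("AIF_PRIVACY_GUARD_ENABLED", "").lower() == "true"
    if not enabled:
        return warnings  # nothing to check when the guard path is off

    if not env.get("AIF_PRIVACY_GUARD_URL"):
        warnings.append("AIF_PRIVACY_GUARD_URL is not set")

    if env.get("AIF_PRIVACY_GUARD_API_KEY", "") in ("", "change-me"):
        warnings.append("AIF_PRIVACY_GUARD_API_KEY is empty or still the placeholder")

    if env.get("AIF_GUARD_FAIL_OPEN", "").lower() == "true":
        warnings.append("AIF_GUARD_FAIL_OPEN is true; guard failures will not block requests")

    restore = env.get("AIF_PRIVACY_GUARD_RESTORE_ENABLED", "").lower() == "true"
    if restore and env.get("AIF_PRIVACY_GUARD_MODE") != "anonymize":
        warnings.append("restoration is enabled but the mode is not anonymize")

    return warnings
```

Run against the example values above, this sketch reports only the placeholder API key, which is the one setting that must be changed before evaluation.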

Non-String Message Content

v0.2.2 preserves non-string message content instead of attempting unsafe transformations.

When Privacy Guard is enabled, AI Cost Firewall collects string message content for scanning and restoration. Non-string content is preserved unchanged.

This behavior is important for compatibility with OpenAI-compatible message payloads that may include structured or multimodal content.

Recommended operational behavior:

preserve non-string content unchanged
emit a warning or metric when non-string content is skipped by the guard path
avoid converting structured content to strings implicitly
keep request ordering and roles stable
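One way to follow these recommendations is to collect string content by index and write scanned text back to the same positions. The sketch below assumes OpenAI-style message dictionaries with `role` and `content` keys; the helper names `split_for_guard` and `merge_scanned` are hypothetical.

```python
def split_for_guard(messages):
    """Return (indexes, texts) for messages whose content is a plain string."""
    indexes, texts = [], []
    for index, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            indexes.append(index)
            texts.append(content)
        # Non-string content (for example a list of multimodal parts) is skipped,
        # never converted to a string.
    return indexes, texts


def merge_scanned(messages, indexes, scanned_texts):
    """Write scanned texts back by position; roles, order, and other content stay as they were."""
    if len(indexes) != len(scanned_texts):
        raise ValueError("scanned message count does not match collected count")
    result = [dict(message) for message in messages]  # shallow copies, same order
    for index, text in zip(indexes, scanned_texts):
        result[index]["content"] = text
    return result
```

A production version would also emit the warning or metric mentioned above whenever a message is skipped.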

Streaming Behavior

When Privacy Guard restoration is enabled, streaming requests should be rejected.

Recommended behavior:

stream: true -> HTTP 422

Reason:

placeholder restoration requires the complete assistant message
partial streaming chunks may contain incomplete placeholders, for example [EMAIL_ in one chunk and 1] in the next
restoring placeholders safely requires full-message context

Use non-streaming chat completions for Privacy Guard evaluation.
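A minimal sketch of the rejection check is shown below. The function name and the shape of the error payload are assumptions for illustration; only the HTTP 422 status comes from the recommendation above.

```python
def reject_stream_if_restoring(body, restore_enabled):
    """Return a (status, payload) rejection for streaming requests, or None to proceed."""
    if restore_enabled and body.get("stream") is True:
        payload = {
            "error": {
                # Hypothetical error body; the real gateway response may differ.
                "message": "stream: true is not supported while Privacy Guard restoration is enabled",
                "type": "invalid_request_error",
            }
        }
        return 422, payload
    return None
```

Running the check before any guard or upstream call means a rejected request never leaves the gateway.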

Contract Validation

v0.2.2 introduces stricter expectations for the AI Cost Firewall ↔ Privacy Guard contract.

Recommended validation checks:

scanned message count matches collected string message count
restored message count matches collected assistant message count
message roles remain stable
message order remains unchanged
mapping IDs are preserved across scan and restore
failed guard responses respect guard_fail_open

These checks help avoid accidental data leakage or incorrect placeholder restoration.
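Some of these checks can be expressed as small pure functions. The sketch below assumes messages are dictionaries with a `role` key and that scan and restore calls each carry a mapping identifier; both the field layout and the function names are hypothetical, since the wire format is not specified in these notes.

```python
def validate_scan(collected, scanned):
    """Compare messages sent to the guard with the messages it returned; return a list of problems."""
    problems = []
    if len(scanned) != len(collected):
        problems.append(f"expected {len(collected)} scanned messages, got {len(scanned)}")
        return problems  # positions cannot be compared if the counts differ
    for index, (before, after) in enumerate(zip(collected, scanned)):
        if before["role"] != after["role"]:
            problems.append(f"role changed at index {index}")
    return problems


def validate_mapping_id(scan_mapping_id, restore_mapping_id):
    """Return a list of problems if restore is about to use a different mapping than scan produced."""
    if not scan_mapping_id:
        return ["scan response did not include a mapping id"]
    if scan_mapping_id != restore_mapping_id:
        return ["mapping id changed between scan and restore"]
    return []
```

A non-empty result from either function would be treated as a guard failure and handled according to `guard_fail_open`.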

Observability

v0.2.2 deployments should monitor both AI Cost Firewall and VCAL Privacy Guard metrics.

Recommended AI Cost Firewall checks:

curl http://localhost:8080/healthz
curl http://localhost:8080/readyz
curl http://localhost:8080/version
curl http://localhost:8080/metrics

Recommended Privacy Guard checks:

curl http://localhost:8090/healthz
curl http://localhost:8090/readyz
curl http://localhost:8090/version
curl http://localhost:8090/metrics
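For scripted checks, the same endpoints can be probed from Python using only the standard library. This is a convenience sketch under the assumption that both services are reachable on the localhost ports shown above; the helper names `check_urls` and `probe` are hypothetical.

```python
import urllib.error
import urllib.request

# Endpoint names taken from the curl commands above.
ENDPOINTS = ("healthz", "readyz", "version", "metrics")


def check_urls(base_url):
    """Build the four check URLs for one service."""
    base = base_url.rstrip("/")
    return [f"{base}/{name}" for name in ENDPOINTS]


def probe(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, or timeout
```

Looping over `check_urls("http://localhost:8080")` and `check_urls("http://localhost:8090")` and calling `probe` on each URL gives a quick pass or fail view of both services.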

Recommended Privacy Guard metrics include:

scan request rate
restore request rate
findings by kind
actions by mode
active mappings
mapping creation and expiration
scan latency
restore latency
authentication failures

Deployment Checklist

Use this checklist for an integrated AI Cost Firewall + VCAL Privacy Guard deployment:

Start the AI Cost Firewall stack and confirm the default services are healthy.
Start VCAL Privacy Guard on the shared AI Cost Firewall Docker network.
Configure VCAL Privacy Guard with an API key.
Configure AI Cost Firewall with matching Privacy Guard settings.
Recreate containers after configuration or environment changes.
Send a non-streaming request containing an email and IP address.
Confirm the upstream receives placeholders rather than original sensitive values.
Confirm the final client response is restored.
Send a stream: true request and confirm it is rejected when restoration is enabled.
Check AI Cost Firewall and Privacy Guard metrics.
Review Grafana dashboards for scan, restore, finding, mapping, and auth-failure visibility.

Upgrade Notes

When upgrading from v0.2.1 to v0.2.2:

Keep existing cache and gateway settings.
Add Privacy Guard directives only if you are evaluating the privacy orchestration path.
Ensure AI Cost Firewall and VCAL Privacy Guard share a private network.
Configure matching API keys on both services.
Prefer privacy_guard_mode anonymize for placeholder-based restoration.
Prefer guard_fail_open false for privacy-sensitive tests.
Recreate containers after changing config files or environment variables.
Use non-streaming requests when response restoration is enabled.
Validate metrics from both services after sending test traffic.

Known Non-Goals for v0.2.2

The following items remain outside the v0.2.2 scope:

full enterprise policy orchestration
full VCAL Security Guard integration
compliance-report generation
audit-export pipeline
administrative UI
multi-tenant management UI
native provider-specific configuration blocks
safe placeholder restoration for streaming chunks

Summary

AI Cost Firewall v0.2.2 adds the first practical orchestration path for VCAL Privacy Guard.

It keeps the OpenAI-compatible gateway model from v0.2.0 and the operational controls from v0.2.1, while adding privacy-aware request and response handling.

This release is recommended for controlled evaluations where teams want to test:

anonymization before upstream LLM calls
placeholder-based response restoration
fail-closed guard behavior
Privacy Guard metrics and dashboards
the foundation for a broader VCAL guard ecosystem
