Skip to main content

AI Cost Firewall v0.2.2

VCAL Privacy Guard Orchestration Preview

AI Cost Firewall v0.2.2 introduces the first orchestration hooks for VCAL Privacy Guard.

This release extends the gateway beyond cost control and cache optimization by allowing sensitive text to be anonymized before it is sent to an upstream LLM provider, then restored before the response is returned to the client.

The goal of v0.2.2 is to provide a practical integration path for privacy-aware LLM gateway deployments.

The main improvements in v0.2.2 are:

  • optional VCAL Privacy Guard integration
  • pre-upstream anonymization flow
  • post-upstream restoration flow
  • mapping-based placeholder restoration
  • configurable fail-open or fail-closed guard behavior
  • API-key protected guard calls
  • stream rejection when privacy restoration is enabled
  • safer handling of non-string message content
  • configuration and deployment examples for integrated evaluation

Release Positioning

v0.2.2 positions AI Cost Firewall as the orchestrator for the VCAL guard layer.

The gateway remains OpenAI-compatible for client applications, while Privacy Guard can be placed behind AI Cost Firewall as an internal service.

Typical use cases include:

  • anonymizing emails, IP addresses, phone numbers, API keys, bearer tokens, JWTs, private keys, and similar sensitive values
  • sending placeholder-based prompts to upstream LLM providers
  • restoring placeholders in assistant responses before returning them to the client
  • evaluating privacy-aware LLM traffic patterns
  • preparing the foundation for future VCAL Security Guard, Compliance, Audit, and advanced Policy Guard integrations

Privacy Guard Flow

v0.2.2 supports the following high-level flow:

Client
-> AI Cost Firewall
-> VCAL Privacy Guard scan/anonymize
-> Upstream OpenAI-compatible LLM
-> VCAL Privacy Guard restore
-> Client

Example original user content:

Analyze login from 185.23.10.5 by john@example.com

Example content sent upstream:

Analyze login from [IP_1] by [EMAIL_1]

Example upstream response:

[EMAIL_1] logged in from [IP_1]

Example restored response:

john@example.com logged in from 185.23.10.5

This mapping-based approach helps reduce sensitive-data exposure to the upstream provider while preserving useful context for the model.


New Privacy Guard Configuration

v0.2.2 adds optional Privacy Guard directives.

privacy_guard_enabled true;
privacy_guard_url http://vcal-privacy-guard:8090;
privacy_guard_api_key your-shared-api-key;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
privacy_guard_tenant_id default;
privacy_guard_policy_id default;
privacy_guard_timeout_seconds 10;
guard_fail_open false;

Directive Summary

DirectivePurpose
privacy_guard_enabledEnables or disables Privacy Guard orchestration.
privacy_guard_urlInternal URL of the VCAL Privacy Guard service.
privacy_guard_api_keyShared API key used by AI Cost Firewall when calling Privacy Guard.
privacy_guard_modeGuard mode, typically detect_only, redact, or anonymize.
privacy_guard_restore_enabledRestores placeholders in assistant responses before returning them to the client.
privacy_guard_tenant_idOptional tenant identifier passed to Privacy Guard.
privacy_guard_policy_idOptional policy identifier passed to Privacy Guard.
privacy_guard_timeout_secondsTimeout for Privacy Guard scan and restore calls.
guard_fail_openControls whether guard failures are skipped or returned as request failures.

For privacy-sensitive deployments, the recommended evaluation defaults are:

privacy_guard_enabled true;
privacy_guard_mode anonymize;
privacy_guard_restore_enabled true;
guard_fail_open false;

This means:

  • requests are anonymized before upstream forwarding
  • assistant responses are restored before returning to the client
  • guard failure blocks the request instead of silently forwarding sensitive data

For development-only environments, guard_fail_open true may be useful during early integration, but it is not recommended for privacy-sensitive production use.


Docker Compose Environment Example

environment:
AIF_PRIVACY_GUARD_ENABLED: "true"
AIF_PRIVACY_GUARD_URL: "http://vcal-privacy-guard:8090"
AIF_PRIVACY_GUARD_MODE: "anonymize"
AIF_PRIVACY_GUARD_RESTORE_ENABLED: "true"
AIF_PRIVACY_GUARD_API_KEY: "change-me"
AIF_GUARD_FAIL_OPEN: "false"

Both AI Cost Firewall and VCAL Privacy Guard should run on a shared private Docker network.


Non-String Message Content

v0.2.2 preserves non-string message content instead of attempting unsafe transformations.

When Privacy Guard is enabled, AI Cost Firewall collects string message content for scanning and restoration. Non-string content is preserved unchanged.

This behavior is important for compatibility with OpenAI-compatible message payloads that may include structured or multimodal content.

Recommended operational behavior:

  • preserve non-string content unchanged
  • emit a warning or metric when non-string content is skipped by the guard path
  • avoid converting structured content to strings implicitly
  • keep request ordering and roles stable

Streaming Behavior

When Privacy Guard restoration is enabled, streaming responses should be rejected.

Recommended behavior:

stream: true -> HTTP 422

Reason:

  • placeholder restoration requires the complete assistant message
  • partial streaming chunks may contain incomplete placeholders
  • restoring placeholders safely requires full-message context

Use non-streaming chat completions for Privacy Guard evaluation.


Contract Validation

v0.2.2 introduces stricter expectations for the AI Firewall ↔ Privacy Guard contract.

Recommended validation checks:

  • scanned message count matches collected string message count
  • restored message count matches collected assistant message count
  • message roles remain stable
  • message order remains unchanged
  • mapping IDs are preserved across scan and restore
  • failed guard responses respect guard_fail_open

These checks help avoid accidental data leakage or incorrect placeholder restoration.


Observability

v0.2.2 deployments should monitor both AI Cost Firewall and VCAL Privacy Guard metrics.

Recommended AI Cost Firewall checks:

curl http://localhost:8080/healthz
curl http://localhost:8080/readyz
curl http://localhost:8080/version
curl http://localhost:8080/metrics

Recommended Privacy Guard checks:

curl http://localhost:8090/healthz
curl http://localhost:8090/readyz
curl http://localhost:8090/version
curl http://localhost:8090/metrics

Recommended Privacy Guard metrics include:

  • scan request rate
  • restore request rate
  • findings by kind
  • actions by mode
  • active mappings
  • mapping creation and expiration
  • scan latency
  • restore latency
  • authentication failures

Deployment Checklist

Use this checklist for an integrated AI Cost Firewall + VCAL Privacy Guard deployment:

  1. Start the AI Cost Firewall stack and confirm the default services are healthy.
  2. Start VCAL Privacy Guard on the shared AI Firewall Docker network.
  3. Configure VCAL Privacy Guard with an API key.
  4. Configure AI Cost Firewall with matching Privacy Guard settings.
  5. Recreate containers after configuration or environment changes.
  6. Send a non-streaming request containing an email and IP address.
  7. Confirm the upstream receives placeholders rather than original sensitive values.
  8. Confirm the final client response is restored.
  9. Send a stream: true request and confirm it is rejected when restoration is enabled.
  10. Check AI Cost Firewall and Privacy Guard metrics.
  11. Review Grafana dashboards for scan, restore, finding, mapping, and auth-failure visibility.

Upgrade Notes

When upgrading from v0.2.1 to v0.2.2:

  1. Keep existing cache and gateway settings.
  2. Add Privacy Guard directives only if you are evaluating the privacy orchestration path.
  3. Ensure AI Cost Firewall and VCAL Privacy Guard share a private network.
  4. Configure matching API keys on both services.
  5. Prefer privacy_guard_mode anonymize for placeholder-based restoration.
  6. Prefer guard_fail_open false for privacy-sensitive tests.
  7. Recreate containers after changing config files or environment variables.
  8. Use non-streaming requests when response restoration is enabled.
  9. Validate metrics from both services after sending test traffic.

Known Non-Goals for v0.2.2

The following items remain outside the v0.2.2 scope:

  • full enterprise policy orchestration
  • full VCAL Security Guard integration
  • compliance-report generation
  • audit-export pipeline
  • administrative UI
  • multi-tenant management UI
  • native provider-specific configuration blocks
  • safe placeholder restoration for streaming chunks

Summary

AI Cost Firewall v0.2.2 adds the first practical orchestration path for VCAL Privacy Guard.

It keeps the OpenAI-compatible gateway model from v0.2.0 and the operational controls from v0.2.1, while adding privacy-aware request and response handling.

This release is recommended for controlled evaluations where teams want to test:

  • anonymization before upstream LLM calls
  • placeholder-based response restoration
  • fail-closed guard behavior
  • Privacy Guard metrics and dashboards
  • the foundation for a broader VCAL guard ecosystem