mTLS (Design & Guide)
Quick links
- How‑to: Enable mTLS → Enable mTLS for CRUD Service APIs
- Jump to: Flow: mode and PoP enforcement
Mode selection (quick visual)
At a glance
mTLS Implementation Guide
This guide explains what we built, how it works end-to-end, how to configure it (Docker/Kubernetes), and how to use and verify it.
What we built
- Inbound mutual TLS (mTLS) at the edge (Traefik/Nginx) with the verified client certificate forwarded to the app via a trusted header.
- FastAPI middleware parses the forwarded certificate, computes x5t#S256, and attaches identity details to
request.state. - Authentication service supports modes:
bearer,mtls,bearer_plus_mtls_optional,bearer_plus_mtls_requiredwith PoP (sender-binding) enforcement usingcnf.x5t#S256. - Identity mapping produces canonical ARNs like
auth:account:x509:sha256:{thumb}or email SAN-based ARNs under policy. - Identity mapping produces canonical ARNs like
auth:account:x509:sha256:{thumb}or email SAN-based ARNs under policy. For JWT-derived identities, the provider segment in ARNs prefers the IdPprovideralias (falls back to entryname) so identities remain stable across audiences of the same issuer. - Vault secrets path enforces sender-binding and drift detection.
How it works (at a glance)
- Edge terminator verifies client cert (CA/OCSP/CRL) and forwards the PEM to the app.
- Middleware trusts only configured proxy CIDRs, parses PEM, computes thumbprint, and sets
request.state.mtls_thumbprintand other metadata. - AuthN enforces mode and PoP: compares
JWT.cnf.x5t#S256tomtls_thumbprintwhen required. - VaultService binds secrets to the thumbprint (or
cnf_jkt) and detects drift across requests.
Key diagrams are further below in this document (sequence and flow).
Here’s a production-grade design to add inbound mTLS and certificate-based authentication, integrated cleanly with your existing auth flow and policies.
Current state (deep assessment)
authenticate_userinsrc/services/authn_service.pyaccepts only Bearer tokens; no client cert parsing or sender-binding checks.workflow_routes.pyandexecute_routes.pyrely onDepends(authorize_user)→ still Bearer-only.main.pycan run HTTPS but does not support client auth (no CA, no verify_mode).- No middleware parses
X-Forwarded-Client-Cert/XFCC or attaches cert-derived identity. OnlyPrincipalHeadersMiddlewareexists. VaultServicealready anticipatesrequest.state.mtls_thumbprint/cnf_jktfor secret policy binding, but nothing sets these today.
Goals
- Robust inbound mTLS for service-to-service/API-clients.
- Cert-based identity and optional PoP binding to JWT (
cnf.x5t#S256). - Zero-trust header handling (only trust proxy-provided cert headers).
- FIPS-appropriate TLS policy and revocation handling.
- Clear modes: bearer-only, mtls-only, or strict “bearer + mtls binding required”.
Trust boundaries and deployment options
- Edge-terminated mTLS at Traefik/Nginx and forward a trusted client-cert header to the app.
- Optional direct mTLS at the app (Uvicorn with client cert verification), but prefer proxy termination for standardization and revocation checks.
Target architecture
-
Reverse proxy (Traefik) enforces mTLS:
- Require client certificate (RequireAndVerify).
- Validate against trusted CA bundle (config-managed).
- Enable
passTLSClientCertto forward the peer cert viaX-Forwarded-Tls-Client-Cert(PEM) and optionally a SHA-256 thumbprint header. - Enforce TLS1.2+ (prefer TLS1.3 when available) and FIPS-approved ciphers.
-
App (CRUDService) adds:
ClientCertExtractionMiddleware:- Trusts cert headers only from known internal proxies.
- Parses forwarded PEM cert, computes SHA-256 thumbprint (DER), extracts subject DN, SANs, issuer, validity, policy OIDs.
- Stores on
request.state:mtls_thumbprint(base64url x5t#S256),client_cert_subject_dn,client_cert_sans,client_cert_issuer_dn,client_cert_not_before,client_cert_not_after,client_cert_policy_oids.
authn_service.authenticate_request:- New
INBOUND_AUTH_MODEoptions:bearer,mtls,bearer_plus_mtls_optional,bearer_plus_mtls_required. - If
mtlsorbearer_plus_mtls_*, build a canonical identity from cert (see “Identity mapping”). - If
bearer_plus_mtls_required, enforcetoken.cnf.x5t#S256 == mtls_thumbprint. - Set
request.state.mtls_thumbprint,request.state.client_id, andrequest.state.subjectsoVaultServiceand downstream components can apply sender-binding.
- New
authorize_usercontinues to work; it will see aunique_idderived from cert when JWT is absent or both identities are present (config determines which wins).
Identity mapping (canonical and policy-driven)
- Canonical ARNs: use standard
auth:account:{provider}:{subject}. - Certificate identities:
- Default provider:
x509. - Subject resolution strategies (configurable):
- Thumbprint-based:
auth:account:x509:sha256:{x5t#S256}(strongest/unique). - SAN e-mail:
auth:account:x509:email:{addr}when email SAN present and allowed. - SAN URI / DNS / UPN mapping rules.
- Thumbprint-based:
- Default provider:
- Mapping policy file (config-managed, mounted at
/app/config/cert_identity_mappings.yaml):- Static allowlist: specific thumbprints → principal ARNs.
- CA constraints: which CAs are allowed per endpoint group.
- Policy OIDs required (e.g., for device or hardware-token client certs).
- SAN/domain constraints (e.g.,
email SAN must end with @corp.tld). - Endpoint-level mode: which endpoints accept
mtls, requirebearer+mtls, or arebeareronly.
Sender-binding and PoP enforcement
- If mTLS and JWT present:
- Verify
JWT.cnf.x5t#S256 == request.state.mtls_thumbprint; otherwise 401 with explicit error.
- Verify
- If DPoP is used instead of mTLS in some flows, keep your existing DPoP path. Prefer one PoP mechanism per client profile to reduce complexity.
Revocation checks and cert validation
- Primary: enforce mTLS at Traefik with CRL/OCSP.
- Configure OCSP stapling and revocation at the proxy (preferred for performance and consistency).
- Secondary (defense-in-depth): app validates certificate fields from header against mappings (expiry, policy OIDs, SAN restrictions).
- Optionally implement light OCSP/CRL verification in app for high-risk endpoints (cache responses, short TTLs) if the proxy cannot enforce revocation.
Configuration (centralized, docker-compose-driven)
- New security section (single source of truth in mounted config):
- Trusted proxies/CIDRs.
- Expected forwarded cert header names.
- Mode per endpoint group:
bearer,mtls,bearer_plus_mtls_optional,bearer_plus_mtls_required. - Allowed CAs (hashes or filenames), policy OIDs required, SAN constraints.
- Enable/disable PoP binding enforcement.
- Traefik dynamic config:
- TLS client auth (CA bundle path).
passTLSClientCertwithpem: trueandinfo.notAfter/notBefore/subject/issuer/sansenabled as needed.- Strict TLS versions and FIPS-compliant ciphers.
App changes (concrete)
- Add
src/middleware/client_cert.py:- Verify caller is a trusted proxy (
remote_addrin CIDRs or a shared internal mTLS), otherwise ignore/strip client-cert headers. - Read
X-Forwarded-Tls-Client-Cert(PEM), parse viacryptography, compute x5t#S256 (base64url of SHA-256 DER), extract SANs/DNs, setrequest.state.*.
- Verify caller is a trusted proxy (
- Wire it in
main.pybeforePrincipalHeadersMiddleware. - Extend
src/services/authn_service.py:- Load auth mode and mapping policies from config loader.
- Build identity from cert when configured; verify PoP binding when required.
- If cert-only mode: bypass JWT introspection; set
unique_identityfrom cert mapping. - If bearer+mtls: keep JWT introspection, then enforce binding.
- Extend error responses with clear troubleshooting (e.g., missing cert, untrusted CA, expired cert, PoP mismatch).
- Kafka audit:
- On cert-auth success/failure publish security event with minimal non-sensitive details:
cert_thumbprint,issuer_hash,policy_oids, endpoint, correlation_id.
- On cert-auth success/failure publish security event with minimal non-sensitive details:
- Metrics:
- Counters for
mtls_auth_success,mtls_auth_failure_{reason},mtls_pop_mismatch.
- Counters for
Security hardening
- Strip any client-cert header from untrusted sources (defense-in-depth).
- Enforce strict header names; reject multi-value tricks.
- Rate-limit cert-auth failures.
- Rotate CA bundles via mounted config; reload on SIGHUP or periodic watch.
- TLS policy: TLSv1.2+ (prefer TLSv1.3); FIPS-approved ciphers only.
- No silent fallback between modes. If
bearer_plus_mtls_required, return 401 when cert missing or binding fails.
Rollout plan
- Phase 1 (observe): Deploy proxy mTLS termination and pass cert headers; add middleware in passive mode to parse/attach, but keep
bearermode. Monitor. - Phase 2 (optional binding): Switch to
bearer_plus_mtls_optionaland log binding success/failure; alert on mismatches. - Phase 3 (enforce): Flip endpoints to
bearer_plus_mtls_requiredormtlsas needed. - Add runbooks for CA rotation and incident response (revocation, blocklists).
Optional: direct app mTLS (only if needed)
- Build
ssl.SSLContextwithssl.Purpose.CLIENT_AUTH,verify_mode=CERT_REQUIRED,load_verify_locations(ca),minimum_version=TLSv1_2(or TLSv1_3), FIPS ciphers. - Pass
ssl=ctxto uvicorn programmatically (instead of env-only run). - Still keep proxy as preferred termination for simplicity and revocation.
If you want, I can implement:
-
ClientCertExtractionMiddleware. -
Config schema changes (security section, mapping file).
-
authenticate_requestupdates with modes and PoP binding checks. -
Traefik dynamic config snippet to enable mTLS and pass client cert safely.
-
Implementing this will also populate
request.state.mtls_thumbprint, unlocking the sender-binding already recognized byVaultService. -
This design aligns with your canonical ARN format, avoids unsafe fallbacks, and keeps configuration centralized via the mounted config directory.
-
Summary
-
Inbound mTLS terminated at Traefik with strict TLS and revocation; cert forwarded via trusted header.
-
New FastAPI middleware parses cert, computes x5t#S256, and attaches identity details.
-
authenticate_requestgains configurable modes (bearer, mtls, bearer+mtls with PoP binding). -
Identity mapping from cert to canonical ARNs via policy file; strict enforcement, no silent fallback.
-
Full auditing and metrics; FIPS-appropriate TLS posture.
Yes. The design fits your current split: Docker + Traefik in dev, Kubernetes + ingress (Traefik or Nginx) in prod. Here’s exactly how it maps to your docker-compose-authzen4.yml and what to do in K8s.
Dev (Docker + Traefik) – concrete changes
- Traefik already fronts all services in
docker-compose-authzen4.yml. Add mTLS and forward the client cert to the app. - Use Traefik dynamic config (
CRUDService/traefik/dynamic.yml) and the existing certs mount (../IdP/certs → /certs).
Minimal dynamic config additions:
# traefik/dynamic.yml
tls:
options:
mtls:
minVersion: VersionTLS12
cipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
clientAuth:
clientAuthType: RequireAndVerifyClientCert
caFiles:
- /certs/ca.crt
http:
middlewares:
mtls-passcert:
passTLSClientCert:
pem: true
info:
subject: true
issuer: true
notBefore: true
notAfter: true
sans: true
Attach to your CRUD router (labels) so the cert is required and forwarded:
# in crud-service.labels (add/keep existing)
- "traefik.http.routers.crud.tls=true"
- "traefik.http.routers.crud.tls.options=mtls@file"
- "traefik.http.routers.crud.middlewares=mtls-passcert@file,security-headers@file,rate-limit@file"
App-side settings (env or config file under /app/config):
INBOUND_AUTH_MODE=bearer_plus_mtls_required(ormtls/bearer_plus_mtls_optionalper endpoint policy)FORWARDED_CLIENT_CERT_HEADER=X-Forwarded-Tls-Client-CertTRUSTED_PROXY_CIDRS=172.16.0.0/12,10.0.0.0/8(Docker bridge)CERT_IDENTITY_MAPPINGS=/app/config/cert_identity_mappings.yaml
The new ClientCertExtractionMiddleware will:
- Trust only Traefik (by
TRUSTED_PROXY_CIDRS). - Parse
X-Forwarded-Tls-Client-Cert(PEM), compute x5t#S256, setrequest.state.mtls_thumbprint, SANs, DNs, validity. authenticate_requestwill enforce PoP binding if configured and populate canonical identity. This automatically unlocks sender-binding inVaultService(it already readsmtls_thumbprint).
Prod (Kubernetes)
You can run either Traefik or Nginx. The app’s middleware stays identical; only ingress config differs.
Option A: Traefik (CRDs)
- In your
IngressRoute:- reference a
tls.options: mtls@filewithclientAuth(mounted CA secret) - apply the same
passTLSClientCertmiddleware
- reference a
Example fragment:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
spec:
entryPoints: [ websecure ]
routes:
- match: Host(`crud.prod.example.com`)
kind: Rule
services:
- name: crud-service
port: 8000
middlewares:
- name: mtls-passcert
tls:
options:
name: mtls
namespace: traefik
Option B: Nginx Ingress
- Enable mTLS and forward the cert to upstream:
metadata:
annotations:
nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
nginx.ingress.kubernetes.io/auth-tls-secret: "default/mtls-ca" # CA bundle
nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
nginx.ingress.kubernetes.io/proxy-ssl-name: "crud.prod.example.com"
- Configure the app to accept the Nginx header (e.g.,
ssl-client-cert); setFORWARDED_CLIENT_CERT_HEADER=ssl-client-cert.
Config, policy, and CA management
- Keep all app settings in
ServiceConfigs/CRUDService/config(mounted at/app/config) per your standard. - Store CA bundles/secrets in:
- dev:
../IdP/certs(already mounted to Traefik as/certs) - prod: K8s
Secretreferenced by Ingress (Traefik/Nginx)
- dev:
- Identity mapping file:
ServiceConfigs/CRUDService/config/cert_identity_mappings.yamlto constrain SAN/issuers/policy OIDs and map certs to canonical ARNs.
What this gives you
-
Dev and prod use the same application behavior; only the edge differs.
-
Traefik/Nginx terminate mTLS and forward the client cert. The app enforces PoP binding and builds cert-based identity.
VaultServicesender-binding starts working immediately viarequest.state.mtls_thumbprint. -
I can proceed to:
- add
ClientCertExtractionMiddleware, - extend
authenticate_requestwith the modes and PoP-binding, - introduce new config keys and a sample
cert_identity_mappings.yaml, - and leave Traefik dynamic config snippets ready in
CRUDService/traefik/dynamic.yml.
- add
-
The docker-compose and K8s specifics above ensure the design is fully aligned with your current architecture.
Implementation To-Do (detailed)
-
Prerequisites and decisions
- Define trusted proxy CIDRs for dev/prod (Docker bridge, cluster CIDRs) and the exact forwarded cert header name per ingress:
- Dev (Traefik):
X-Forwarded-Tls-Client-Cert(PEM) - Nginx Ingress:
ssl-client-cert(PEM)
- Dev (Traefik):
- Prepare CA bundle(s) for client auth:
- Dev: place CA at
../IdP/certs/ca.crt(already mounted to Traefik as/certs) - Prod: create K8s Secret (e.g.,
default/mtls-ca) with CA chain
- Dev: place CA at
- Decide inbound auth mode per endpoint group:
bearer,mtls,bearer_plus_mtls_optional,bearer_plus_mtls_required - Author identity mapping rules and constraints (SAN, issuer, policy OIDs) in
ServiceConfigs/CRUDService/config/cert_identity_mappings.yaml
- Define trusted proxy CIDRs for dev/prod (Docker bridge, cluster CIDRs) and the exact forwarded cert header name per ingress:
-
Reverse proxy (Dev – Traefik)
- Update
CRUDService/traefik/dynamic.yml:- Add
tls.options.mtlswithclientAuth.RequireAndVerifyClientCertandcaFiles: [/certs/ca.crt] - Add middleware
mtls-passcertwithpassTLSClientCert(pem: true, include subject/issuer/notBefore/notAfter/sans)
- Add
- Update
crud-servicelabels indocker-compose-authzen4.yml:- Attach
traefik.http.routers.crud.tls.options=mtls@file - Append
mtls-passcert@fileto the routermiddlewares
- Attach
- Reload Traefik (compose restart) and verify dashboard shows TLS client auth enabled for
crudrouter
- Update
-
Reverse proxy (Prod – Ingress)
- If Traefik: create
IngressRoutereferencingtls.options mtlsand applypassTLSClientCertmiddleware; mount CA secret in Traefik - If Nginx Ingress: set annotations
nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"nginx.ingress.kubernetes.io/auth-tls-secret: "<ns>/mtls-ca"nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
- Document header name for app (
FORWARDED_CLIENT_CERT_HEADER) and ensure it matches ingress behavior
- If Traefik: create
Per-path binding enforcement
- Set global mode to
bearer_plus_mtls_optionaland list paths that require PoP binding viaBINDING_REQUIRED_PATHS:
INBOUND_AUTH_MODE=bearer_plus_mtls_optional
BINDING_REQUIRED_PATHS=/workflow/start,/workflow/resume,/execute
- Behavior:
- Requests to listed paths must include a client cert (forwarded) and the JWT must contain
cnf.x5t#S256matching the cert thumbprint. Otherwise 401. - Non-listed paths in optional mode accept bearer-only and will not enforce PoP.
- Requests to listed paths must include a client cert (forwarded) and the JWT must contain
Kubernetes ingress snippets and CA rotation
- Traefik (CRDs) – mTLS options and passTLSClientCert middleware. Rotate CA by updating the Secret referenced in Traefik and reloading.
- Nginx Ingress –
auth-tls-verify-client: onandauth-tls-secret: <ns>/mtls-ca. Rotate CA by updating the Secret; upstream header isssl-client-cert.
Runbook (rotation):
- Prepare new CA bundle Secret (ensure old + new overlap if needed).
- Update Traefik/Nginx to reference new Secret; apply rollout.
- Verify with a canary client cert; then remove old CA from bundle.
Curl examples
- Dev (Traefik mTLS):
curl -sS https://crud.local/workflow/start \
--cert client.pem --key client.key \
-H "Authorization: Bearer $JWT" -d '{"workflow_name":"demo","data":{}}'
- PoP mismatch troubleshooting: check that
cnf.x5t#S256in the JWT equals the SHA-256 DER thumbprint of the client cert; otherwise server returns 401.
Config validation
-
On startup we warn (non-blocking) when:
INBOUND_AUTH_MODEis unknownBINDING_REQUIRED_PATHScontains malformed entriesCERT_IDENTITY_MAPPINGSpoints to a non-existent file
-
App config (centralized under
/app/configwith env overrides)- Extend
pdp.yamlsecuritysection or addsecurity.mtlswith:inbound_auth_mode(default for endpoints),binding_required_paths(optional per-path overrides)forwarded_client_cert_header,trusted_proxy_cidrs,identity_mappings_path
- Provide environment overrides in
docker-compose-authzen4.ymlfor dev:INBOUND_AUTH_MODE=bearer_plus_mtls_optional(Phase 2), laterbearer_plus_mtls_requiredFORWARDED_CLIENT_CERT_HEADER=X-Forwarded-Tls-Client-CertTRUSTED_PROXY_CIDRS=172.16.0.0/12,10.0.0.0/8CERT_IDENTITY_MAPPINGS=/app/config/cert_identity_mappings.yaml
- Extend
-
Middleware:
src/middleware/client_cert.py- Implement
ClientCertExtractionMiddleware:- If
remote_addrnot inTRUSTED_PROXY_CIDRS, ignore/strip any client-cert headers - Read PEM from
FORWARDED_CLIENT_CERT_HEADER, parse viacryptography - Compute x5t#S256 (SHA-256 of DER, base64url, no padding)
- Extract: subject DN, issuer DN, SANs (DNS, URI, RFC822), notBefore/notAfter, policy OIDs
- Set on
scope['state']andrequest.state:mtls_thumbprint,client_cert_*
- If
- Defensive parsing: handle multi-line headers, large PEMs, malformed input; cap size
- Implement
-
Wire middleware in
src/main.py- Add before
PrincipalHeadersMiddlewareto ensure state is available to authn/authz:app.add_middleware(ClientCertExtractionMiddleware, ...)
- Ensure order: iterator adapter outermost, then security headers/origin/cert extraction, then principal headers
- Add before
-
Identity mapping module:
src/utils/cert_identity_mapper.py- Load
cert_identity_mappings.yaml(fail-closed ifmtls/binding_requiredmode and no mapping allows) - Produce canonical ARN: default
auth:account:x509:sha256:{x5t}; allow SAN/issuer policy-based mappings - Validate constraints: allowed issuers/roots, SAN domain allowlist, policy OIDs, validity window
- Load
-
AuthN changes:
src/services/authn_service.py- Read
inbound_auth_modeand per-path overrides from config - If mode is
mtlsand cert present:- Build
UniqueIdentityfrom cert mapping; skip JWT introspection
- Build
- If mode is
bearer_plus_mtls_optionaland both present:- Prefer JWT identity but set
request.state.mtls_thumbprintfor sender-binding - If
binding_required_pathsmatch, enforce PoP (token.cnf.x5t#S256 == mtls_thumbprint)
- Prefer JWT identity but set
- If mode is
bearer_plus_mtls_required:- Require both JWT and cert; enforce PoP, else
401 invalid_tokenwith troubleshooting
- Require both JWT and cert; enforce PoP, else
- Populate:
request.state.subject,request.state.client_id,request.state.unique_identityconsistently for downstream
- Read
-
AuthZ:
src/services/authz_service.py(validate context only)- No core changes; ensure normalized identity is used (cert ARN or JWT unique_id)
- Log principal ARN, mtls thumbprint presence, and enforcement outcome for audit
-
Secrets sender-binding:
src/services/vault_service.py- Confirm
cnf_binding = mtls_thumbprint or cnf_jktpath works - Add tests to ensure
grant.cnf_bindingdrift is detected whenmtls_thumbprintchanges within TTL
- Confirm
-
Observability and audit
- Metrics: add counters/histograms
mtls_auth_success_total,mtls_auth_failure_total{reason},mtls_pop_mismatch_total- Latency histograms for middleware parse and auth pipeline
- Kafka: publish security events on success/failure (mask sensitive fields, include thumbprint only)
- Structured logs: include
correlation_id,mtls_thumbprint_present,binding_enforced
- Metrics: add counters/histograms
-
Tests
- Unit
- Middleware: parse valid PEM, invalid PEM, no header, untrusted proxy, large header
- Identity mapper: mapping precedence, constraint failures (issuer/SAN/policy/expiry)
- AuthN modes: each mode with and without cert/JWT; PoP pass/fail
- Integration (dev via Traefik)
- Compose with mtls enabled; hit CRUD endpoints using client cert; verify 200 vs 401 on PoP mismatch
- E2E
- Critical flows on
/workflow/*and/executewith mode permutations
- Critical flows on
- Unit
-
Runbook and rollout
- Phase toggles via env/config; start with
bearer→bearer_plus_mtls_optional(observe) →bearer_plus_mtls_required - CA rotation procedure (Traefik/Nginx + app policy reload)
- Incident steps for revocation, thumbprint blocklisting (temporary deny-list in mappings)
- Phase toggles via env/config; start with
-
Security hardening
- Strip client-cert headers from untrusted sources (always)
- Enforce TLSv1.2+ (prefer TLSv1.3) and FIPS ciphers at proxy
- Rate-limit cert-auth failures; alert on spikes
- Validate that cert clock-skew handling is sane; apply small leeway only if required
-
Documentation updates
- Add
ServiceConfigs/CRUDService/config/cert_identity_mappings.yamlexample with comments - Update
traefik/dynamic.ymlexample and K8s ingress examples - Provide
curlexamples for mTLS (dev) and PoP-mismatch troubleshooting
- Add
Role-based quick guides and FAQs
For Product Managers (non-technical)
- What is being added? Inbound mutual TLS (mTLS) so API clients present a client certificate. We can optionally require their JWT to be bound to that certificate for stronger proof-of-possession (PoP).
- Why it matters: Prevents token replay and enforces that only devices with approved certificates can access sensitive endpoints.
- Modes and impact:
- bearer: current behavior, no change to clients.
- mtls: cert-only clients; no JWT needed. Good for service accounts.
- bearer_plus_mtls_optional: JWT still works; certain paths can require PoP.
- bearer_plus_mtls_required: strongest; clients must send both JWT and cert and they must match.
- Rollout safety: adopt optional mode first, monitor, then enforce on high-risk endpoints only.
- SLA impact: negligible under normal load; cert parsing and PoP check are microseconds compared to network I/O.
Common PM questions
- How do we identify clients? Certificate → canonical ARN (e.g.,
auth:account:x509:sha256:{thumbprint}or email SAN when allowed). - Can we block a compromised cert quickly? Yes: remove it from mappings or rotate CA at the proxy; app also enforces SAN/issuer/policy constraints.
- What changes for UIs? None directly. This feature targets API/service clients.
For DevOps (enable/operate)
- Dev (Docker + Traefik):
-
Enable mTLS in
traefik/dynamic.ymlwithtls.options.mtlsandpassTLSClientCertmiddleware. -
Ensure CRUD router labels include
...tls.options=mtls@fileand...middlewares=mtls-passcert@file,.... -
App env: set
FORWARDED_CLIENT_CERT_HEADER=X-Forwarded-Tls-Client-Cert, trusted CIDRs for the bridge, andINBOUND_AUTH_MODE. -
Optional: set
FORWARDED_CLIENT_CERT_HMAC_SECRETandFORWARDED_CLIENT_CERT_SIGNATURE_HEADERto require a signed forwarded cert header. -
Safer default for teams: mTLS is disabled by default in dev; opt in when ready
- Default compose now routes without mTLS to avoid blocking teammates without a CA.
- To enable mTLS locally, run with the override:
docker compose -f docker-compose-authzen4.yml -f docker-compose.dev-mtls.yml up -d --build - To revert to non-mTLS quickly, stop and start without the override file.
-
Traefik: strip and re-add client-cert headers (order matters)
In
traefik/dynamic.ymladd a headers middleware to remove any client-cert-like headers from inbound requests, then applypassTLSClientCertto attach the verified cert after TLS client auth:http:
middlewares:
strip-external-client-cert:
headers:
removeRequestHeaders:
- X-Forwarded-Tls-Client-Cert
- X-Forwarded-Client-Cert
- X-Forwarded-Client-Cert-Chain
- X-Forwarded-Tls-Client-Cert-Signature
- XFCC
mtls-passcert:
passTLSClientCert:
pem: true
info:
subject: true
issuer: true
notBefore: true
notAfter: true
sans: true
tls:
options:
mtls:
minVersion: VersionTLS12
clientAuth:
clientAuthType: RequireAndVerifyClientCert
caFiles:
- /certs/ca.crtUpdate the CRUD router to ensure middleware order is: strip → passcert → security/rate-limit:
- "traefik.http.routers.crud.tls.options=mtls@file"
- "traefik.http.routers.crud.middlewares=strip-external-client-cert@file,mtls-passcert@file,security-headers@file,rate-limit@file" -
CA setup (dev): ensure
/certs/ca.crtexists in TraefikTraefik mounts
../IdP/certsas/certs. Create a dev CA and client cert, and place the CA at../IdP/certs/ca.crt:# 1) Dev CA
openssl req -x509 -newkey rsa:2048 -keyout ca.key -out ca.crt -days 3650 -nodes -subj "/CN=EmpowerNow Dev CA"
# 2) Client key & CSR
openssl req -newkey rsa:2048 -keyout client.key -out client.csr -nodes -subj "/CN=dev-client"
# 3) Sign client cert with CA
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.pem -days 365 -sha256
# 4) Place CA where Traefik reads it
mkdir -p ../IdP/certs
cp ca.crt ../IdP/certs/ca.crt
# 5) Test a request (requires client cert)
curl -k --resolve crud.ocg.labs.empowernow.ai:443:127.0.0.1 \
--cert client.pem --key client.key \
https://crud.ocg.labs.empowernow.ai/healthTeam-friendly option: commit a public dev CA certificate (no private key) at
IdP/certs/ca.crt, and provide a small script to generate per‑developer client certs.
-
- Prod (K8s):
-
Traefik: reference
tls.options mtlsand apply passTLSClientCert middleware; CA via Secret. -
Nginx: use
auth-tls-verify-client: "on"andauth-tls-pass-certificate-to-upstream: "true"; set app header tossl-client-cert. -
CA setup (prod): create a Secret with your CA bundle and reference it from the ingress controller
# Create a secret with your CA chain (no private keys)
kubectl create secret generic mtls-ca \
--from-file=ca.crt=./prod-ca.crt \
-n traefik- Traefik (CRD) snippet:
apiVersion: traefik.containo.us/v1alpha1
kind: TLSOption
metadata:
name: mtls
namespace: traefik
spec:
minVersion: VersionTLS12
clientAuth:
secretNames:
- mtls-ca
clientAuthType: RequireAndVerifyClientCert- Nginx Ingress annotations:
metadata:
annotations:
nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
nginx.ingress.kubernetes.io/auth-tls-secret: "<ns>/mtls-ca"
nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"CA rotation: replace the Secret, roll the ingress controller, canary test, then remove old CA from the bundle.
-
- CA rotation runbook: update Secret, reload ingress, canary test, then remove old CA.
- Observability:
- Metrics counters: mTLS auth successes/failures, PoP mismatches (see MetricsCollector usage).
- Kafka audit (best-effort): success/failure events with minimal data (thumbprint, reason).
- FIPS posture: enforce TLS1.2+/1.3 and AES-GCM at the edge; ensure OpenSSL FIPS mode on ingress layer as required.
DevOps FAQ
- How to verify header forwarding? Inspect Traefik dashboard router/middleware, and in app logs look for
mtls_thumbprint_present=trueon requests. 401 withsender_binding_mismatchindicates PoP mismatch is working. - How to simulate client cert in dev? Use curl with
--cert/--keyor a test client behind the proxy. - What breaks clients? Only when switching to
bearer_plus_mtls_requiredormtlsfor their endpoints without provisioning certs and JWT PoP.
For QA (verify/accept)
- Acceptance criteria
- Optional mode: listed paths in
BINDING_REQUIRED_PATHSenforce PoP; non-listed paths accept bearer-only. - Required mode: missing cert → 401; PoP mismatch → 401; matching PoP → 200.
- mtls mode: no JWT required; identity derived from cert mapping.
- Sender-binding drift (secrets): second request with different thumbprint for same subject/resource is rejected.
- Optional mode: listed paths in
- Test matrix (examples)
| Mode | Path listed | JWT | Cert | PoP match | Expected |
|---|---|---|---|---|---|
| optional | yes | yes | yes | yes | 200 |
| optional | yes | yes | yes | no | 401 |
| optional | no | yes | no | n/a | 200 |
| required | n/a | yes | no | n/a | 401 |
| required | n/a | yes | yes | no | 401 |
| required | n/a | yes | yes | yes | 200 |
| mtls | n/a | no | yes | n/a | 200 |
QA tips
- Use curl examples below to simulate; validate
WWW-Authenticateerror details on failures. - Confirm sender-binding drift by requesting a secret twice with two different certs for the same subject.
Sequence: request flow
Flow: mode and PoP enforcement
FAQs and troubleshooting
- How do I generate a test client certificate?
openssl req -x509 -newkey rsa:2048 -keyout client.key -out client.pem -days 365 -nodes -subj "/CN=test-client"
- Curl examples
- mTLS + bearer:
curl -sS https://crud.local/workflow/start \
--cert client.pem --key client.key \
-H "Authorization: Bearer $JWT" -d '{"workflow_name":"demo","data":{}}'
- PoP mismatch (expect 401): issue a JWT with different
cnf.x5t#S256than the client cert x5t. - Common errors
- 401 invalid_token, reason=certificate_missing: proxy not forwarding cert or untrusted proxy.
- 401 sender_binding_mismatch: JWT
cnf.x5t#S256doesn’t match cert thumbprint. - 403 binding_drift (secrets): thumbprint changed for same subject/resource.
- 400 on PEM parse: header malformed or exceeds
MAX_CLIENT_CERT_SIZE(default 32KB). Duplicate header → 400. Missing/bad signature → 400 when HMAC is enabled.
- Verifying the forwarded header exists
- Check proxy config and app logs; ensure
FORWARDED_CLIENT_CERT_HEADERmatches ingress header name.
- Check proxy config and app logs; ensure
- Kafka audit not visible?
- Ensure broker reachable (inside Docker use
kafka:9092, from host uselocalhost:9093). Audits are best-effort and won’t block requests.
- Ensure broker reachable (inside Docker use
Trust boundary hardening (recommended)
- At the proxy:
- Strip any inbound client-cert/XFCC headers from external requests before mTLS verification.
- After successful client cert verification, re-add only the canonical forwarded header with the verified PEM.
- Optionally sign the forwarded header (HMAC) using a shared secret; validate signature in the app.
- Prefer TLS 1.3; for TLS 1.2 restrict to AES-GCM suites and strong curves (P-256/P-384). Disable renegotiation.
- Between proxy → app:
- Prefer mTLS on the internal hop, or use signed forwarded headers to prevent tampering.
- In the app middleware:
- Enforce single-value header (reject duplicates). Cap size (default 32KB). Normalize percent-encoding and
\n. - Fail closed with clear 4xx when header is duplicated, oversized, or malformed; log reason-coded metrics.
- Enforce single-value header (reject duplicates). Cap size (default 32KB). Normalize percent-encoding and
- Identity mapping:
- Precedence: thumbprint → SAN:email → SAN:URI/DNS/UPN. Log which rule matched; validate chain depth and issuer allow-list, not only root CA.
- Incident response:
- Support a hot-reloadable thumbprint blocklist and short OCSP/CRL cache TTLs for high-risk endpoints.
Certificate identity mapping (cert_identity_mappings.yaml)
-
Location and wiring
- File lives in your mounted config:
ServiceConfigs/CRUDService/config/cert_identity_mappings.yaml→ container path/app/config/cert_identity_mappings.yaml. - Environment variable
CERT_IDENTITY_MAPPINGS=/app/config/cert_identity_mappings.yamlenables it (already set in compose). - Used by the authentication service when building a canonical ARN for mTLS identities.
- File lives in your mounted config:
-
Purpose
- Control how a certificate becomes a principal ARN and constrain which certs are accepted (issuers, SAN domains, policy OIDs).
- Precedence is explicit and deterministic:
thumbprint_mapoverride → SAN email mapping (when allowed) → default thumbprint ARN.
-
Fields
thumbprint_map(map): overrides specific x5t#S256 thumbprints to a concrete ARN.constraints.allowed_issuers(list of substrings): issuer DN must contain at least one.constraints.allowed_email_domains(list): RFC822 SAN must end with one of these domains to allow email-based mapping.constraints.allowed_policy_oids(list, optional): require at least one policy OID (e.g., hardware token profile). If absent, policy OIDs aren’t enforced.
-
ARN formats (canonical)
- Default thumbprint mapping:
auth:account:x509:sha256:{thumbprint} - Email SAN mapping:
auth:account:x509:email:{address} - You can map to any canonical ARN your PDP understands, e.g.,
auth:account:x509:device:build-agentfor a CI runner.
- Default thumbprint mapping:
-
Real-world examples
Minimal (default to thumbprint ARNs):
thumbprint_map: {}
constraints:
allowed_issuers: [] # any issuer
allowed_email_domains: ["corp.tld"]Pin a CI agent and restrict issuers and SAN domain:
thumbprint_map:
mQz0sE5e1E-6XfOUkqH8yYl2o9g0mY0y2X2qKX7fUOQ: auth:account:x509:device:ci-prod-build-agent
constraints:
allowed_issuers:
- "CN=EmpowerNow Dev CA"
- "CN=Corp Root CA"
allowed_email_domains:
- "corp.tld"
allowed_policy_oids:
- "2.23.140.1.3" # example hardware-token policy OIDRoute specific users by email SAN while leaving others as thumbprint ARNs:
thumbprint_map: {}
constraints:
allowed_issuers: ["CN=EmpowerNow Dev CA"]
allowed_email_domains: ["partner.example"] -
Computing x5t#S256 (thumbprint)
# DER → SHA-256 → base64url (no padding)
openssl x509 -in client.pem -outform der \
| openssl dgst -sha256 -binary \
| openssl base64 -A \
| tr '+/' '-_' | tr -d '=' -
Testing changes in dev
- Edit
ServiceConfigs/CRUDService/config/cert_identity_mappings.yaml. - Restart
crud-service(the file is read on startup). - Call an endpoint that triggers identity mapping, e.g.,
/executeinmtlsmode or any endpoint with mTLS present. - Verify logs include the mapped ARN and that constraints are enforced (e.g., issuer mismatch → 401 with reason).
- Edit
-
Blocking a compromised cert quickly
- Prefer the dedicated blocklist: set
CERT_THUMBPRINT_BLOCKLIST=/app/config/blocked_thumbs.txtand list x5t#S256 values (one per line). The middleware returns401 certificate_blocklistedfor matches. - You can also remove entries from
thumbprint_mapor tightenallowed_issuers/allowed_email_domains.
- Prefer the dedicated blocklist: set
Related
- Reference → mTLS (Design & Guide)
- BFF → Upstream TLS/mTLS to CRUD: ../../bff/how-to/enable-upstream-tls-to-crud.md