How It Works: API Gateway & Traffic Control with Envoy


Updated: 2025-08-24

Summary

Envoy is a programmable edge proxy. At the edge it can terminate TLS, route by host and path, enforce authentication through ext_authz, handle WebSocket upgrades, shape traffic with retries and timeouts, and apply rate limits and backpressure.

Edge Responsibilities

- TLS termination (cert-manager issues the certificates; Envoy mounts and serves them)
- Host/path routing and redirects
- Centralized auth (sessions/JWT via ext_authz to an auth service)
- WebSocket upgrades and idle timeouts
- Rate limiting (external service) and connection limits
- Header normalization, gzip/br encoding, HSTS
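
The rate-limiting responsibility above might be wired up roughly as follows. This is a sketch, not a complete setup: the `ratelimit` cluster name and the `edge` domain are placeholders that do not appear elsewhere in this post, and it assumes a gRPC rate limit service is reachable through that cluster over HTTP/2.

```yaml
# HTTP filter, placed before the router filter
http_filters:
- name: envoy.filters.http.ratelimit
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
    domain: edge                   # hypothetical; must match the domain configured in the service
    failure_mode_deny: false       # let traffic through if the rate limit service is unreachable
    timeout: 0.1s
    rate_limit_service:
      transport_api_version: V3
      grpc_service:
        envoy_grpc: { cluster_name: ratelimit }   # hypothetical cluster

# On a route or virtual host: which descriptors to send to the service
rate_limits:
- actions:
  - remote_address: {}             # one bucket per client address
```

Failing open (`failure_mode_deny: false`) trades strict enforcement for availability; flip it if the limit protects something that must never be overrun.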

Minimal Listener + Virtual Hosts

static_resources:
  listeners:
  - name: https
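    address: { socket_address: { address: 0.0.0.0, port_value: 443 } }
    listener_filters:
    # Reads the SNI from the ClientHello so the server_names match below can work
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector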
    filter_chains:
    - filter_chain_match: { server_names: ["api.example.com","auth.example.com"] }
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain: { filename: "/etc/envoy/tls/tls.crt" }
              private_key:       { filename: "/etc/envoy/tls/tls.key" }
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_https   # required field; prefixes this listener's HTTP stats
          upgrade_configs: [{ upgrade_type: websocket }]
          normalize_path: true
          request_timeout: 15s
          route_config:
            virtual_hosts:
            - name: api
              domains: ["api.example.com"]
              routes:
              - match: { prefix: "/v1" }
                route: { cluster: api_v1 }
            - name: auth
              domains: ["auth.example.com"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: auth_svc }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
  - name: api_v1
    connect_timeout: 0.5s
    type: STRICT_DNS
    load_assignment:
      cluster_name: api_v1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address: { socket_address: { address: api.apps.svc.cluster.local, port_value: 8080 } }
  - name: auth_svc
    connect_timeout: 0.5s
    type: STRICT_DNS
    load_assignment:
      cluster_name: auth_svc
      endpoints:
      - lb_endpoints:
        - endpoint:
            address: { socket_address: { address: auth.apps.svc.cluster.local, port_value: 80 } }

Ext AuthZ (sessions/JWT) — allow/deny at the edge

# Add before router filter
http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    http_service:
      server_uri:
        uri: auth.apps.svc.cluster.local
        cluster: auth_svc
        timeout: 1s
      path_prefix: /oauth2/auth          # oauth2-proxy style
      authorization_request:
        allowed_headers:
          patterns: [{exact: cookie}, {exact: authorization}, {exact: x-forwarded-host}]
      authorization_response:
        allowed_upstream_headers:
          patterns: [{exact: x-auth-request-user}, {exact: x-auth-request-email}]
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
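
Because the filter above runs for every request, it also runs for requests to auth.example.com, which is the auth service itself. One way to avoid that loop is a per-route override. A minimal sketch, assuming the `auth` virtual host from the listener config should bypass the check entirely:

```yaml
- name: auth
  domains: ["auth.example.com"]
  typed_per_filter_config:
    envoy.filters.http.ext_authz:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthzPerRoute
      disabled: true               # skip ext_authz for the login and callback endpoints
  routes:
  - match: { prefix: "/" }
    route: { cluster: auth_svc }
```

The filter's `failure_mode_allow` field (false by default) decides what happens when the auth service is unreachable; leaving it false fails closed.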

Retries, Timeouts, and Circuit Breakers

# Per-route overrides
route:
  timeout: 2s
  retry_policy:
    retry_on: 5xx,reset,connect-failure
    num_retries: 2
    per_try_timeout: 500ms

# Connection limits (protect backends)
circuit_breakers:
  thresholds:
  - max_connections: 1024
    max_pending_requests: 512
    max_requests: 2048
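
For orientation, the two fragments above live in different places: the retry policy belongs to a route, the circuit breaker to a cluster. The sketch below shows where they would sit in the earlier `api_v1` config; the `max_retries` value is an illustrative assumption. With these numbers the worst case is 3 attempts x 500 ms = 1.5 s plus a short backoff between attempts, which stays under the 2 s route timeout.

```yaml
# In the virtual host
routes:
- match: { prefix: "/v1" }
  route:
    cluster: api_v1
    timeout: 2s                    # overall budget, including all retries
    retry_policy:
      retry_on: 5xx,reset,connect-failure
      num_retries: 2               # up to 3 attempts in total
      per_try_timeout: 500ms

# In static_resources.clusters
clusters:
- name: api_v1
  # connect_timeout, type, and load_assignment as in the cluster definition earlier
  circuit_breakers:
    thresholds:
    - max_connections: 1024
      max_pending_requests: 512
      max_requests: 2048
      max_retries: 3               # assumed value; caps concurrent retries to this cluster
```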

WebSocket Keep‑alives & Limits

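# On the HTTP connection manager
stream_idle_timeout: 0s        # disable the per-stream idle timeout so quiet WebSockets are not reset
common_http_protocol_options:
  idle_timeout: 300s           # close downstream connections that have had no active streams for 5 minutes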
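
Setting `stream_idle_timeout: 0s` on the connection manager removes the idle guard for every stream, not just WebSockets. An alternative is to keep the global default and relax it only on the WebSocket route. A sketch, assuming WebSocket traffic is served under a hypothetical `/v1/ws` prefix and that one hour of silence is an acceptable cutoff:

```yaml
routes:
- match: { prefix: "/v1/ws" }      # hypothetical path; list it before the broader /v1 route
  route:
    cluster: api_v1
    timeout: 0s                    # disable the route's response timeout for long-lived streams
    idle_timeout: 3600s            # per-route stream idle timeout, overrides the connection manager value
- match: { prefix: "/v1" }
  route: { cluster: api_v1 }
```

Routes are matched in order, so the narrower prefix has to come first.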

Cookie & Security Headers

# HSTS, framing, and MIME-sniffing headers; set on the route_config, a virtual host, or a single route
response_headers_to_add:
- header: { key: "Strict-Transport-Security", value: "max-age=31536000; includeSubDomains" }
- header: { key: "X-Frame-Options", value: "DENY" }
- header: { key: "X-Content-Type-Options", value: "nosniff" }

Kubernetes Mount for TLS secrets (cert-manager)

# Deployment snippet
volumes:
- name: tls
  secret: { secretName: envoy-tls }
containers:
- name: envoy
  image: envoyproxy/envoy:v1.30.2
  volumeMounts: [{ name: tls, mountPath: /etc/envoy/tls, readOnly: true }]

Observability

- Access log JSON with request_id, user, route, duration.
- Prometheus metrics: requests, 4xx/5xx, p95 latency, open connections.
- Tracing: propagate W3C Trace Context headers (traceparent, tracestate); sample a small share of traffic so tracing overhead stays bounded.
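
The JSON access log in the first bullet could look roughly like this on the HTTP connection manager. The field names are illustrative choices. The `user` field assumes the ext_authz filter has copied `x-auth-request-user` onto the request, and `%ROUTE_NAME%` only carries a value if routes are given a `name`.

```yaml
access_log:
- name: envoy.access_loggers.stdout
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
    log_format:
      json_format:
        request_id: "%REQ(X-REQUEST-ID)%"
        user: "%REQ(X-AUTH-REQUEST-USER)%"     # assumes ext_authz injected this header
        route: "%ROUTE_NAME%"                  # "-" unless the route sets name
        status: "%RESPONSE_CODE%"
        duration_ms: "%DURATION%"
```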

Security

- mTLS for upstreams if feasible; strictly validate host headers.
- Lock down admin interface; never expose it publicly.
- Regularly rotate TLS keys/certs; use SDS for zero‑downtime rotation.
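
A sketch of the last two points, under stated assumptions: port 9901 for the admin interface, the secret name `edge_cert`, and the `/etc/envoy/sds/` path are all placeholders. It uses file-based SDS so Envoy reloads the certificate when the mounted Kubernetes Secret changes, without a restart.

```yaml
# Bootstrap: admin interface bound to loopback only
admin:
  address: { socket_address: { address: 127.0.0.1, port_value: 9901 } }

# In DownstreamTlsContext, replacing the static tls_certificates entry
common_tls_context:
  tls_certificate_sds_secret_configs:
  - name: edge_cert                # hypothetical secret name
    sds_config:
      resource_api_version: V3
      path_config_source:
        path: /etc/envoy/sds/edge_cert.yaml     # hypothetical file, contents below

# Contents of /etc/envoy/sds/edge_cert.yaml
resources:
- "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret
  name: edge_cert
  tls_certificate:
    certificate_chain: { filename: /etc/envoy/tls/tls.crt }
    private_key:       { filename: /etc/envoy/tls/tls.key }
    watched_directory: { path: /etc/envoy/tls }  # reload when the mounted Secret is swapped
```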

Pitfalls

- Global timeouts too long → slow failures cascade.
- Missing fall‑through 404 vhost → accidental default routing.
- Over‑broad ext_authz → auth service outage becomes total outage.
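
The fall-through pitfall can be closed with a catch-all virtual host that answers 404 directly instead of forwarding anywhere. A minimal sketch, where the `fallthrough` name and the response body are arbitrary choices:

```yaml
virtual_hosts:
# ... api and auth virtual hosts as before ...
- name: fallthrough
  domains: ["*"]                   # only used when no more specific domain matches
  routes:
  - match: { prefix: "/" }
    direct_response:
      status: 404
      body: { inline_string: "not found\n" }
```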

Taylor Swift

“You need to calm down.”

