Infrastructure Security

HashiCorp Vault for application secrets: getting off environment variables.


Environment variables are how secrets get into applications at most organisations. They're also one of the most consistent findings in security audits: secrets in .env files committed to git, secrets visible in ps aux output, secrets shared between staging and production because nobody updated them, secrets that haven't been rotated in three years because rotating them requires a deployment.

HashiCorp Vault solves a different set of problems than a simple secrets store. The key distinction is dynamic credentials: instead of storing a long-lived database password and handing it to your application, Vault creates a database user with a time-limited password on demand, gives it to your application, and revokes it when the lease expires. Your application never holds a credential that's valid longer than it needs to be.

This post covers how to actually deploy Vault, connect it to your applications, and migrate from env-var secrets, including the operational realities that the documentation glosses over.

The problem with environment variables

Env vars feel safe because they're not in the code. They're not. The failure modes are well-documented and consistently exploited:

  • They leak into logs. Any unhandled exception that prints environment state exposes all secrets. Misconfigured logging middleware does this routinely. printenv from a debug endpoint is a classic mistake.
  • They're visible to other processes on the host. On Linux, /proc/[pid]/environ is readable by the process owner and by root. In a container, where processes typically run as the same user, any process can read the env vars of every other process.
  • They spread silently. .env files get committed to git. They get copied between machines. They get shared in Slack when someone needs to set up their local environment. A secret that's been in a .env file for two years has probably been in five places you don't know about.
  • They don't rotate. Rotating an env-var secret requires a deployment, which requires coordination, which means it rarely happens. Long-lived credentials are the rule, not the exception.
  • There's no audit trail. When a breach happens, you can't answer "which applications had access to this secret, and when did they read it?"
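The leak-into-logs failure mode is easy to reproduce. A minimal Python sketch (the variable name and value are illustrative):

```python
import os

# Simulate a secret injected the usual way.
os.environ["DB_PASSWORD"] = "hunter2"

# Any code path that dumps environment state leaks it: a crash handler,
# a debug endpoint, or verbose logging middleware.
def crash_report() -> str:
    return "\n".join(f"{k}={v}" for k, v in sorted(os.environ.items()))

report = crash_report()
print("hunter2" in report)  # True: the secret is in the dump
```

Every framework that prints environment state on an unhandled exception does the equivalent of crash_report() for you.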

Vault's model

Vault treats secrets as a service, not a store. The core concepts:

  • Secrets engines are plugins that know how to generate, store, or negotiate credentials. The KV (key-value) engine stores static secrets. The database engine connects to your database and creates dynamic credentials. The PKI engine generates TLS certificates. The AWS engine generates temporary IAM credentials.
  • Auth methods determine how a client proves its identity to Vault. AppRole for applications, Kubernetes for pods, LDAP for humans, JWT/OIDC for integration with your identity provider. Each auth method issues a Vault token with specific policies attached.
  • Policies are Vault's access control model. They define which paths a token can read, write, or manage. A narrow policy for a specific application might only allow reading one database credential path.
  • Leases are time limits on credentials. Dynamic credentials have a TTL. When the TTL expires, Vault revokes the credential at the source. Applications can renew leases while they're healthy; compromised credentials expire automatically.
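A narrow application policy of the kind described above might look like the following HCL sketch (the paths are illustrative; note that KV v2 reads go through the data/ prefix):

```hcl
# Read-only access to one application's static secrets (KV v2 inserts data/).
path "secret/data/myapp/*" {
  capabilities = ["read"]
}

# Permission to request dynamic credentials for one database role.
path "database/creds/myapp-readwrite" {
  capabilities = ["read"]
}
```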

Deployment

For a single-team self-hosted deployment, run Vault with integrated storage (Raft); this replaced the earlier Consul-based HA backend and is now the recommended path. Three nodes for HA, though a single node is appropriate for development and for smaller teams that accept the single-point-of-failure tradeoff.

version: "3.9"
services:
  vault:
    image: hashicorp/vault:1.17
    command: server
    environment:
      VAULT_LOCAL_CONFIG: |
        ui = true
        disable_mlock = true
        storage "raft" {
          path    = "/vault/data"
          node_id = "vault-01"
        }
        listener "tcp" {
          address       = "0.0.0.0:8200"
          tls_cert_file = "/vault/tls/server.crt"
          tls_key_file  = "/vault/tls/server.key"
        }
        api_addr     = "https://vault.yourorg.com:8200"
        cluster_addr = "https://vault-01:8201"
    cap_add:
      - IPC_LOCK
    volumes:
      - vault_data:/vault/data
      - ./tls:/vault/tls:ro
    ports:
      - "8200:8200"

volumes:
  vault_data:

On first start, Vault is sealed: it holds no secrets and will serve no requests until you provide unseal keys. Run vault operator init to generate the root token and unseal keys. Store these in separate secure locations. The unseal keys are not secrets Vault manages; they're the keys to unlock Vault itself. Losing them means losing access to all secrets.

Auto-unseal in production. Manual unsealing means Vault requires human intervention after every restart; a reboot at 3am becomes an outage. Configure auto-unseal using a cloud KMS (AWS KMS, GCP Cloud KMS, Azure Key Vault) or a hardware HSM. Vault protects its unseal material with a key held in the KMS and unseals itself automatically on startup. This is not optional for production.

Migrating from env vars: the KV engine first

Don't start with dynamic credentials. Start by migrating your existing static secrets into Vault's KV v2 engine. This gets your team familiar with the workflow and removes the most dangerous env-var patterns before introducing the complexity of dynamic credentials.

# Enable the KV v2 secrets engine
vault secrets enable -path=secret kv-v2

# Write your first secret
vault kv put secret/myapp/database \
  username="appuser" \
  password="$(cat /path/to/existing-password)"

# Read it back
vault kv get secret/myapp/database

# Read just the password field
vault kv get -field=password secret/myapp/database

KV v2 keeps a version history (default: 10 versions). This matters for auditing: you can see when a secret was last updated and by whom. You can also roll back to a previous version if a rotation goes wrong.
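One stumbling block when moving from the CLI to the HTTP API: KV v2 nests the secret under data.data and carries version info in data.metadata. A sketch with a hardcoded, illustrative response body:

```python
import json

# Illustrative response shape for GET /v1/secret/data/myapp/database (KV v2).
# A real body comes from Vault; this one is hardcoded for demonstration.
raw = """
{
  "data": {
    "data": {"username": "appuser", "password": "s3cr3t"},
    "metadata": {"version": 4, "created_time": "2024-05-01T12:00:00Z"}
  }
}
"""

body = json.loads(raw)
secret = body["data"]["data"]                  # the key/value pairs themselves
version = body["data"]["metadata"]["version"]  # which revision you're reading

print(secret["username"], version)
```

Forgetting the second level of "data" is the classic first-integration bug.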

AppRole auth for applications

Your applications need a way to authenticate to Vault without using another secret (the bootstrap problem). AppRole solves this with two components: a role-id (non-sensitive, can be in config) and a secret-id (sensitive, short-lived, injected at deployment time).

# Enable AppRole auth
vault auth enable approle

# Create a role for your application
vault write auth/approle/role/myapp \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_ttl=10m \
  secret_id_num_uses=1

# Fetch the role-id (store in config, it's not secret)
vault read auth/approle/role/myapp/role-id

# Generate a secret-id (do this at deploy time, not in config)
vault write -f auth/approle/role/myapp/secret-id

The secret_id_num_uses=1 parameter means each secret-id can be used only once: your application uses it to get a token, then it's invalid. This prevents a leaked secret-id from being exploited after the fact. Your deploy pipeline generates a fresh secret-id per deployment and injects it via a mechanism that doesn't persist (a pipeline variable, not a file).
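The login exchange itself is a single unauthenticated POST to auth/approle/login; the response carries the client token under auth.client_token. A standard-library sketch (the role-id and secret-id values are placeholders, and the request is built but not sent here):

```python
import json
import urllib.request

VAULT_ADDR = "https://vault.yourorg.com:8200"  # matches api_addr above

def build_approle_login(role_id: str, secret_id: str) -> urllib.request.Request:
    """Build the AppRole login request. The caller sends it and reads
    auth.client_token from the JSON response."""
    payload = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    return urllib.request.Request(
        f"{VAULT_ADDR}/v1/auth/approle/login",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_approle_login("ROLE_ID_FROM_CONFIG", "SECRET_ID_FROM_PIPELINE")
print(req.full_url)
```

In practice you would use a Vault client library, but the wire protocol is this simple: one POST, one token back.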

Dynamic database credentials

This is where Vault pays for its complexity. Enable the database secrets engine, connect it to your database, and Vault creates short-lived credentials on demand:

# Enable database secrets engine
vault secrets enable database

# Configure the PostgreSQL connection
vault write database/config/myapp-db \
  plugin_name=postgresql-database-plugin \
  allowed_roles="myapp-readonly,myapp-readwrite" \
  connection_url="postgresql://{{username}}:{{password}}@postgres:5432/myappdb" \
  username="vault_admin" \
  password="$VAULT_DB_ADMIN_PASS"

# Define a role with a templated SQL statement
vault write database/roles/myapp-readwrite \
  db_name=myapp-db \
  creation_statements="
    CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';
    GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";
  " \
  default_ttl="1h" \
  max_ttl="24h"

# Request credentials
vault read database/creds/myapp-readwrite

Each call to vault read database/creds/myapp-readwrite creates a new PostgreSQL user with a 1-hour TTL. When the hour is up, Vault drops that user. Your application gets a credential that's unique to it, can't be reused by another application, and expires automatically.
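A common client pattern is to renew the lease once two-thirds of its TTL has elapsed, leaving headroom for retries before expiry. A minimal scheduling sketch (the two-thirds fraction is a convention, not a Vault requirement):

```python
def seconds_until_renewal(lease_duration: int, elapsed: int,
                          fraction: float = 2 / 3) -> int:
    """How long to wait before renewing a lease.

    lease_duration: TTL Vault returned with the credential, in seconds.
    elapsed: seconds since the credential was issued.
    Renew at `fraction` of the TTL so there is slack for retries.
    """
    renew_at = int(lease_duration * fraction)
    return max(0, renew_at - elapsed)

# A 1h lease (matching default_ttl above): renew after 40 minutes.
print(seconds_until_renewal(3600, 0))     # 2400
print(seconds_until_renewal(3600, 3000))  # 0: overdue, renew immediately
```

The Vault Agent (next section) implements this loop for you; you only need it if your application talks to Vault directly.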

The Vault Agent Injector for Kubernetes

If you're running on Kubernetes, the Agent Injector is the cleanest integration path. It runs as a mutating admission webhook: when a pod is created with the right annotations, the injector adds a Vault Agent sidecar that authenticates with Vault using the pod's service account, fetches secrets, and writes them as files into a shared volume.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "myapp"
    vault.hashicorp.com/agent-inject-secret-db-creds: "database/creds/myapp-readwrite"
    vault.hashicorp.com/agent-inject-template-db-creds: |
      {{- with secret "database/creds/myapp-readwrite" -}}
      DATABASE_URL=postgresql://{{ .Data.username }}:{{ .Data.password }}@postgres:5432/myappdb
      {{- end }}
spec:
  serviceAccountName: myapp
  containers:
  - name: myapp
    image: myapp:latest
    # The secret is at /vault/secrets/db-creds, not in env vars
    command: ["/bin/sh", "-c"]
    args: [". /vault/secrets/db-creds && exec /app/myapp"]

The Agent Injector handles lease renewal automatically. When the credential approaches expiry, the agent fetches fresh credentials and updates the file. Your application needs to watch the file for changes (or restart on a signal) โ€” this is the main application-level change required.
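Watching the injected file can be as simple as polling its mtime from a timer loop. A Python sketch (the path matches the annotation above; a production version might use inotify instead, or simply rebuild the connection pool on change):

```python
import os
import tempfile

def check_rotation(path: str, last_mtime: float):
    """Return (mtime, contents) if the file changed since last_mtime,
    else (mtime, None). Call this periodically from a timer loop."""
    mtime = os.stat(path).st_mtime
    if mtime == last_mtime:
        return mtime, None
    with open(path) as f:
        return mtime, f.read()

# Demo with a temp file standing in for /vault/secrets/db-creds.
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".env") as f:
    f.write("DATABASE_URL=postgresql://old")
    path = f.name

mtime, contents = check_rotation(path, 0.0)   # first read always "changes"
with open(path, "w") as f:                    # simulate the agent rewriting it
    f.write("DATABASE_URL=postgresql://new")
os.utime(path, (mtime + 1, mtime + 1))        # force a newer mtime for the demo
mtime, contents = check_rotation(path, mtime)
print("new" in contents)                      # True: rotation detected
os.unlink(path)
```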

What actually breaks during migration

GOTCHA 01
Application restart on rotation
Most applications read env vars once at startup. Dynamic credentials require either a file-watching mechanism, a SIGHUP handler, or a connection pool that tests connections before use and re-authenticates on failure.
GOTCHA 02
Vault availability = your availability
If Vault is down and an application can't fetch credentials, that application can't start. Your Vault deployment is now on the critical path for every service. HA and monitoring are non-negotiable, not optional.
GOTCHA 03
Database connection pool churn
Dynamic database credentials mean your connection pool credentials change. Connection pools that cache credentials indefinitely will fail when the credential expires. Configure max connection age below the credential TTL.
GOTCHA 04
The chicken-and-egg seal problem
Vault is sealed after a restart. Without auto-unseal, everything that depends on Vault is down until a human intervenes. This is the operational reason auto-unseal with a KMS is mandatory in production, not optional.
GOTCHA 05
Audit log volume
Vault's audit log records every request, including every credential read. At scale this generates substantial log volume. Plan your log retention, aggregation, and alerting before enabling audit devices in production.
GOTCHA 06
Migration is not zero-downtime by default
Switching an application from env-var secrets to Vault-fetched secrets is a code change. Plan for a migration period where both paths exist, test thoroughly in staging, and deploy with the ability to roll back.

Vault in the 47Network stack

Every 47Network product and Studio deployment uses Vault for secrets management. The zero-trust Studio service includes Vault deployment as standard: AppRole for application auth, Kubernetes auth for cluster workloads, dynamic PostgreSQL credentials where supported, and the PKI engine for internal certificate management.

The audit device is configured on every deployment and forwarded to the centralised audit log. When an incident requires forensics, the question "which application read this credential, and when?" has an answer.

The migration path we use for new Studio clients: KV v2 first (one sprint), AppRole auth for applications (one sprint), dynamic database credentials for the highest-risk databases (one sprint). Three sprints gets you from env vars to full Vault integration without breaking everything at once.
