Verifying What's Actually Running in Production: Build Diff vs Runtime Reality

Your deployment manifest says payment-api:v2.14.7. Your registry holds an image with digest sha256:9f3a.... Your Kubernetes pod reports it pulled sha256:9f3a.... Three sources agree. So the running container is the build artifact, right?

Not necessarily. And when an autonomous coding agent has been iterating on your Dockerfile for the last two weeks, the gap between built and running is exactly where production incidents hide.

This article shows how to hash what is actually executing in a production container, diff it against the SBOM and layer manifest of the build artifact, and surface drift introduced by agent-authored Dockerfile changes before it causes an outage.

The Three Layers of "What's Running"

When someone says "the v2.14.7 image is in production," they usually mean one of three things:

(1) The deployment manifest says v2.14.7. A YAML claim. Cheap. Often wrong.
(2) The pod's imageID is sha256:9f3a... Closer. This is the digest the kubelet pulled. Still not proof of what's executing.
(3) The hash of the files inside the running container is X. This is ground truth. And almost nobody checks it.

The gap between (2) and (3) is the gap between "we shipped what we built" and "we hope we shipped what we built."
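The first two layers can be read straight from the API server. The following is a minimal sketch, assuming a Deployment named payment-api (the article only gives the app=payment-api label, so the Deployment name is a guess) and a small helper, digest_of, that is introduced here purely for illustration:

```shell
#!/bin/sh
# digest_of REF: strip everything up to the last "@" from an image
# reference, leaving only the digest (for example "sha256:9f3a...").
# Hypothetical helper, not part of kubectl.
digest_of() {
  case "$1" in
    *@*) printf '%s\n' "${1##*@}" ;;
    *)   return 1 ;;   # tag-only reference: there is no digest to extract
  esac
}

# Layer 1: what the manifest claims (a tag, cheap and mutable).
# The Deployment name is an assumption.
claimed_image() {
  kubectl get deployment payment-api -n production \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Layer 2: what the kubelet reports it pulled for pod $1.
pulled_image_id() {
  kubectl get pod "$1" -n production \
    -o jsonpath='{.status.containerStatuses[0].imageID}'
}

# Usage (needs cluster access):
#   claimed_image
#   digest_of "$(pulled_image_id "$POD")"
```

Neither function touches layer (3); they only tell you what the control plane believes.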

Where Agent-Authored Drift Sneaks In

A coding agent rarely modifies a running container directly. The drift it introduces happens earlier, in three places that the deployment manifest cannot see:

Dockerfile rewrites — an agent refactors the Dockerfile to "simplify" a multi-stage build and quietly drops a --chown=nonroot directive. The image still builds. Permissions are now wrong at runtime.
Base image bumps — an agent updates FROM python:3.12-slim to FROM python:3.13-slim because it noticed a CVE. Transitive packages shift. The lockfile inside the container no longer matches the lockfile in the repo.
Build-arg expansions — an agent introduces ARG PIP_INDEX_URL to support a private mirror. CI builds work because CI passes the arg. A developer's local build pulls from the wrong index. The resulting image contains different wheels than CI produced.

In all three cases, your image digest is honest — it does describe the bytes in the registry. But it does not describe what the agent meant to ship, and it does not describe what the source repo claims is in there.
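All three drift sources show up as changes to a small set of Dockerfile lines, so one cheap review aid is to diff only those lines between two revisions. This is a sketch, not a linter: the function names are made up for this example, and the pattern only covers FROM, ARG, and --chown/--chmod flags.

```shell
#!/bin/sh
# drift_relevant_lines DOCKERFILE: print base-image (FROM) lines,
# build-arg (ARG) lines, and any line carrying --chown= or --chmod=.
drift_relevant_lines() {
  grep -i -E '^[[:space:]]*(FROM|ARG)[[:space:]]|--ch(own|mod)=' "$1"
}

# review_dockerfile_change OLD NEW: diff only the drift-relevant lines of
# two Dockerfile revisions. Returns diff's status (1 when they differ).
review_dockerfile_change() {
  rdc_old=$(mktemp)
  rdc_new=$(mktemp)
  drift_relevant_lines "$1" > "$rdc_old" || true   # no match is not an error
  drift_relevant_lines "$2" > "$rdc_new" || true
  rdc_status=0
  diff "$rdc_old" "$rdc_new" || rdc_status=$?
  rm -f "$rdc_old" "$rdc_new"
  return "$rdc_status"
}

# Usage (inside the repository):
#   git show HEAD~1:Dockerfile > /tmp/Dockerfile.prev
#   review_dockerfile_change /tmp/Dockerfile.prev Dockerfile
```

A dropped --chown flag appears as a removed line, and a base image bump appears as a changed FROM line, even when the rest of the Dockerfile was reshuffled.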

Step 1: Hash What's Actually in the Container

The cheapest runtime fingerprint is a recursive hash of the application directory, the part of the filesystem you expect to stay immutable after the build. For a running pod:

# Pick a target pod
POD=$(kubectl get pods -n production -l app=payment-api \
  -o jsonpath='{.items[0].metadata.name}')

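# Hash every regular file under /app (the application root).
# Run find from / with a relative path and a C-locale sort so that the
# per-file lines, and therefore the summary hash, can match the
# build-side fingerprint taken in Step 2.
kubectl exec -n production "$POD" -- sh -c '
  cd / && find ./app -type f -print0 \
    | LC_ALL=C sort -z \
    | xargs -0 sha256sum \
    | sha256sum
' > runtime.fingerprint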

For distroless images that have no shell (which you should prefer — see the gold images guide), use an ephemeral debug container:

kubectl debug -n production "$POD" \
  --image=cgr.dev/chainguard/wolfi-base:latest \
  --target=app \
  -it -- sh -c 'cd /proc/1/root && find ./app -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum | sha256sum'

You now have one hash that summarises the contents of the running application directory. Save it alongside the pod name, the image digest, and a timestamp.
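One way to keep those four values together is a single tab-separated line per check. This is a sketch: the record layout and the name fingerprint_record are assumptions made for this example, not a format the article prescribes.

```shell
#!/bin/sh
# fingerprint_record POD IMAGE_DIGEST FINGERPRINT_FILE
# Print one line: UTC timestamp, pod name, image digest, summary hash.
fingerprint_record() {
  # sha256sum reading stdin prints "<hash>  -"; keep only the hash field.
  fr_hash=$(awk 'NR == 1 { print $1 }' "$3")
  printf '%s\t%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$fr_hash"
}

# Usage (needs cluster access):
#   DIGEST=$(kubectl get pod "$POD" -n production \
#     -o jsonpath='{.status.containerStatuses[0].imageID}')
#   fingerprint_record "$POD" "$DIGEST" runtime.fingerprint >> fingerprints.tsv
```

An append-only file like this is enough to answer "when did this pod last match?" without any extra tooling.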

Step 2: Hash the Build Artifact the Same Way

Pull the image your CI system pushed and run the same fingerprint locally or in a sandboxed runner. Scope the fingerprint to the same immutable application directory you hashed in Step 1 (/app here), not the whole root filesystem — see the caveat below.

IMAGE="ghcr.io/myorg/payment-api@sha256:9f3a4d2e..."
docker pull "$IMAGE"
CID=$(docker create "$IMAGE")
EXTRACT=$(mktemp -d)
trap 'docker rm "$CID" >/dev/null; rm -rf "$EXTRACT"' EXIT

# Extract the entire filesystem once (a single docker API call) rather
# than running `docker cp` per file or parsing `tar -tvf` listings,
# whose column layout differs between GNU and BSD tar (macOS) and
# breaks on filenames with whitespace or non-ASCII characters.
docker export "$CID" | tar -xC "$EXTRACT"

# Hash file contents only, and ONLY under the immutable app dir. find
# -type f excludes directories, device nodes, and symlinks (which have
# no content). LC_ALL=C pins sort order so the final hash is stable
# across locales. sha256sum prints each path next to its hash, so the
# runtime side must be produced from the same relative root
# (cd / && find ./app): a differing path prefix or sort order changes
# the summary hash even when every file is identical.
( cd "$EXTRACT" && find ./app -type f -print0 \
    | LC_ALL=C sort -z \
    | xargs -0 sha256sum ) \
  | sha256sum > build.fingerprint

diff runtime.fingerprint build.fingerprint

Do not expect a clean whole-filesystem diff. A running container's root filesystem is never byte-identical to the exported build artifact: the kubelet rewrites /etc/hosts, /etc/resolv.conf, and /etc/hostname; the app writes logs, PID files, caches, and temp files; /proc, /sys, /dev, and mounted volumes diverge by design. Hash a directory you control and treat as read-only after build (the application root, a vendored-dependencies dir), and explicitly exclude paths the app legitimately writes to at runtime.

Within that scoped directory: if the summary hash matches, the application files in the running container match the build artifact. If it differs, you have drift in code you expected to be immutable — and the next question is what kind. Some differences are still benign (a config file rendered from env vars at startup, for example); maintain an allowlist of expected post-startup mutations and exclude those paths, as noted in Common Failure Modes below.
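When the summary hashes differ, the per-file lines show which files drifted. A minimal sketch, assuming you keep the per-file output (the same pipeline without the final sha256sum) from both sides, and an allowlist file with one path per line. The names per_file_hashes and drifted_files are hypothetical.

```shell
#!/bin/sh
# per_file_hashes ROOT SUBDIR: print "<sha256>  ./SUBDIR/path" for every
# regular file under ROOT/SUBDIR, in bytewise path order.
per_file_hashes() {
  ( cd "$1" && find "./$2" -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum )
}

# drifted_files BUILD_LIST RUNTIME_LIST [ALLOWLIST]
# Drop lines containing any fixed string from ALLOWLIST, then diff.
# Returns diff's status (1 when unexpected drift remains).
drifted_files() {
  df_allow=${3:-/dev/null}
  df_build=$(mktemp)
  df_runtime=$(mktemp)
  grep -v -F -f "$df_allow" "$1" > "$df_build" || true
  grep -v -F -f "$df_allow" "$2" > "$df_runtime" || true
  df_status=0
  diff "$df_build" "$df_runtime" || df_status=$?
  rm -f "$df_build" "$df_runtime"
  return "$df_status"
}

# Usage:
#   per_file_hashes "$EXTRACT" app > build.files
#   kubectl exec -n production "$POD" -- sh -c \
#     'cd / && find ./app -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum' \
#     > runtime.files
#   drifted_files build.files runtime.files allowlist.txt
```

The allowlist match is a plain substring match, so keep entries as specific as full relative paths to avoid hiding unrelated files.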

Step 3: SBOM Diff for Package-Level Drift

A filesystem hash tells you "something changed." An SBOM diff tells you what. Generate an SBOM from both sides and compare:

# SBOM from the build artifact
syft "$IMAGE" -o cyclonedx-json > build.sbom.json

# SBOM from the running container (via debug container).
#
# Caveats for the /proc/1/root path:
# - --target=app shares the target container's PID namespace into the
#   ephemeral container. /proc/1/root then resolves to the target's
#   filesystem root via the kernel's procfs. The cluster must allow
#   ephemeral containers, and the kernel must permit the traversal,
#   which typically means CAP_SYS_PTRACE on the ephemeral container.
#   Which capabilities each `kubectl debug --profile` value grants,
#   and which profile is the default, depends on your kubectl
#   version; check `kubectl debug --help`. This example uses
#   `--profile=sysadmin`, which your cluster admin must permit. When
#   it is blocked (PSA `baseline` / `restricted` namespaces commonly
#   forbid the elevated capabilities the sysadmin profile needs), see
#   "Locked-down clusters" below for working alternatives.
# - syft scanning a live /proc mount may pick up phantom packages
#   from /proc/*/cwd and similar self-referential symlinks. Treat the
#   runtime SBOM as authoritative for installed packages, not for
#   process-level state.
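# --attach streams the scanner's stdout back to this shell; without it
# kubectl debug returns as soon as the ephemeral container is created.
# Confirm the file parses (jq empty runtime.sbom.json) before diffing,
# since kubectl status messages can end up in the redirect.
kubectl debug -n production "$POD" \
  --image=anchore/syft:latest \
  --target=app \
  --profile=sysadmin \
  --attach \
  -- /syft /proc/1/root -o cyclonedx-json > runtime.sbom.json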

# Diff them
jq -S '.components | sort_by(.purl)' build.sbom.json   > build.sorted
jq -S '.components | sort_by(.purl)' runtime.sbom.json > runtime.sorted
diff build.sorted runtime.sorted

A clean diff confirms package parity. Anything that shows up only in the runtime SBOM is suspicious — it was added after the image was built, which means an init container, a sidecar copy, a kubectl exec, or something worse.
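A raw diff of full component objects can be noisy, because fields other than the package identity may differ between an image scan and a directory scan. One alternative, sketched here under the assumption that each component carries a purl (falling back to its name), is to compare package URLs only and label which side each difference came from. The function names are hypothetical.

```shell
#!/bin/sh
# purls_of SBOM_JSON: one package URL per line, sorted bytewise.
# Requires jq. Components without a purl fall back to their name.
purls_of() {
  jq -r '.components[]? | (.purl // .name)' "$1" | LC_ALL=C sort -u
}

# sbom_delta BUILD_PURLS RUNTIME_PURLS
# Inputs must be sorted with LC_ALL=C (purls_of does this).
# Prints "runtime-only<TAB>purl" and "build-only<TAB>purl" lines.
sbom_delta() {
  LC_ALL=C comm -13 "$1" "$2" | awk '{ print "runtime-only\t" $0 }'
  LC_ALL=C comm -23 "$1" "$2" | awk '{ print "build-only\t" $0 }'
}

# Usage:
#   purls_of build.sbom.json   > build.purls
#   purls_of runtime.sbom.json > runtime.purls
#   sbom_delta build.purls runtime.purls
```

Because a purl normally embeds the version, a version bump shows up as a pair: the old version as build-only and the new one as runtime-only.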

Step 4: Tie the Diff Back to the Agent

This is where the workflow earns its keep. When you find drift, the question is not just "what changed" but "who changed it, when, and why." If your build pipeline tags agent-authored commits (see tagging agent-authored code in git), you can trace any drift in the Dockerfile or build configuration back to the prompt that produced it.

A typical lookup:

# What commit produced this image?
# Assumes the build sets the org.opencontainers.image.revision label.
# `crane config` returns the image config JSON directly (one call, no
# manifest hop, no second blob fetch). The labels live at .config.Labels
# in the config blob. This also works when $IMAGE carries an @sha256:
# digest, as it does here.
COMMIT=$(crane config "$IMAGE" \
  | jq -r '.config.Labels."org.opencontainers.image.revision"')

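# Which commit last touched the Dockerfile as of that build, and does it
# carry an agent trailer? ($COMMIT itself may not have changed the
# Dockerfile at all.)
DOCKERFILE_COMMIT=$(git log -1 --format='%H' "$COMMIT" -- Dockerfile)
git log -1 --format='%H %an %s' "$DOCKERFILE_COMMIT"
git log -1 --format='%(trailers:key=Co-authored-by)' "$DOCKERFILE_COMMIT"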

If the output includes a trailer that names an agent, such as Co-Authored-By: Claude or Co-Authored-By: copilot-coding-agent, you have the agent and the diff in one place, plus a pointer to the prompt context if your tagging scheme records it.
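For scripting, it helps to turn the trailer check into a yes/no answer. The sketch below reads a commit message on stdin; the AGENT_PATTERN default is an assumption for illustration and should be replaced with the names your own tagging convention uses.

```shell
#!/bin/sh
# coauthors: read a commit message on stdin and print the value of every
# Co-authored-by trailer (the key is matched case-insensitively).
coauthors() {
  grep -i '^co-authored-by:' | sed 's/^[^:]*:[[:space:]]*//'
}

# Hypothetical default; adjust to your own agent names.
AGENT_PATTERN=${AGENT_PATTERN:-'claude|copilot'}

# is_agent_authored: succeed if any co-author matches AGENT_PATTERN.
is_agent_authored() {
  coauthors | grep -i -E -q "$AGENT_PATTERN"
}

# Usage (inside the repository):
#   git log -1 --format=%B "$COMMIT" | is_agent_authored \
#     && echo "agent-authored"
```

A name match is only as reliable as the trailer discipline behind it: a commit without a trailer proves nothing either way.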

Step 5: Continuous Verification, Not Spot Checks

Running this once during an incident is fire-fighting. Running it on a schedule turns it into a control. A minimal cron-driven verifier:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: runtime-image-verifier
  namespace: security
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: image-verifier
          containers:
            - name: verifier
              image: ghcr.io/myorg/runtime-verifier:v1.4.0
              args:
                - --namespace=production
                - --label-selector=tier=critical
                - --fail-on-drift=true
                - --report-to=https://security.myorg.com/drift
          restartPolicy: OnFailure

The verifier walks every pod matching the selector, runs the fingerprint and SBOM diff, and posts results to a central endpoint. Any drift opens a ticket. Repeated drift on the same service triggers a quarantine policy in Kyverno.
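The article does not show the verifier's internals, so the following is only a guess at what its comparison core could look like: given one line per pod with the expected and observed fingerprints, print a status per pod and fail when any pod drifted. The input layout and the name check_pods are assumptions.

```shell
#!/bin/sh
# check_pods: read "pod<TAB>build_fp<TAB>runtime_fp" lines on stdin,
# print "OK<TAB>pod" or "DRIFT<TAB>pod" for each, and return non-zero
# if any pod drifted (similar in spirit to --fail-on-drift=true).
# A missing fingerprint counts as drift rather than as a match.
check_pods() {
  awk -F '\t' '
    $2 != "" && $2 == $3 { print "OK\t" $1; next }
                         { print "DRIFT\t" $1; drifted++ }
    END                  { exit (drifted > 0) }
  '
}

# Usage:
#   check_pods < fingerprints-by-pod.tsv || echo "drift detected"
```

Counting the DRIFT lines across runs gives the per-service repeat count that a quarantine policy would key on.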

What Good Looks Like

A team that has this control in place can answer four questions in under five minutes during an incident:

Is this pod running the artifact CI built? Yes/no, by fingerprint.
If not, what changed? Package list, by SBOM diff.
Was the Dockerfile touched by an agent recently? Yes/no, by git trailer.
Is the same drift visible in other pods? Count, by report aggregation.

Without these answers, "we deployed v2.14.7" is a comforting hypothesis. With them, it is a verified claim.

Common Failure Modes

Distroless images with no debug tooling: solve with kubectl debug --image=... ephemeral containers; do not bake sh into the runtime image.
Locked-down clusters where --profile=sysadmin is blocked: PSA baseline/restricted namespaces often forbid the capabilities the sysadmin debug profile needs, so you cannot read the live /proc/1/root filesystem from an ephemeral container. Options, in order of preference:
(1) Verify the artifact by its digest instead of the live process. The pod's imageID (sha256:...) is the strongest reference available without reading the live filesystem; syft <image>@<digest> and cosign verify-attestation against that digest confirm what the kubelet pulled, which is sufficient unless you suspect post-pull, in-container mutation.
(2) If you have node access, query the container runtime directly on the node (crictl inspect <container-id> for the rootfs path, then scan it). This needs node SSH or a privileged DaemonSet your platform team controls, not pod-level permissions.
(3) As a last resort, run the verifier as a dedicated security workload in its own namespace with a sysadmin-equivalent profile that your cluster admin has explicitly exempted, rather than weakening the application namespace's PSA profile.
Document which of these your environment permits before an incident. Discovering that the operation is blocked mid-incident wastes the minutes this control is meant to save.
Filesystem timestamps polluting the hash: hash file contents, not metadata. Running sha256sum over the files themselves, as in Steps 1 and 2, ignores timestamps and ownership; do not feed it tar streams or ls -l listings, which embed both.
Init containers that legitimately mutate /app: record an allowlist of expected post-startup mutations (config templates rendered from env vars, for example) and exclude those paths from the diff.
Build reproducibility gaps: if your image is not reproducible byte-for-byte, fingerprint diffs become noisy. Pin base image digests, set SOURCE_DATE_EPOCH, and avoid :latest (see pinning base images when AI agents author Dockerfiles).
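The pinning advice is easy to check mechanically. A minimal sketch, with a hypothetical function name, that lists FROM lines not pinned by digest:

```shell
#!/bin/sh
# unpinned_bases DOCKERFILE: print every FROM line that does not pin its
# image by digest. Succeeds (status 0) when at least one such line exists.
# Limitation: a FROM that names an earlier build stage (FROM builder) or
# scratch is reported too, so review the output rather than gating on it
# blindly.
unpinned_bases() {
  grep -i -E '^[[:space:]]*FROM[[:space:]]' "$1" | grep -v '@sha256:'
}

# Usage:
#   if unpinned_bases Dockerfile; then
#     echo "pin the base images listed above" >&2
#   fi
```

Run as a pre-merge check, this catches an agent swapping a pinned digest back to a floating tag before the image is ever built.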

The Wider Point

Provenance does not stop at the registry. A signed image, a verified SBOM, and a SLSA attestation tell you what was built. None of them tell you what is running. When agents are writing Dockerfiles, build scripts, and Helm charts, the gap between built and running widens — quietly, invisibly, until an incident exposes it.

Hashing the runtime is the cheapest way to close that gap. Do it on every critical pod, on a schedule, and tie the results back to the commits — and the agents — that produced them.

Continue Reading

Container Provenance for AI-Generated Builds: SLSA Attestations When the Source Is Half Human, Half Agent

Pinning Base Images When AI Agents Author Dockerfiles

SBOM Diff for Container Updates Authored by Coding Agents