How to Tag and Track AI-Generated Code in Git

Git's author metadata alone cannot tell you if a commit was written by a human or by an autonomous agent. The author field shows a name. It doesn't show intent, confidence, or model version.

To support audit compliance and make verification meaningful, you need a tagging system that travels with every agent-authored commit. This requires three pieces: commit-time tagging, CI/CD validation, and queryable metadata storage.

Challenge 1: Agent Commits Don't Self-Identify

When a tool such as GitHub Copilot, Cursor, Replit, or Claude Code commits under a developer's own git identity, the resulting git object looks like any other commit:

commit abc123def456
Author: Alice Chen <[email protected]>
Date:   Tue Apr 29 14:32:15 2026 +0000

    feat: add GET /health endpoint
    
    - Returns 200 with uptime metadata
    - Integrates with prometheus metrics

There's no field in git that says "this commit was authored by an AI agent." The commit message doesn't say it. The author field doesn't say it. If you search git history later, you can't distinguish this from a commit Alice actually wrote.

This is the first problem you need to solve.

Approach 1: Commit Author Tagging

Configure your agents to use a consistent, identifiable commit author.

For GitHub Copilot

GitHub Copilot's coding agent (the autonomous PR agent) commits with a fixed bot identity that you cannot rebrand from a config file: the author appears as Copilot with email <id>[email protected], and the committer is copilot-swe-agent[bot]. This identity is set by GitHub and is your reliable signal that a commit came from the agent — query git history against it directly:

# Find all Copilot coding-agent commits (matches both author and committer formats)
git log --all --author="Copilot" --oneline
git log --all --committer="copilot-swe-agent\[bot\]" --oneline

# Count commits per month authored by the Copilot agent
git log --after="2026-01-01" --before="2026-05-01" \
  --author="Copilot" --format="%ai" | cut -c1-7 | sort | uniq -c

What you can configure is the guidance the agent uses, via a .github/copilot-instructions.md file at the repo root. Use this to require the agent to add structured metadata to every commit message it produces:

# .github/copilot-instructions.md

## Commit message requirements

Every commit you author MUST include a trailer block with the
following keys, exactly as written:

```
AI-Agent: GitHub Copilot Coding Agent
AI-Model-Version: <model snapshot string>
AI-Inference-Date: <ISO 8601 UTC timestamp>
AI-Confidence-Score: <0.0–1.0>
```

Place the trailer block at the bottom of the commit message,
separated from the text above it by a blank line. Do not omit any key.

This combination — fixed bot identity for filtering + repo-level instructions file for structured trailers — is the actual GitHub-supported surface for tagging Copilot agent commits.
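A conforming message can also be produced programmatically, for example as a test fixture or inside a custom agent. The sketch below is a minimal Python illustration, not part of any GitHub tooling; the function name `build_commit_message` and its parameters are hypothetical, and the trailer keys are the ones listed in the instructions file above.

```python
from __future__ import annotations


def build_commit_message(subject: str, body: str, trailers: dict[str, str]) -> str:
    """Return a commit message whose last paragraph is a "Key: value" trailer block."""
    for key, value in trailers.items():
        # A key with whitespace or a colon, or a multi-line value, would break
        # line-based parsing of the trailer block.
        if not key or ":" in key or any(ch.isspace() for ch in key):
            raise ValueError(f"invalid trailer key: {key!r}")
        if "\n" in value:
            raise ValueError(f"trailer value for {key} must be a single line")

    parts = [subject.strip()]
    if body.strip():
        parts.append(body.strip())
    parts.append("\n".join(f"{key}: {value}" for key, value in trailers.items()))
    # Subject, body, and trailer block are separated by one blank line each.
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_commit_message(
        "feat: add GET /health endpoint",
        "- Returns 200 with uptime metadata",
        {
            "AI-Agent": "GitHub Copilot Coding Agent",
            "AI-Inference-Date": "2026-04-29T14:32:15Z",
            "AI-Confidence-Score": "0.78",
        },
    ))
```

Rejecting malformed keys at build time is a design choice here: it keeps bad trailers out of history instead of leaving them for a later check to find.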

For Cursor

Cursor's agent settings allow you to configure the commit author in your project's .cursor/config.json:

{
  "agent": {
    "commit_author": {
      "name": "Cursor Agent",
      "email": "[email protected]"
    },
    "model_version": "gpt-4-turbo-2024-04-09",
    "log_all_decisions": true
  }
}

For Custom Agents (LangChain, CrewAI, Claude Agent SDK)

When you build your own autonomous agent system, explicitly configure the agent identity:

import subprocess
from datetime import datetime, timezone

class AutonomousCodeAgent:
    def __init__(self, model_version="gpt-4-turbo-2024-04-09"):
        self.model_version = model_version
        self.agent_identity = "autonomous-code-agent"

    def commit(self, file_path, commit_message):
        agent_email = f"{self.agent_identity}@company.local"

        # Add metadata to the commit message, using the same AI-* keys that
        # the parsing and CI examples in this article look for
        enhanced_message = f"""[agent-v3] {commit_message}

AI-Agent: {self.agent_identity}
AI-Model-Version: {self.model_version}
AI-Inference-Date: {datetime.now(timezone.utc).isoformat()}
"""

        # check=True raises CalledProcessError if git fails, instead of
        # silently continuing
        subprocess.run(["git", "add", "--", file_path], check=True)

        # Pass the identity with -c so it applies to this commit only;
        # `git config user.name ...` would persist in the repository and
        # relabel later human commits as agent commits
        subprocess.run([
            "git",
            "-c", f"user.name={self.agent_identity}",
            "-c", f"user.email={agent_email}",
            "commit", "-m", enhanced_message,
        ], check=True)
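Git also reads the commit identity from environment variables, which scopes it to a single process and leaves the repository configuration untouched. The following is a minimal sketch under that assumption; `agent_git_env` and `commit_as_agent` are hypothetical helper names, while the `GIT_AUTHOR_*` and `GIT_COMMITTER_*` variables are the standard ones git recognizes.

```python
import os
import subprocess


def agent_git_env(identity: str, email_domain: str = "company.local") -> dict:
    """Copy the current environment and add git identity variables for the agent."""
    email = f"{identity}@{email_domain}"
    env = dict(os.environ)
    env.update({
        "GIT_AUTHOR_NAME": identity,
        "GIT_AUTHOR_EMAIL": email,
        "GIT_COMMITTER_NAME": identity,
        "GIT_COMMITTER_EMAIL": email,
    })
    return env


def commit_as_agent(message: str, identity: str = "autonomous-code-agent") -> None:
    """Commit already-staged changes under the agent identity; raises if git fails."""
    subprocess.run(
        ["git", "commit", "-m", message],
        env=agent_git_env(identity),
        check=True,
    )
```

Setting both author and committer means the commit can be found with either `--author` or `--committer` filters.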

Approach 2: Git Commit Message Metadata

Beyond the author field, embed structured metadata in the commit message.

Minimal Format

[agent] <feature description>

Agent-Model: gpt-4-turbo-2024-04-09
Agent-Timestamp: 2026-04-29T14:32:15Z
Agent-Confidence: 0.78

Extended Format (for EU AI Act Record-Keeping)

For regulatory compliance, include more context:

[copilot-agent-v3] feat: add GET /health endpoint

Co-authored-by: [email protected] (review, approval)

---
AI-Agent: GitHub Copilot Coding Agent
AI-Model-Version: gpt-4-turbo-2024-04-09
AI-Inference-Date: 2026-04-29T14:32:15Z
AI-Prompt: "Add a /health endpoint to the Express server that returns uptime and basic metrics"
AI-Context-Window: 8192 tokens
AI-Temperature: 0.7
AI-Model-Provider: OpenAI
AI-Is-Autonomous: true
AI-Confidence-Score: 0.78
AI-Human-Edits-After-Generation: 1 (added error handling in catch block)
AI-Human-Reviewer: [email protected]
AI-Human-Approval-Timestamp: 2026-04-29T14:45:22Z
AI-SBOM-Updated: false
---

This format is:

Searchable: git log --grep="AI-Model-Version: gpt-4-turbo-2024-04-09" lists every commit produced by a specific model
Parseable: CI/CD can extract fields programmatically
Audit-ready: supports the Article 19 minimum 6-month retention floor (longer if your sector law requires it) and aligns with EU AI Act Article 12 logging obligations

Parsing Commit Metadata

Build a small utility to extract agent metadata:

#!/bin/bash
# extract-agent-metadata.sh — extract AI agent metadata from git commits
#
# Commit bodies are multi-line, so we cannot `read hash subject body` from a
# single log stream. Instead, list the hashes, then read each commit's full
# body (%b) one at a time and grep its trailer lines.

git log --format="%H" | while read -r hash; do
  body=$(git log -1 --format="%b" "$hash")
  model=$(echo "$body" | grep "^AI-Model-Version:" | awk '{print $2}')
  confidence=$(echo "$body" | grep "^AI-Confidence-Score:" | awk '{print $2}')
  reviewer=$(echo "$body" | grep "^AI-Human-Reviewer:" | awk '{print $2}')

  if [ -n "$model" ]; then
    echo "$hash|$model|$confidence|$reviewer"
  fi
done | sort | uniq

Output:

abc123def456|gpt-4-turbo-2024-04-09|0.78|[email protected]
def456ghi789|gpt-4-turbo-2024-04-09|0.65|[email protected]
ghi789jkl012|gpt-4-turbo-2024-04-09|0.92|[email protected]
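The awk-based extraction keeps only the first word of each value. That is enough for model, score, and reviewer, but it would truncate multi-word fields such as AI-Prompt. As a rough Python sketch that keeps full values (the name `parse_ai_metadata` is hypothetical, and the "AI-" key prefix is assumed from the extended format above):

```python
import re

# "AI-" plus a hyphenated key, a colon, optional whitespace, then the value.
_FIELD_RE = re.compile(r"^(AI-[A-Za-z0-9-]+):[ \t]*(.*)$")


def parse_ai_metadata(message: str) -> dict:
    """Return {key: value} for every "AI-...: value" line in a commit message."""
    fields = {}
    for line in message.splitlines():
        match = _FIELD_RE.match(line.strip())
        if match:
            key, value = match.groups()
            # Keep the first occurrence if a key is repeated.
            fields.setdefault(key, value.strip())
    return fields


if __name__ == "__main__":
    sample = (
        "feat: add GET /health endpoint\n\n"
        "---\n"
        "AI-Model-Version: gpt-4-turbo-2024-04-09\n"
        "AI-Confidence-Score: 0.78\n"
        "---\n"
    )
    print(parse_ai_metadata(sample))
```

Lines that do not match the pattern, including the `---` delimiters and the subject, are ignored.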

Approach 3: Signed Commits with Agent Identity

For non-repudiation (proving that a commit was signed with the agent's key rather than merely labelled with its name), use git's commit signing feature.

Create an Agent Signing Key

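# Generate a GPG key for the agent.
# %no-protection creates the key without a passphrase so that unattended
# signing works; keep the private key in a secrets manager.
gpg --batch --generate-key <<EOF
    %echo Generating agent key
    Key-Type: RSA
    Key-Length: 4096
    Name-Real: Copilot Agent v3
    Name-Email: [email protected]
    Expire-Date: 0
    %no-protection
    %commit
    %echo done
EOF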

# Get the key ID
gpg --list-secret-keys --keyid-format=long [email protected]

Sign Agent Commits

When the agent commits, sign with its key:

# Configure git to use the agent key
git config user.signingkey <AGENT_KEY_ID>
git config commit.gpgsign true

# Commit with signature
git commit -S -m "[copilot-agent-v3] feature"

# Verify the signature
git verify-commit abc123def456

Output:

gpg: Signature made Tue Apr 29 14:32:15 2026 UTC
gpg: using RSA key ABC123DEF456
gpg: Good signature from "Copilot Agent v3 <[email protected]>"

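This proves the commit was signed with the agent's key, so a forged author name alone cannot impersonate the agent. The guarantee is only as strong as the protection of that private key.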

For even stronger guarantees, use Sigstore keyless signing (see Cryptographically Signing AI-Generated Artifacts with Sigstore).
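When verification fails, it helps to know why: an unsigned commit and a commit signed with a key missing from the local keyring are different problems. Git reports a one-letter signature status through the `%G?` format placeholder. The sketch below maps those letters to descriptions; the helper names are hypothetical, and accepting only "G" by default is an assumed policy, not a git requirement.

```python
import subprocess

# Status letters documented for git's %G? pretty-format placeholder.
SIGNATURE_STATUS = {
    "G": "good signature",
    "B": "bad signature",
    "U": "good signature, unknown validity",
    "X": "good signature that has expired",
    "Y": "good signature made by an expired key",
    "R": "good signature made by a revoked key",
    "E": "signature cannot be checked (for example, missing public key)",
    "N": "no signature",
}


def describe_status(code: str) -> str:
    """Translate a %G? status letter into a readable description."""
    return SIGNATURE_STATUS.get(code, f"unrecognized status {code!r}")


def is_acceptable(code: str, allow_untrusted: bool = False) -> bool:
    """Assumed policy: accept "G", and "U" only when explicitly allowed."""
    return code == "G" or (allow_untrusted and code == "U")


def signature_status(commit: str) -> str:
    """Ask git for the signature status letter of one commit."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%G?", commit],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

A status of "E" usually means the agent's public key has not been imported where the check runs, which is a setup problem, not a tampering signal.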

Approach 4: CI/CD Enforcement

Add a pipeline gate that validates agent metadata before allowing merge:

GitHub Actions Example

name: Validate Agent Commits

on: [pull_request]

jobs:
  validate-agent-metadata:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Fetch all history
      
      - name: Check for agent commits
        run: |
          # Detect agent commits by matching known identities on BOTH author
          # and committer. GitHub Copilot's coding agent authors as "Copilot"
          # and commits as "copilot-swe-agent[bot]"; the generic "agent"
          # alternative also catches custom identities such as
          # "autonomous-code-agent". -E enables the | alternation and -i makes
          # the match case-insensitive.
          AGENT_RE='copilot|cursor|replit|agent'

          # Emit one hash per agent commit (author OR committer matches).
          HASHES=$( {
            git log origin/main..HEAD --format="%H" -E -i --author="$AGENT_RE"
            git log origin/main..HEAD --format="%H" -E -i --committer="$AGENT_RE"
          } | sort -u )

          if [ -z "$HASHES" ]; then
            echo "No agent commits in this PR."
            exit 0
          fi

          for hash in $HASHES; do
            echo "Agent commit detected: $hash"

            # Extract full commit message body
            MESSAGE=$(git log -1 --format="%b" "$hash")

            # Validate required fields
            if ! echo "$MESSAGE" | grep -q "AI-Model-Version:"; then
              echo "❌ Missing AI-Model-Version in commit $hash"
              exit 1
            fi

            if ! echo "$MESSAGE" | grep -q "AI-Inference-Date:"; then
              echo "❌ Missing AI-Inference-Date in commit $hash"
              exit 1
            fi

            if ! echo "$MESSAGE" | grep -q "AI-Human-Reviewer:"; then
              echo "❌ Missing AI-Human-Reviewer in commit $hash"
              exit 1
            fi

            echo "✓ Commit $hash has required metadata"
          done
      
      - name: Enforce signing on agent commits
        run: |
          # verify-commit can only succeed if the agent's public key has been
          # imported into the runner's keyring (gpg --import) earlier in the job.
          AGENT_RE='copilot|cursor|replit|agent'

          HASHES=$( {
            git log origin/main..HEAD --format="%H" -E -i --author="$AGENT_RE"
            git log origin/main..HEAD --format="%H" -E -i --committer="$AGENT_RE"
          } | sort -u )

          for hash in $HASHES; do
            if ! git verify-commit "$hash" 2>/dev/null; then
              echo "❌ Agent commit $hash has no verifiable signature"
              exit 1
            fi
            echo "✓ Agent commit $hash is signed"
          done

This gate matches agent commits by author or committer identity and ensures each one:

Includes the required metadata in the message
Is cryptographically signed
Names a human reviewer (the gate checks that AI-Human-Reviewer is present, not that a review took place)
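The grep checks in the workflow only confirm that each key appears somewhere in the message; they do not look at the values. A stricter check could also reject malformed timestamps and out-of-range scores. The following Python sketch shows one way to do that, assuming the AI-* keys used throughout this article; `validate_agent_message` is a hypothetical name, and the set of required keys mirrors the workflow above.

```python
from __future__ import annotations

from datetime import datetime

REQUIRED_KEYS = ("AI-Model-Version", "AI-Inference-Date", "AI-Human-Reviewer")


def validate_agent_message(message: str) -> list[str]:
    """Return a list of problems; an empty list means the message passes."""
    fields = {}
    for line in message.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.startswith("AI-"):
            fields.setdefault(key, value.strip())

    problems = [f"missing or empty {key}" for key in REQUIRED_KEYS if not fields.get(key)]

    date = fields.get("AI-Inference-Date")
    if date:
        try:
            # fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
            # so normalise it to an explicit UTC offset first.
            datetime.fromisoformat(date.replace("Z", "+00:00"))
        except ValueError:
            problems.append(f"AI-Inference-Date is not ISO 8601: {date}")

    score = fields.get("AI-Confidence-Score")
    if score is not None:
        try:
            in_range = 0.0 <= float(score) <= 1.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append(f"AI-Confidence-Score must be between 0.0 and 1.0: {score}")

    return problems
```

Returning every problem at once, instead of stopping at the first, lets an author fix a rejected commit in one pass.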

Approach 5: Centralized Agent Metadata Database

For large organizations, store agent commit metadata in a queryable database separate from git.

Schema

CREATE TABLE agent_commits (
  commit_hash VARCHAR(40) PRIMARY KEY,
  repository_name VARCHAR(256),
  agent_type VARCHAR(32),        -- 'copilot', 'cursor', 'replit', etc.
  agent_version VARCHAR(32),
  model_version VARCHAR(64),
  model_provider VARCHAR(32),
  inference_timestamp TIMESTAMP,
  confidence_score DECIMAL(3,2),
  human_reviewer VARCHAR(256),
  review_timestamp TIMESTAMP,
  is_autonomous BOOLEAN,
  sbom_updated BOOLEAN,
  deployment_target VARCHAR(128),
  deployment_timestamp TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_agent_type ON agent_commits(agent_type);
CREATE INDEX idx_deployed_timestamp ON agent_commits(deployment_timestamp);

Populating the Database

Hook into your CI/CD to POST metadata after each agent commit:

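#!/bin/bash
# ci/push-agent-metadata.sh

COMMIT_HASH=$(git rev-parse HEAD)
# Lowercase the match so it lines up with the agent_type values in the schema.
AGENT_TYPE=$(git log -1 --format="%an" | grep -ioE 'copilot|cursor|replit' | head -1 | tr '[:upper:]' '[:lower:]')

if [ -z "$AGENT_TYPE" ]; then
  echo "Not an agent commit, skipping"
  exit 0
fi

# Read the body once; -m1 keeps only the first matching line per key.
BODY=$(git log -1 --format="%b")
MODEL_VERSION=$(echo "$BODY" | grep -m1 "^AI-Model-Version:" | cut -d' ' -f2)
CONFIDENCE=$(echo "$BODY" | grep -m1 "^AI-Confidence-Score:" | cut -d' ' -f2)
REVIEWER=$(echo "$BODY" | grep -m1 "^AI-Human-Reviewer:" | cut -d' ' -f2)

# Build the payload with jq so values are JSON-escaped; a missing confidence
# score becomes null instead of producing invalid JSON.
PAYLOAD=$(jq -n \
  --arg commit_hash "$COMMIT_HASH" \
  --arg agent_type "$AGENT_TYPE" \
  --arg model_version "$MODEL_VERSION" \
  --argjson confidence_score "${CONFIDENCE:-null}" \
  --arg human_reviewer "$REVIEWER" \
  '{commit_hash: $commit_hash, agent_type: $agent_type, model_version: $model_version, confidence_score: $confidence_score, human_reviewer: $human_reviewer}')

# --fail makes curl exit non-zero on an HTTP error so the CI step fails visibly.
curl --fail -X POST https://api.company.local/agent-commits \
  -H "Authorization: Bearer $METADATA_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"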
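The same payload can be assembled in Python, where `json.dumps` handles quoting and a missing score is sent as `null`. This is a sketch only: `build_payload` and `post_payload` are hypothetical helpers, the field names follow the script above, and the endpoint and `METADATA_API_TOKEN` variable are the placeholders already used there.

```python
import json
import os
import urllib.request


def build_payload(commit_hash: str, agent_type: str, fields: dict) -> str:
    """Map parsed AI-* commit fields onto the JSON body for the metadata API."""
    score = fields.get("AI-Confidence-Score")
    return json.dumps({
        "commit_hash": commit_hash,
        "agent_type": agent_type.lower(),
        "model_version": fields.get("AI-Model-Version"),
        # A missing score is serialized as null, not as an empty token.
        "confidence_score": float(score) if score else None,
        "human_reviewer": fields.get("AI-Human-Reviewer"),
    })


def post_payload(payload: str, url: str = "https://api.company.local/agent-commits") -> int:
    """POST the payload and return the HTTP status; urlopen raises on HTTP errors."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['METADATA_API_TOKEN']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```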

Querying Agent Code

Once metadata is centralized, you can answer compliance questions instantly:

-- How much code is from agents?
SELECT agent_type, COUNT(*) as commits
FROM agent_commits
WHERE deployment_timestamp IS NOT NULL
GROUP BY agent_type;

-- What % of deployed code was co-authored vs. autonomous?
SELECT is_autonomous,
       COUNT(*) AS commits,
       ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct_of_deployed
FROM agent_commits
WHERE deployment_timestamp IS NOT NULL
GROUP BY is_autonomous;

-- Which model versions are in production?
SELECT model_version, COUNT(*) as commits
FROM agent_commits
WHERE deployment_timestamp IS NOT NULL
GROUP BY model_version;

-- How well are we reviewing agent code?
SELECT COUNT(*) AS commits,
       COUNT(human_reviewer) AS reviewed_commits,
       COUNT(DISTINCT human_reviewer) AS reviewers,
       AVG(confidence_score) AS avg_confidence
FROM agent_commits;
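To try queries like these without a database server, the table can be mocked in an in-memory SQLite database. The sketch below uses a reduced version of the schema and invented sample rows purely for illustration; `make_demo_db`, `autonomy_breakdown`, and `review_coverage` are hypothetical names.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a subset of the agent_commits columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE agent_commits (
            commit_hash TEXT PRIMARY KEY,
            is_autonomous BOOLEAN,
            human_reviewer TEXT,
            deployment_timestamp TIMESTAMP
        )
    """)
    # Invented rows: (hash, autonomous?, reviewer, deployed at)
    conn.executemany("INSERT INTO agent_commits VALUES (?, ?, ?, ?)", [
        ("a1", 1, "reviewer-1", "2026-04-29T15:00:00Z"),
        ("b2", 1, None, "2026-04-29T16:00:00Z"),
        ("c3", 0, "reviewer-2", "2026-04-30T09:00:00Z"),
        ("d4", 1, "reviewer-1", None),  # never deployed, so excluded below
        ("e5", 1, "reviewer-1", "2026-04-30T11:00:00Z"),
    ])
    return conn


def autonomy_breakdown(conn: sqlite3.Connection) -> dict:
    """Return {is_autonomous: percent of deployed agent commits}."""
    rows = conn.execute("""
        SELECT is_autonomous, COUNT(*)
        FROM agent_commits
        WHERE deployment_timestamp IS NOT NULL
        GROUP BY is_autonomous
    """).fetchall()
    total = sum(count for _, count in rows)
    return {bool(flag): round(100.0 * count / total, 1) for flag, count in rows}


def review_coverage(conn: sqlite3.Connection) -> float:
    """Percent of deployed agent commits that name a human reviewer."""
    reviewed, total = conn.execute("""
        SELECT COUNT(human_reviewer), COUNT(*)
        FROM agent_commits
        WHERE deployment_timestamp IS NOT NULL
    """).fetchone()
    return round(100.0 * reviewed / total, 1) if total else 0.0


if __name__ == "__main__":
    db = make_demo_db()
    print(autonomy_breakdown(db))
    print(review_coverage(db))
```

`COUNT(human_reviewer)` skips NULLs while `COUNT(*)` does not, which is what turns the two counts into a coverage ratio.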