Product teams ship fast. Infrastructure teams need to ensure that speed does not introduce configuration drift, unapproved changes, or compliance gaps. The traditional answer is to slow the process down with approvals and change control. The GitOps answer is to make the right path the fast path — by designing a workflow where compliant infrastructure changes are easy to make and non-compliant ones are caught automatically.
This guide covers the practical setup of a Terraform GitOps workflow with policy gates, drift detection, and promotion patterns that engineering teams can actually use.
The Core Model: Git as Infrastructure State
The foundational rule is simple: no infrastructure change may be applied to production without a corresponding change committed to the main branch of your infrastructure repository. This means:
- All Terraform configuration in version control, organised by environment (modules in /modules, environment roots in /envs/production and /envs/staging)
- Remote state stored in a shared backend (S3 + DynamoDB for AWS, GCS for GCP) with locking enabled
- No one has permission to run terraform apply from a local machine against production (enforce via IAM: the CI service account has apply permissions; engineers have plan-only)
- Production applies happen only in CI, triggered by a merge to main
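As a belt-and-braces measure alongside IAM, a wrapper script can refuse to run a production apply outside CI. The sketch below is a minimal illustration, not a substitute for IAM enforcement. It assumes GitHub Actions, which sets CI=true and GITHUB_REF on every run; the function name apply_allowed is hypothetical.

```python
def apply_allowed(target_env: str, environ: dict) -> bool:
    """Return True if an apply against target_env may proceed.

    Assumption: the pipeline runs on GitHub Actions, which exports
    CI=true and GITHUB_REF=refs/heads/<branch> for branch pushes.
    """
    if target_env != "production":
        # This sketch only guards production; other environments pass.
        return True
    running_in_ci = environ.get("CI") == "true"
    on_main = environ.get("GITHUB_REF") == "refs/heads/main"
    return running_in_ci and on_main
```

A wrapper would call apply_allowed("production", dict(os.environ)) and exit non-zero when it returns False, before terraform apply is ever invoked.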
The PR Workflow: Plan as a Review Artefact
The pull request is your change control process. Design the CI workflow to make the plan output immediately visible in the PR:
- On every PR, run terraform plan and post the output as a PR comment (Atlantis does this natively; GitHub Actions can do it with a script)
- Require at least one reviewer to approve after seeing the plan: not just the code, but the actual resource diff
- Block merges if the plan has errors, if policy checks fail, or if the plan touches sensitive resources (production databases, IAM boundaries) without a designated infrastructure reviewer approval
- Include a cost estimate in the plan comment (Infracost integrates with GitHub Actions and most CI systems) so teams see the cost impact of infrastructure changes before they merge
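If you script the PR comment yourself, one option is to summarise the machine-readable plan rather than paste raw output. The sketch below assumes the plan was exported with terraform show -json, whose output lists each resource under resource_changes with a change.actions array. The SENSITIVE_TYPES set and both function names are illustrative assumptions; adjust them to your own review rules.

```python
# Assumption: resource types that need a designated infrastructure reviewer.
SENSITIVE_TYPES = {"aws_db_instance", "aws_iam_policy", "aws_iam_role"}


def summarise_plan(plan: dict) -> dict:
    """Count planned actions and collect changed sensitive resources."""
    counts = {"create": 0, "update": 0, "delete": 0}
    sensitive = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions in (["no-op"], ["read"]):
            continue  # nothing changes for this resource
        for action in actions:
            # A replacement appears as both "delete" and "create".
            if action in counts:
                counts[action] += 1
        if rc["type"] in SENSITIVE_TYPES:
            sensitive.append(rc["address"])
    return {"counts": counts, "sensitive": sorted(sensitive)}


def render_comment(summary: dict) -> str:
    """Format the summary as plain text suitable for a PR comment."""
    c = summary["counts"]
    lines = [
        f"Plan: {c['create']} to add, {c['update']} to change, "
        f"{c['delete']} to destroy."
    ]
    if summary["sensitive"]:
        lines.append("Needs infrastructure reviewer approval:")
        lines.extend(f"- {addr}" for addr in summary["sensitive"])
    return "\n".join(lines)
```

The sensitive list can also drive the merge block: if it is non-empty, the CI job can require approval from the infrastructure reviewer group before the check passes.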
Policy Gates with OPA or Sentinel
Policy as code lets you enforce infrastructure standards automatically without relying on reviewers to catch every issue. Common policies worth codifying:
- No public S3 buckets (enforce block_public_acls = true, together with the other public access block settings)
- All RDS instances must have deletion_protection = true in production
- All EC2 instances must use approved AMI IDs from your golden image list
- All resources must have required tags (environment, team, cost-centre)
- No security group rules allowing ingress from 0.0.0.0/0 on database ports
- All encryption settings must match your baseline (customer-managed KMS keys, not AWS-managed defaults)
Use Open Policy Agent (OPA) with Conftest for open-source policy enforcement in CI. Terraform Cloud (now HCP Terraform) and Terraform Enterprise have Sentinel and OPA policy enforcement built in. Run policy checks before the plan is shown to reviewers, so that a violation fails fast before human review time is spent.
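To make the shape of such checks concrete, here is a sketch of three of the policies above written as plain Python over the JSON plan (the output of terraform show -json). This is a stand-in for illustration only: with OPA and Conftest the same rules would be written in Rego. The DB_PORTS set, the REQUIRED_TAGS set, and the function name are assumptions, and the production-only scoping of the RDS rule is left out for brevity.

```python
DB_PORTS = {3306, 5432}  # assumption: MySQL and PostgreSQL
REQUIRED_TAGS = {"environment", "team", "cost-centre"}


def violations(plan: dict) -> list[str]:
    """Return one message per policy violation found in the plan."""
    found = []
    for rc in plan.get("resource_changes", []):
        if rc["change"]["actions"] == ["delete"]:
            continue  # nothing to validate on a pure delete
        after = rc["change"].get("after") or {}
        addr = rc["address"]

        if rc["type"] == "aws_db_instance" and not after.get("deletion_protection"):
            found.append(f"{addr}: deletion_protection must be true")

        if rc["type"] == "aws_security_group":
            for rule in after.get("ingress") or []:
                open_world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
                lo, hi = rule.get("from_port", 0), rule.get("to_port", 0)
                if open_world and any(lo <= port <= hi for port in DB_PORTS):
                    found.append(f"{addr}: ingress from 0.0.0.0/0 on database port")

        # Only check tags on resources that expose a tags attribute.
        if "tags" in after:
            missing = REQUIRED_TAGS - set(after.get("tags") or {})
            if missing:
                found.append(f"{addr}: missing tags {', '.join(sorted(missing))}")
    return found
```

A CI step would fail the build whenever the returned list is non-empty and print the messages, so the author sees the violation before a reviewer is asked to look.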
Environment Promotion Pattern
Treat infrastructure changes like application releases: test in staging before applying to production.
- Staging and production are separate Terraform workspaces with separate state files
- Changes merge to a staging branch first; CI applies to staging automatically
- After a defined soak period (or manual promotion decision), the staging branch is merged to main and CI applies to production
- Use Terraform module versioning: staging can run a newer module version than production, making it easy to validate changes in a lower environment before promotion
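The promotion rule in the second bullet can be expressed as a small check that a CI job runs before opening or merging the staging-to-main PR. This is a sketch only: the 24-hour SOAK value and the function name are assumptions, and how you record the staging apply time is up to your pipeline.

```python
from datetime import datetime, timedelta, timezone

SOAK = timedelta(hours=24)  # assumption: pick a period that suits your risk profile


def ready_to_promote(staging_applied_at: datetime, now: datetime,
                     manual_override: bool = False) -> bool:
    """True once the change has soaked in staging, or on manual promotion."""
    return manual_override or (now - staging_applied_at) >= SOAK
```

For example, ready_to_promote(applied_at, datetime.now(timezone.utc)) stays False until the change has been live in staging for the full soak period, unless someone passes manual_override=True.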
Drift Detection as a Scheduled Job
Even with strict GitOps controls, drift can occur through emergency console changes, third-party integrations, or automatic resource updates. Catch it early:
- Run a daily scheduled CI job that executes terraform plan against production without applying. If the plan is non-empty (i.e. real state differs from code), post an alert to a Slack channel and create an issue in your infrastructure tracker.
- Classify drift by severity: expected drift (auto-updated security group rules, certificate renewals) vs unexpected drift (resource configuration changed outside Terraform)
- Remediate unexpected drift by either updating the Terraform code to match (if the change was intentional) or reverting via a terraform apply to restore the coded state (if the change was unintentional)
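A scheduled job can detect a non-empty plan without parsing any output by using terraform plan -detailed-exitcode, which exits 0 when there are no changes, 1 on error, and 2 when changes are present. The sketch below wraps that behaviour. The function names and the EXPECTED_DRIFT_TYPES allowlist are hypothetical, and the alerting step (Slack, issue tracker) is deliberately left out.

```python
import subprocess

# Hypothetical allowlist of resource types whose drift you treat as expected.
EXPECTED_DRIFT_TYPES = {"aws_acm_certificate"}


def run_plan(workdir: str) -> int:
    """Run a read-only plan in workdir and return Terraform's exit code."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode",
         "-input=false", "-lock=false", "-no-color"],
        cwd=workdir, capture_output=True, text=True,
    )
    return proc.returncode


def interpret(exit_code: int) -> str:
    """Map the -detailed-exitcode result to a drift status."""
    if exit_code == 0:
        return "clean"   # plan succeeded, nothing to change
    if exit_code == 2:
        return "drift"   # plan succeeded, changes are present
    return "error"       # plan itself failed


def classify(resource_type: str) -> str:
    """Label drift on a resource type as expected or unexpected."""
    return "expected" if resource_type in EXPECTED_DRIFT_TYPES else "unexpected"
```

The job would call interpret(run_plan("envs/production")) and alert only on "drift" or "error"; treating a failed plan as alert-worthy avoids a broken job silently hiding real drift.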
Secret Management Integration
Never store secrets in Terraform code, and treat state as sensitive, because some values will end up there. Integrate secret management from the start:
- Use data sources to read secrets from AWS Secrets Manager, Vault, or GCP Secret Manager at apply time, not hardcoded variable values. Note that a value read through a data source is still written to state.
- Configure remote state encryption, and restrict access to the backend, to protect any sensitive values that end up in state
- Rotate database passwords and API keys through your secrets manager, not through Terraform. Where possible, have Terraform track the secret ARN, not the value
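One way to back up the "no hardcoded values" rule is a lightweight lint in CI. The sketch below is a heuristic only: it uses regular expressions, not an HCL parser, and flags quoted literals assigned to secret-looking names in .tf or .tfvars text. The name pattern and the function name are assumptions, and dedicated secret scanners will catch far more.

```python
import re

# Assumption: attribute names that suggest a secret value.
SECRET_NAME = re.compile(r"(password|secret|token|api_key)", re.IGNORECASE)
# Matches lines of the form: name = "literal"
ASSIGNMENT = re.compile(r'^\s*([A-Za-z0-9_]+)\s*=\s*"([^"]*)"')


def hardcoded_secrets(tf_source: str) -> list[tuple[int, str]]:
    """Return (line number, attribute name) for suspicious literals."""
    hits = []
    for number, line in enumerate(tf_source.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if not match:
            continue  # references such as data.x.y are unquoted and skipped
        name, value = match.groups()
        # Skip empty strings and interpolations like "${var.token}".
        if SECRET_NAME.search(name) and value and "${" not in value:
            hits.append((number, name))
    return hits
```

An assignment such as password = data.aws_secretsmanager_secret_version.db.secret_string passes, because the value is a reference, not a quoted literal.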
Getting Started Without Starting Over
If your infrastructure is not in Terraform today, you do not need to migrate everything at once. A practical sequence:
- Start with new resources — all new infrastructure goes into Terraform from day one
- Import high-change resources first (security groups, IAM policies, load balancers)
- Import stable core resources later (VPCs, databases) once your workflow is established
- Use terraform import with a read-only plan review before any apply on imported resources
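The read-only plan review after an import can be partly automated: once the configuration matches the real resource, the plan should show no changes for the imported address. The sketch below checks this against the JSON plan from terraform show -json, where unchanged resources carry the action "no-op". The function name and the "not-in-plan" marker are hypothetical.

```python
def unsafe_imports(plan: dict, imported: list[str]) -> dict:
    """Map each imported address to its planned actions if not a no-op.

    An empty result means the code matches the imported resources and an
    apply would leave them untouched.
    """
    actions_by_address = {
        rc["address"]: rc["change"]["actions"]
        for rc in plan.get("resource_changes", [])
    }
    problems = {}
    for address in imported:
        actions = actions_by_address.get(address)
        if actions is None:
            problems[address] = ["not-in-plan"]  # hypothetical marker
        elif actions != ["no-op"]:
            problems[address] = actions
    return problems
```

Anything this returns is a prompt to adjust the Terraform code until the plan is clean, not to run apply and let Terraform "fix" a live resource.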
