Terragrunt

Welcome! If you’re here, it means you’re interested in contributing to our infrastructure as code. We encourage and support your involvement. Should you have any questions or need further clarification, please don’t hesitate to reach out for assistance and training.

Folder structure
- Standards we use
Terragrunt configuration
- Consuming outputs from other stacks
Formatting
Naming
AWS authentication
Usage
Atlantis
Install
Secrets
- SOPS
Modules
- Documentation

Folder structure

Terragrunt projects should be organized using the following structure:

<repository root>
└── k8s-clusters                  # Cluster definitions and the YAML files they publish
    └── (cluster-name).yaml       # EksCluster CRD-like definition consumed by the cluster stack
    └── (cluster-name)-platform-info.yaml  # Auto-generated; published by the cluster stack on apply
└── terraform
  └── live                       # Contains the live representation of the infrastructure
      └── default                # Generic and cross-environment stacks
        └── bootstrap            # Base stack added via template
          └── (opentofu files)
          └── terragrunt.hcl     # Terragrunt configuration
        └── sso                  # AWS IAM/SSO configuration
          └── (opentofu files)
          └── terragrunt.hcl     # Terragrunt configuration
      └── production
        └── my-first-application # Application / workload stack
          └── terragrunt.hcl     # Terragrunt configuration
        └── my-other-application # Application / workload stack
          └── terragrunt.hcl     # Terragrunt configuration
        └── eks
          └── (cluster-name)
            └── addons
              └── terragrunt.hcl # Terragrunt configuration
            └── cluster
              └── terragrunt.hcl # Terragrunt configuration
            └── custom-addons
              └── terragrunt.hcl # Terragrunt configuration
        └── networking
          └── base
            └── terragrunt.hcl   # Terragrunt configuration
          └── peering
            └── terragrunt.hcl   # Terragrunt configuration
        └── route53
          └── (opentofu files)
          └── terragrunt.hcl     # Terragrunt configuration
        └── env.hcl              # Contains environment-specific Terragrunt config
      └── ...
        └── ...
      └── .gitignore
      └── aws_provider.hcl       # Terragrunt configuration that defines the common AWS opentofu provider
      └── eks_provider.hcl       # Terragrunt configuration that defines the common EKS opentofu provider
      └── root.hcl               # Main (root) Terragrunt configuration
  └── modules
      └── my-first-application   # Application / workload module
        └── (opentofu files)
      └── my-other-application   # Application / workload module
        └── (opentofu files)
      └── networking-peering     # VPC peering module
        └── (opentofu files)

Standards we use

If code is re-used across multiple environments, it should be made into a module. See also Modules.
Don’t hardcode IDs, ARNs, or other values that are produced by another stack. Read them from the appropriate source — see Consuming outputs from other stacks.

Terragrunt configuration

Terragrunt configuration is defined in a terragrunt.hcl file. This uses the same HCL syntax as OpenTofu itself.

Example of a Terragrunt file that creates an RDS database in a Skyscrapers-managed VPC and makes it reachable from the workers of an EKS cluster:

include "root" {
  path = find_in_parent_folders("root.hcl")
}

include "env" {
  path   = find_in_parent_folders("env.hcl")
  expose = true
}

include "aws_provider" {
  path = find_in_parent_folders("aws_provider.hcl")
}

terraform {
  source = "${get_path_to_repo_root()}/terraform/modules/rds"
}

locals {
  platform_info = yamldecode(file("${get_repo_root()}/k8s-clusters/${include.env.inputs._eks_provider_cluster_name}-platform-info.yaml"))
}

inputs = {
  vpc_id                = local.platform_info.networking.vpcId
  db_subnet_ids         = local.platform_info.networking.privateDbSubnetIds
  cluster_workers_sg_id = local.platform_info.eksCluster.nodes.securityGroupId

  engine_mode    = "provisioned"
  engine_version = "5.7.mysql_aurora.2.11.1"
  instance_class = "db.r5.xlarge"
  instances      = { 1 = {} }
}

Consuming outputs from other stacks

There are three ways to read values produced by another stack — and they aren’t interchangeable. Pick the one that matches the kind of stack you’re reading from.

Cluster and networking metadata: platform-info

For VPC IDs, subnet IDs, EKS worker security groups, OIDC provider ARNs, NAT gateway IPs and the like — anything published by a Skyscrapers-managed eks-cluster or networking-stack apply — read from the *-platform-info.yaml files at the repo root. The cluster stack writes these on every apply, so the file in master is always in sync with reality.

locals {
  platform_info = yamldecode(file("${get_repo_root()}/k8s-clusters/${include.env.inputs._eks_provider_cluster_name}-platform-info.yaml"))
}

inputs = {
  vpc_id                = local.platform_info.networking.vpcId
  private_db_subnets    = local.platform_info.networking.privateDbSubnetIds
  cluster_workers_sg_id = local.platform_info.eksCluster.nodes.securityGroupId
}

Note

The include "env" block needs expose = true so the locals block can read include.env.inputs._eks_provider_cluster_name. The cluster name is set per-environment in <env>/env.hcl.
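As a minimal sketch of how the two files fit together: the cluster name `production-eks-example` and the `cluster_name` local are placeholders for illustration, not values taken from a real environment.

```hcl
# terraform/live/production/env.hcl
# Assumption: the cluster name below is a placeholder.
inputs = {
  _eks_provider_cluster_name = "production-eks-example"
}

# terraform/live/production/<stack>/terragrunt.hcl
include "env" {
  path   = find_in_parent_folders("env.hcl")
  expose = true # without this, include.env.inputs cannot be referenced
}

locals {
  # Resolves to <repo root>/k8s-clusters/production-eks-example-platform-info.yaml
  cluster_name  = include.env.inputs._eks_provider_cluster_name
  platform_info = yamldecode(file("${get_repo_root()}/k8s-clusters/${local.cluster_name}-platform-info.yaml"))
}
```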

A non-exhaustive map of what’s available:

| Use case | Path under `local.platform_info` |
| --- | --- |
| VPC ID | `networking.vpcId` |
| VPC CIDR | `networking.vpcCidrBlock` |
| Private app subnets | `networking.privateAppSubnetIds` |
| Private DB subnets | `networking.privateDbSubnetIds` |
| Public LB subnets | `networking.publicLbSubnetIds` |
| Private/public route tables | `networking.privateRouteTableIds` / `networking.publicRouteTableIds` |
| NAT gateway IPs | `networking.natGatewayIps` |
| Availability zones | `networking.availabilityZones` |
| EKS worker SG ID | `eksCluster.nodes.securityGroupId` |
| EKS worker private subnets | `eksCluster.nodes.privateSubnetIds` |
| EKS OIDC provider ARN | `eksCluster.oidc.providerArn` |
| Cluster FQDN / endpoint / version | `eksCluster.fqdn` / `eksCluster.endpoint` / `eksCluster.version` |

If a stack needs values from more than one cluster (a shared service that two clusters’ workers ingress to, for example), pass the file paths in via inputs and decode in TF code:

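# terragrunt.hcl
inputs = {
  cluster_platform_info_files = [
    "${get_repo_root()}/k8s-clusters/development-eks-example-platform-info.yaml",
    "${get_repo_root()}/k8s-clusters/production-eks-example-platform-info.yaml",
  ]
}

# variables.tf
variable "cluster_platform_info_files" {
  description = "Paths to the platform-info YAML files of the clusters to read"
  type        = list(string)
}

# main.tf
locals {
  # Decode each published platform-info file, then collect the worker security group IDs
  cluster_platform_info = [for path in var.cluster_platform_info_files : yamldecode(file(path))]
  worker_sg_ids         = [for c in local.cluster_platform_info : c.eksCluster.nodes.securityGroupId]
}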
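The collected security group IDs could then be used as follows. This is only a sketch: the `shared_service_sg_id` variable, the resource name and the PostgreSQL port are hypothetical, and it assumes an AWS provider version that offers the `aws_vpc_security_group_ingress_rule` resource.

```hcl
# Hypothetical: ID of the security group protecting the shared service
variable "shared_service_sg_id" {
  description = "Security group ID of the shared service that cluster workers connect to"
  type        = string
}

# One ingress rule per cluster, allowing that cluster's workers in
resource "aws_vpc_security_group_ingress_rule" "from_workers" {
  for_each = toset(local.worker_sg_ids)

  security_group_id            = var.shared_service_sg_id
  referenced_security_group_id = each.value
  ip_protocol                  = "tcp"
  from_port                    = 5432 # placeholder port
  to_port                      = 5432
  description                  = "Allow EKS workers to reach the shared service"
}
```

Because the IDs come from files decoded at plan time, they are known values and can safely drive `for_each`.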

Customer-local stacks: terragrunt `dependency`

For stacks that live inside the same customer repo and produce outputs another stack needs — application-to-application wiring, an SES identity ARN consumed by a few apps, an OIDC provider, a Route53 zone ID, etc. — use a normal terragrunt dependency block:

dependency "ses" {
  config_path = "../ses"
}

inputs = {
  ses_identity_arn = dependency.ses.outputs.domain_identity_arn
}

This pattern requires the upstream stack’s source to be cloneable from the machine running terragrunt, so it’s the right choice only when the upstream is in your own repo. Don’t point a dependency at a Skyscrapers-managed stack — use platform-info instead.
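If the upstream stack has not been applied yet, a plan of the consumer will fail because there are no outputs to read. Terragrunt's `mock_outputs` can cover that case. The sketch below assumes you only want mocks during `validate` and `plan`; the ARN is a made-up placeholder.

```hcl
dependency "ses" {
  config_path = "../ses"

  # Placeholder values, used only while the upstream has no applied outputs yet
  mock_outputs = {
    domain_identity_arn = "arn:aws:ses:eu-west-1:123456789012:identity/example.com"
  }

  # Never let mocks leak into an apply
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  ses_identity_arn = dependency.ses.outputs.domain_identity_arn
}
```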

Reading raw state: `terraform_remote_state`

When the upstream is a stack you operate yourself but you don’t want a hard dependency (typically because the upstream is consumed by many siblings and you want each consumer to evaluate independently, or because the upstream sources from a private repo you don’t want to clone every time you plan), read its state directly from S3.

This is purely an OpenTofu construct, so it goes in your .tf files, not in terragrunt.hcl:

data "terraform_remote_state" "shared_networking" {
  backend = "s3"

  config = {
    profile = var.tf_state_aws_profile
    region  = var.tf_state_aws_region
    bucket  = var.tf_state_bucket
    key     = "shared/networking/base"
  }
}

# Use:
# data.terraform_remote_state.shared_networking.outputs.vpc_id

tf_state_aws_profile, tf_state_aws_region and tf_state_bucket are already injected as terragrunt inputs from root.hcl — you only need to declare them as variables in your stack.
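A minimal sketch of those declarations, for example in the stack's `variables.tf`. The descriptions are illustrative wording, not taken from an existing stack.

```hcl
variable "tf_state_aws_profile" {
  description = "AWS profile used to read the OpenTofu state bucket"
  type        = string
}

variable "tf_state_aws_region" {
  description = "AWS region of the OpenTofu state bucket"
  type        = string
}

variable "tf_state_bucket" {
  description = "Name of the S3 bucket holding the OpenTofu state files"
  type        = string
}
```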

If you’re consuming a remote state from inside a module that’s used by several stacks, take a <thing>_state_key variable so each caller can point at its own upstream:

variable "networking_state_key" {
  description = "S3 state key of the networking/base stack to read VPC info from"
  type        = string
}

data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    profile = var.tf_state_aws_profile
    region  = var.tf_state_aws_region
    bucket  = var.tf_state_bucket
    key     = var.networking_state_key
  }
}

# in the caller's terragrunt.hcl
inputs = {
  networking_state_key = "shared/networking/base"  # or "production/networking/base", etc.
}

Choosing between the three

| Reading from… | Use |
| --- | --- |
| A Skyscrapers-managed cluster or networking stack (anything that publishes `*-platform-info.yaml`) | platform-info via `yamldecode(file(...))` |
| Another stack in your own customer repo (apps, SES, OIDC, Route53, etc.) | terragrunt `dependency` block |
| A stack you run yourself but where you don’t want a `dependency` constraint (e.g. shared-VPC infrastructure read by many consumers) | `data "terraform_remote_state"` in `.tf` code |

Rule of thumb: if the upstream is not code you and your team can git clone and run, you must not use dependency against it — fall back to platform-info or terraform_remote_state.

Formatting

OpenTofu code should always be formatted with tofu fmt -recursive ./ before it is committed. This command takes care of indentation, alignment, …

Note

Having the Terraform plugin installed in your editor also helps with this.

Variables and outputs should have a clear description of what they are for and the expected format. For example:

variable "client_sg_ids" {
  description = "Security group IDs for client access to the Structr instance(s)"
  type        = list(string)
  default     = null
}

output "elb_dns_name" {
  description = "ELB DNS name for the frontend"
  value       = module.elb.elb_dns_name
}

Naming

Resources, variables and outputs should use _ as a separator.

Other than the general naming guidelines, OpenTofu resource names should:

be truncated automatically if they are longer than the maximum allowed length
not be suffixed with the type (e.g. "aws_iam_role" "billing" rather than "aws_iam_role" "billing_role"), as the resource type already carries that information. This also lets you keep names shorter, making it less likely to hit the character limit

And OpenTofu variables and outputs should:

end with the type they’re referring to. For example, if the output is an instance ID, its name should be my_instance_id, not my_instance. This makes it much clearer what the actual output is.
be singular if they’re a single string or number, and plural if they’re a list. For example, if an output contains a list of instance IDs, its name should be my_instance_ids.
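The naming rules above can be illustrated with a short sketch. All names here (`project`, `environment`, the `billing` role and the trust policy) are hypothetical; the 64-character limit is the IAM role name limit.

```hcl
variable "project" {
  description = "Project name used as a prefix for resource names"
  type        = string
}

variable "environment" {
  description = "Environment name, e.g. staging or production"
  type        = string
}

# Plural name because the value is a list
variable "client_sg_ids" {
  description = "Security group IDs allowed to access the service"
  type        = list(string)
  default     = []
}

locals {
  # Truncate automatically so long prefixes never exceed the IAM limit
  billing_role_name = substr("${var.project}-${var.environment}-billing", 0, 64)
}

# "billing", not "billing_role": the resource type already says it is a role
resource "aws_iam_role" "billing" {
  name = local.billing_role_name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Singular, and ends with the type of value it returns
output "billing_role_arn" {
  description = "ARN of the billing IAM role"
  value       = aws_iam_role.billing.arn
}
```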

AWS authentication

To authenticate OpenTofu to AWS, we use a delegated access approach. Instead of directly accessing an “ops” account with its own set of credentials, we authenticate against an “admin” account and configure the OpenTofu AWS provider to assume an admin role in the target “ops” account. See the diagram below.

          1. User with
             access to
             admin account
                  +
                  |
                  |
                  v
            +-----+-----+
            |           |
            | OpenTofu  +-------------+
            |           |             |
            +---+-+-----+             |
                | |                   |
                | |                   |
                | |                   | Direct access to the
3. Assumed role | | 2. Assume         | OpenTofu state S3
   with temp.   | |    role in        | bucket and DynamoDB table
   credentials  | |    ops staging    |
      +---------+ |                   |
      |           v                   |
      |     +-----+------+            |
      |     |            |            |
      |     |   Admin    +<-----------+
      |     |   account  |
      |     |            |
      |     +------------+
      |
      v
+-----+------+          +------------+
|            |          |            |
|  Ops       |          | Ops        |
|  staging   |          | production |
|  account   |          | account    |
|            |          |            |
+------------+          +------------+

Each customer has an “admin” account and at least one “ops” account. The “admin” account is where the OpenTofu state is stored and where all the IAM users that need access to the infrastructure are created. The “ops” accounts are the ones containing the actual operational resources, like EC2 instances, load balancers, etc. Ideally, the “ops” accounts don’t have IAM users with direct access. Instead, they have multiple IAM roles with different sets of capabilities, which can be assumed by users from the “admin” account.

Following a least privilege approach, the user running OpenTofu should have a set of credentials configured to access the “admin” account, with just the following permissions:

access to the S3 bucket containing the OpenTofu state files
access to the DynamoDB table containing the OpenTofu state locks
permission to assume a more privileged role in the target “ops” accounts

Some benefits of this approach:

we don’t have to manage and secure static credentials with direct admin access to each “ops” account.
the credentials provided by the assumed role last for just an hour, which makes them harder to compromise and limits their usefulness if they leak.
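At the provider level, the delegated access described in this section could look roughly like the sketch below. The profile name, region, account ID and role path are placeholders. In this repository the provider is defined centrally in aws_provider.hcl, so treat this as an illustration of the mechanism rather than a copy of that file.

```hcl
provider "aws" {
  region = "eu-west-1" # placeholder region

  # Credentials for the "admin" account (hypothetical profile name)
  profile = "example-admin"

  # Role assumed in the target "ops" account; yields temporary credentials
  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/ops/admin"
    session_name = "opentofu"
  }
}
```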

Usage

Terragrunt is a thin layer on top of OpenTofu, so every OpenTofu command (plan, apply, destroy, ...) can also be run through the terragrunt command. More info: https://terragrunt.gruntwork.io/docs/getting-started/quick-start/

Atlantis

Install

You can install OpenTofu and Terragrunt through the package manager that you use. More info:

Caution

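We lock the version of OpenTofu in order not to accidentally apply breaking changes. This is done in the code, in the terraform block: required_version constrains OpenTofu itself, and the required_providers section constrains the provider versions.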
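A minimal sketch of such a version lock. The version numbers are placeholders, not the versions actually pinned in this repository. `required_version` constrains the OpenTofu CLI itself, while `required_providers` constrains each provider.

```hcl
terraform {
  # Placeholder constraint: allows patch releases only
  required_version = "~> 1.8.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # placeholder constraint
    }
  }
}
```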

Caution

Make sure you don’t have Terraform installed on your machine. If you have, you can set TERRAGRUNT_TFPATH=tofu so Terragrunt will use the OpenTofu binary instead of the Terraform binary.
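As an alternative to the environment variable, Terragrunt also has a `terraform_binary` configuration attribute. Whether this repository's root configuration sets it is an assumption, so check before relying on it.

```hcl
# In the root Terragrunt configuration (sketch)
# Tells Terragrunt to invoke the OpenTofu binary instead of terraform
terraform_binary = "tofu"
```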

Secrets

All secrets such as passwords, certificates, … must be encrypted. You can do this using:

KMS; see the official docs for how.
SOPS with a KMS backend; see the official docs for how.

You can re-use the KMS key used for OpenTofu encryption documented in the customer’s documentation. Usually this key is created through Terragrunt in the general stack.

SOPS

In order to work with SOPS in Terragrunt stacks, the following components are needed. A .sops.yaml file in the corresponding environment folder (e.g. terraform/live/production/.sops.yaml) with the following config:

---
creation_rules:
  - kms: "arn:aws:kms:eu-west-1:123456789012:key/11111111-1111-1111-1111-111111111111"
    role: arn:aws:iam::123456789012:role/ops/admin
    aws_profile: <customer>SharedTooling

The following lines need to be added to the terragrunt.hcl file where you want to load in these secrets:

locals {
  secret_vars = yamldecode(sops_decrypt_file("./secrets.yaml"))
}

inputs = merge({
  ...
}, local.secret_vars)

To create the sops secret file you can just run sops secrets.yaml in the folder where you want the secret to be saved. The content of this file is best structured as yaml.

Important

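Don’t forget to commit the encrypted secrets.yaml file. If your .gitignore would otherwise exclude it, add an exception for it there (for example a !secrets.yaml line) so it can be taken up into the git repository.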

Modules

If you can re-use a set of OpenTofu code, consider adding it as a module.

By default we try to use the upstream modules available in the OpenTofu and Terraform communities: https://github.com/opentofu/registry/tree/main?tab=readme-ov-file and https://registry.terraform.io/browse/modules. Where no upstream module is available, we have created some modules ourselves. You can find them all on GitHub: https://github.com/skyscrapers?utf8=%E2%9C%93&q=terraform-&type=&language=hcl.

Each module must have a README.md consisting of:

A description of what it does.
The requirements the module needs.
Configuration parameter documentation (autogenerated).

Documentation

You should use terraform-docs to automatically generate a variable table from OpenTofu variables for use in documentation.

Use the following parameters:

terraform-docs markdown --sort-by required --escape=false <folder>

You can easily create a shell function for this which also copies the output to your clipboard. For example:

tf-docs () { terraform-docs markdown --sort-by required --escape=false "$1" | <your OS's clipboard>; }

Last updated on May 12, 2026