We previously discussed using Terragrunt to manage your Terraform backend configuration. As a refresher:
- A backend controls where Terraform’s state is stored
- Terraform state maps resources created by Terraform to resource definitions in your `*.tf` files
The next couple of posts will continue exploring backends, this time with a focus on role-based access control (RBAC).
Terraform state is a sensitive resource. It is likely to contain secrets, including passwords and access tokens. Additionally, recovering from a lost state file means either recreating all the infrastructure it contained or spending some quality time running `terraform import` commands and hand-modifying state files.
Chris, BTI360 engineer and author of thirstydeveloper.io, will walk us through how to give our state the protection it deserves using Amazon Web Services.
Controlling State Access with AWS
Our teams generally use the S3 backend, which stores state files as objects within an S3 bucket. When getting started, there are three access levels to consider for your state:
- Backend: A dedicated role Terraform will use when accessing and modifying state during operations performed by IAM users or CI/CD.
- Developer: Permissions needed for manual modifications/intervention by developers. Restricted from permanently deleting state.1
- Administrative: Has full access to state buckets and objects. Access should be highly restricted.
Today’s post will cover the first of the three. We will:
- Create a dedicated `TerraformBackend` IAM role for all state access when running Terraform
- Instruct Terraform to use that role for all S3 backend access
The backend role will give us a foundation to build on with subsequent posts. Let’s get started.
A Chicken and Egg Problem
For creating the backend role, the first question we have to answer is: how are we going to deploy it? We’re all about infrastructure-as-code at BTI360, so manually creating it isn’t an option.
It’s awkward to create the backend role with Terraform itself because it introduces a chicken and egg problem. Terraform needs the role to access the backend, and it needs to access the backend to create the role. While it’s technically possible to deploy the role with Terraform, by initially storing the state with the local backend and then later changing the backend to S3, that approach feels a little inelegant.
We consider this backend role part of the “operational infrastructure” required to run Terraform and, as discussed in our previous post, we prefer to manage operational infrastructure with CloudFormation. Today, we’ll add a backend role to the stack we created last time.
If you care to jump to the end, here’s a link to the full CloudFormation template.
Backend Role CloudFormation Template
Previously, we added our state bucket, log bucket, and lock table to a CloudFormation stack. Here’s a reminder of what the CloudFormation template looked like:
```yaml
---
AWSTemplateFormatVersion: '2010-09-09'
Description: Deploy Terraform operational infrastructure
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Terraform State Resources
        Parameters:
          - StateBucketName
          - StateLogBucketName
          - LockTableName
Parameters:
  StateBucketName:
    Type: String
    Description: Name of the S3 bucket for Terraform state
  StateLogBucketName:
    Type: String
    Description: Name of the S3 bucket for Terraform state logs
  LockTableName:
    Type: String
    Description: Name of the Terraform DynamoDB lock table
Resources:
  TerraformStateLogBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateLogBucketName
      AccessControl: LogDeliveryWrite
  TerraformStateBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateBucketName
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
      LoggingConfiguration:
        DestinationBucketName: !Ref StateLogBucketName
        LogFilePrefix: TFStateLogs/
      PublicAccessBlockConfiguration:
        BlockPublicAcls: True
        BlockPublicPolicy: True
        IgnorePublicAcls: True
        RestrictPublicBuckets: True
      VersioningConfiguration:
        Status: Enabled
  TerraformStateLockTable:
    Type: 'AWS::DynamoDB::Table'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      TableName: !Ref LockTableName
      AttributeDefinitions:
        - AttributeName: LockID
          AttributeType: S
      KeySchema:
        - AttributeName: LockID
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
```
The first change we’ll make is to add a resource to create an IAM managed policy for read-write state access. Add the following to the `Resources` block:
```yaml
  TerraformStateReadWritePolicy:
    Type: 'AWS::IAM::ManagedPolicy'
    Properties:
      ManagedPolicyName: TerraformStateReadWrite
      Path: /terraform/
      Description: Read/write access to Terraform state
      PolicyDocument:
        Version: 2012-10-17
        # Permissions are based on:
        # https://www.terraform.io/docs/backends/types/s3.html#example-configuration
        # https://github.com/gruntwork-io/terragrunt/issues/919
        Statement:
          - Sid: AllowStateBucketList
            Effect: Allow
            Action:
              - 's3:ListBucket'
              - 's3:GetBucketVersioning'
            Resource: !Sub "arn:aws:s3:::${StateBucketName}"
          - Sid: AllowStateReadWrite
            Effect: Allow
            Action:
              - 's3:GetObject'
              - 's3:PutObject'
            Resource: !Sub "arn:aws:s3:::${StateBucketName}/*"
          - Sid: AllowStateLockReadWrite
            Effect: Allow
            Action:
              - 'dynamodb:DescribeTable'
              - 'dynamodb:GetItem'
              - 'dynamodb:PutItem'
              - 'dynamodb:DeleteItem'
            Resource: !Sub "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${LockTableName}"
```
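To see what those `!Sub` expressions produce, here is a small Python sketch (using made-up example values for the account ID, region, and resource names) of the substitution CloudFormation performs. Note that bucket-level actions like `s3:ListBucket` target the bucket ARN itself, while object-level actions like `s3:GetObject` target `/*` under it:

```python
# Illustrative only: expand the !Sub templates from the policy above
# using hypothetical values for the parameters and pseudo parameters.
params = {
    "StateBucketName": "bti360-terraform-state",      # example value
    "LockTableName": "bti360-terraform-state-locks",  # example value
    "AWS::Region": "us-east-1",                       # pseudo parameter
    "AWS::AccountId": "123456789012",                 # pseudo parameter
}

def sub(template: str) -> str:
    """Mimic CloudFormation's !Sub ${...} substitution for plain variables."""
    out = template
    for name, value in params.items():
        out = out.replace("${" + name + "}", value)
    return out

bucket_arn = sub("arn:aws:s3:::${StateBucketName}")
objects_arn = sub("arn:aws:s3:::${StateBucketName}/*")
table_arn = sub("arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${LockTableName}")

print(bucket_arn)   # arn:aws:s3:::bti360-terraform-state
print(objects_arn)  # arn:aws:s3:::bti360-terraform-state/*
print(table_arn)    # arn:aws:dynamodb:us-east-1:123456789012:table/bti360-terraform-state-locks
```

Mixing up the bucket ARN and the `/*` object ARN is a common source of `AccessDenied` errors with the S3 backend, which is why the policy splits them into separate statements.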
If you’re using Terragrunt instead of CloudFormation to create your state bucket, log bucket, and lock table, add two additional policy statements:
```yaml
          - Sid: AllowStateBucketCreation
            Effect: Allow
            Action:
              - 's3:GetBucketAcl'
              - 's3:GetBucketLogging'
              - 's3:CreateBucket'
              - 's3:PutBucketPublicAccessBlock'
              - 's3:PutBucketTagging'
              - 's3:PutBucketPolicy'
              - 's3:PutBucketVersioning'
              - 's3:PutEncryptionConfiguration'
              - 's3:PutBucketAcl'
              - 's3:PutBucketLogging'
            Resource:
              - !Sub "arn:aws:s3:::${StateBucketName}"
              - !Sub "arn:aws:s3:::${StateLogBucketName}"
          - Sid: AllowLockTableCreation
            Effect: Allow
            Action:
              - 'dynamodb:CreateTable'
            Resource: !Sub "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${LockTableName}"
```
This policy grants everything needed by the S3 backend to manage the state. It’s worth noting that Terragrunt requires additional permissions beyond what Terraform specifies.
In particular, this policy does not grant delete access to the Terraform state. As already discussed, losing your state file can create a tremendous amount of pain. Using a versioned S3 bucket helps, but you can still get into trouble if you delete the bucket itself. Since the backend role does not require delete access, it does not get it.
Next, add a second resource to create the backend role and attach the policy to it:
```yaml
  TerraformBackendRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Ref AWS::AccountId
            Action:
              - 'sts:AssumeRole'
            Condition:
              StringEquals:
                aws:PrincipalType: User
              StringLike:
                'aws:PrincipalTag/Terraformer': '*'
      RoleName: TerraformBackend
      Path: /terraform/
      ManagedPolicyArns:
        - !Ref TerraformStateReadWritePolicy
```
In the `AssumeRolePolicyDocument`, I’m specifying who can assume the backend role. There are several ways to go about this, depending on what type of principals are doing the assuming. You could use IAM groups if the principals are IAM users. You could use SAML context keys if dealing with federated users. We will use an attribute-based access control (ABAC) approach to grant access if the principal has a tag of `Terraformer`. In this case, I’m assuming my principals are IAM users and am restricting access down to that principal type, but this same approach should work for other types as well.
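To make the trust policy’s conditions concrete, here is a small Python sketch of the access logic they express. This is purely illustrative — IAM’s actual policy evaluation is far more involved — and the tag values shown are hypothetical:

```python
# Illustrative sketch of the two Condition blocks in the trust policy above.
# NOT how IAM evaluates policies internally; it only models the intended logic.
def may_assume_backend_role(principal_type: str, principal_tags: dict) -> bool:
    # StringEquals: aws:PrincipalType must be "User" (an IAM user)
    if principal_type != "User":
        return False
    # StringLike: aws:PrincipalTag/Terraformer must be present with any value ('*')
    return "Terraformer" in principal_tags

# An IAM user tagged with Terraformer can assume the role...
assert may_assume_backend_role("User", {"Terraformer": "true"})
# ...an untagged user cannot, and neither can an assumed-role session.
assert not may_assume_backend_role("User", {})
assert not may_assume_backend_role("AssumedRole", {"Terraformer": "true"})
```

Because the `StringLike` value is `'*'`, it is the presence of the tag that matters, not its value — granting a developer backend access is as simple as tagging their IAM user.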
That completes our CloudFormation template. The final result is available here.
Deploy your template as a CloudFormation stack either with the CloudFormation management console or the AWS CLI. Here’s a sample command for the latter:
```shell
aws cloudformation deploy \
  --template-file terraform-bootstrap.cf.yml \
  --stack-name terraform-bootstrap \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
    StateBucketName=bti360-terraform-state \
    StateLogBucketName=bti360-terraform-state-logs \
    LockTableName=bti360-terraform-state-locks
```
Configuring Terragrunt
Next, we need to tell Terraform to use the backend role for remote state access. We previously covered how to add a `remote_state` block to our Terragrunt config. To use our new role, we add the `role_arn` property to that `remote_state` block. Here’s an example:
```hcl
remote_state {
  backend = "s3"

  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }

  config = {
    bucket  = "bti360-terraform-state"
    region  = "us-east-1"
    encrypt = true

    role_arn = "arn:aws:iam::YOUR_ADMIN_ACCOUNT_ID:role/terraform/TerraformBackend"

    key            = "${dirname(local.relative_deployment_path)}/${local.stack}.tfstate"
    dynamodb_table = "bti360-terraform-state-locks"

    accesslogging_bucket_name = "bti360-terraform-state-logs"
  }
}
```
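As a side note, here is a Python sketch of how that `key` expression resolves. The `relative_deployment_path` and `stack` values below are hypothetical stand-ins for the locals defined in our earlier Terragrunt configuration:

```python
import os

# Illustrative only: resolve the Terragrunt key expression
#   "${dirname(local.relative_deployment_path)}/${local.stack}.tfstate"
# for a hypothetical deployment directory.
relative_deployment_path = "us-east-1/dev/networking"  # hypothetical local value
stack = "networking"                                   # hypothetical local value

# Terragrunt's dirname() behaves like os.path.dirname for these paths.
key = f"{os.path.dirname(relative_deployment_path)}/{stack}.tfstate"
print(key)  # us-east-1/dev/networking.tfstate
```

Each root module thus gets its own state object under a predictable path in the shared bucket, all accessed through the same backend role.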
Re-run `terragrunt init` for any affected root Terraform modules to switch them over to the new backend configuration, and finally try running `terragrunt plan` and `terragrunt apply`. Everything should work.
Conclusion
We now have a dedicated IAM role for the S3 backend to use. However, we still have to define both administrative and developer-level permissions and lock down access to the Terraform state. Be on the lookout for our next post, which will do just that.
In the meantime, if you care to see a fully worked example that incorporates the RBAC concepts introduced today, check out Chris’ terraform-skeleton series on thirstydeveloper.io.
Footnotes
- Versioning our state bucket enables reverting to previous state versions, further protecting write access.
Interested in Solving Challenging Problems? Work Here!
Are you a software engineer, interested in joining a software company that invests in its teammates and promotes a strong engineering culture? Then you’re in the right place! Check out our current Career Opportunities. We’re always looking for like-minded engineers to join the BTI360 family.