Managing Multiple EKS Clusters with Terraform
One cluster is easy: the provider configuration is static and never changes. Once you start adding multiple clusters, though, your Terraform implementation becomes less DRY and more messy. What I'm going to propose is better and makes your code a lot more flexible. What you can't do is iterate over multiple providers. Why?
Why Iterating Over Providers Doesn't Work
As of 2024, providers can be instantiated with an alias. This allows you to configure multiple static providers and pass the one you need into your module. The key point is that they are static, because a provider configuration needs to outlive any of the resources it creates. Dynamic providers are dangerous in the sense that a provider instance could disappear and leave its resources dangling in state with nothing left to manage them. Here's what the alias looks like:
provider "aws" {
alias = "infrawest"
region = "us-west-2"
}
The alias is defined as a string, but when you pass the aliased provider into a module, Terraform expects a provider reference of the form <provider>.<alias>, not a string:
module "example" {
source = "./example"
providers = {
aws = aws.infrawest
}
}
This is what makes iteration impossible: the reference has to be static. To add onto this, the module itself has to declare which aliased provider configurations it accepts, as a list of provider references in configuration_aliases:
terraform {
  required_providers {
    aws = {
      source                = "hashicorp/aws"
      version               = ">= 2.7.0"
      configuration_aliases = [ aws.infrawest ]
    }
  }
}
So we can't iterate over providers. It is intentionally designed this way, and that doesn't appear to be changing any time soon. So what can we do?
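To make the limitation concrete, here is roughly what the naive attempts look like; Terraform rejects both (the names and regions are just placeholders):
# Attempt 1: a provider block can't use for_each or count
provider "aws" {
  for_each = toset(["us-west-2", "us-east-2"])   # rejected by Terraform
  region   = each.value
}
# Attempt 2: the providers map in a module call only accepts a static
# <provider>.<alias> reference, so you can't index into a provider by key
module "example" {
  source   = "./example"
  for_each = toset(["infrawest", "infraeast"])
  providers = {
    aws = aws[each.key]   # rejected by Terraform
  }
}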
Iterating Over Providers without Iterating Over Providers
I went down this rabbit hole because I wanted to stamp out an EKS cluster, with its own custom configuration and network setup, for each VPC that was created. The details aren't important; just know that I wanted to control the size of my VPC configuration and EKS clusters through a global configuration and iterate over it.
There are a few requirements to the approach I'm going to suggest: the automation is split into two root modules, INFRA and APP; the INFRA module is applied first, since it creates the clusters; and the main.tf and provider.tf in the APP root module are not static, but auto-generated by the INFRA module. Seems simple, and it is.
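For reference, here is roughly the layout I'll be working with. The environments/test-infra directory name is my own assumption; only the modules/* names and the generated test-app files actually appear in the code below.
environments/
  test-infra/            # INFRA root module (directory name assumed); applied first
    main.tf
    provider.tf
  test-app/              # APP root module; main.tf and provider.tf are generated by INFRA
    main.tf
    provider.tf
modules/
  aws-networking/        # VPCs
  aws-eks/               # EKS cluster, node group, IAM, OIDC, access config
  env-app/               # workloads deployed into each cluster
  globals/               # shared config maps (env_eks_config, env_global_config)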
Auto Generating the APP Module Files
There are some upsides to this approach, the main one being that it introduces a separation of concerns between the infrastructure automation that provides the control plane services and the automation that interacts with those control plane services. I think you know where I'm going with this approach.
INFRA main.tf
Call to sub-module aws-eks
# Create EKS Cluster
# TODO Additional validation here to check for public subnet count
#      if public_endpoint_access is true.
# TODO Create Security group for access from each VPC cidr if
#      private endpoint is specified.
module "aws-eks" {
  source     = "../../modules/aws-eks"
  depends_on = [module.aws-networking]
  for_each   = module.globals.env_eks_config[var.environment].vpcs

  project_name = module.globals.env_global_config.project_name
  environment  = var.environment
  vpc_name     = each.key
  vpc_id       = module.aws-networking.vpcs[each.key].vpc_id
  # ...
}
Sub-module: aws-eks outputs.tf
output "eks_cluster_name" {
description = "EKS Cluster Name"
value = aws_eks_cluster.eks_cluster.name
}
output "eks_vpc_name" {
description = "VPC Name"
value = var.vpc_name
}
output "eks_arn" {
description = "EKS Cluster ARN"
value = aws_eks_cluster.eks_cluster.arn
}
output "eks_id" {
description = "EKS Cluster ID"
value = aws_eks_cluster.eks_cluster.id
}
output "eks_endpoint" {
description = "EKS Cluster Endpoint"
value = aws_eks_cluster.eks_cluster.endpoint
}
output "eks_certificate_authority_data" {
description = "EKS Cluster Certificate Authority Data"
value = aws_eks_cluster.eks_cluster.certificate_authority.0.data
}
output "eks_oidc_issuer" {
description = "EKS Cluster OIDC Issuer"
value = aws_eks_cluster.eks_cluster.identity.0.oidc.0.issuer
}
output "eks_cluster_auth" {
description = "EKS Cluster Auth"
value = data.aws_eks_cluster_auth.default
}
The INFRA module iterates over my EKS config and calls a child module, aws-eks, that creates the cluster, node group, permissions, OIDC provider, and access configuration. The config below could define any number of EKS clusters; this one is keyed by the VPC name "egress", which can also be referenced in another config map for the VPC configuration. This is just background on how I'm iterating over the configs; it isn't important to the approach, and you can iterate however you like.
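For context, the eks_cluster_auth output above comes from an aws_eks_cluster_auth data source inside the aws-eks sub-module. A minimal sketch of that corner of the sub-module follows; the naming convention and IAM role are assumptions, and most arguments are omitted.
# modules/aws-eks (sketch, not the full module)
resource "aws_eks_cluster" "eks_cluster" {
  name     = "${var.project_name}-${var.environment}-${var.vpc_name}" # naming convention assumed
  role_arn = aws_iam_role.eks_cluster.arn                             # cluster IAM role, assumed

  vpc_config {
    subnet_ids = var.subnet_ids
  }
}

# Short-lived bearer token for the cluster; this is what the generated
# kubernetes and helm providers use to authenticate
data "aws_eks_cluster_auth" "default" {
  name = aws_eks_cluster.eks_cluster.name
}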
EKS Config
env_eks_config = {
  test = {
    vpcs = {
      "egress" = {
        authentication_mode     = "API_AND_CONFIG_MAP"
        private_endpoint_access = false
        public_endpoint_access  = true
        disk_size               = 20
        instance_types          = ["t3.small"]
        ami_type                = "AL2_x86_64"
        capacity_type           = "ON_DEMAND"
        scaling_config = {
          min_size     = 1
          max_size     = 3
          desired_size = 2
        }
        log_types             = ["api", "audit"]
        log_retention_in_days = 3
        access_configuration = [
          {
            principal_arn = "current_user"
            policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
            access_scope = {
              type = "cluster"
            }
          }
        ]
      }
    }
  }
}
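The globals module itself isn't shown in this post; it simply exposes these maps as outputs, roughly like this (a sketch under that assumption):
# modules/globals/outputs.tf (sketch)
output "env_eks_config" {
  value = local.env_eks_config      # the map shown above, defined in a locals block
}

output "env_global_config" {
  value = local.env_global_config   # e.g. project_name and other shared settings
}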
INFRA main.tf
Back in the main.tf of our INFRA root module, we now have a set of Kubernetes clusters available for our automation to interact with. The problem is that we need a provider configuration for each one, so in the INFRA main.tf we iterate over these clusters and dynamically generate the main.tf and provider.tf for our APP root module.
# Generate Provider File for Kubernetes and Helm
resource "local_file" "provider_file" {
  content  = <<EOT
%{for cluster in module.aws-eks~}
provider "kubernetes" {
  host                   = "${cluster.eks_endpoint}"
  alias                  = "${cluster.eks_vpc_name}"
  cluster_ca_certificate = <<CONTENT
${base64decode(cluster.eks_certificate_authority_data)}
CONTENT
  token                  = "${cluster.eks_cluster_auth.token}"
}

provider "helm" {
  alias = "${cluster.eks_vpc_name}"
  kubernetes {
    host                   = "${cluster.eks_endpoint}"
    cluster_ca_certificate = <<CONTENT
${base64decode(cluster.eks_certificate_authority_data)}
CONTENT
    token                  = "${cluster.eks_cluster_auth.token}"
  }
}
%{endfor~}
provider "aws" {
  region = "${var.region}"
}
EOT
  filename = "${path.cwd}/../${var.environment}-app/provider.tf"
}
# Generate main.tf File for APP
resource "local_file" "main_file" {
content = <<EOT
terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 2.12"
}
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 3.0"
}
}
backend "s3" {
encrypt = true
bucket = "tf-dev-state-environment"
dynamodb_table = "tf-dev-state-locking"
key = "test-app-terraform.tfstate"
region = "us-east-2"
assume_role = {
role_arn = "arn:aws:iam::533267038789:role/tf-dev-state"
}
}
}
%{for cluster in module.aws-eks~}
# Standup App for ${cluster.eks_vpc_name}
module "${cluster.eks_vpc_name}-app" {
source = "../../modules/env-app"
environment = "${var.environment}"
region = "${var.region}"
vpc_name = "${cluster.eks_vpc_name}"
providers = {
kubernetes = kubernetes.${cluster.eks_vpc_name}
helm = helm.${cluster.eks_vpc_name}
}
}
%{endfor~}
EOT
filename = "${path.cwd}/../${var.environment}-app/main.tf"
}
This generates the following main.tf and provider.tf for APP:
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.12"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 3.0"
    }
  }
  backend "s3" {
    encrypt        = true
    bucket         = "tf-dev-state-environment"
    dynamodb_table = "tf-dev-state-locking"
    key            = "test-app-terraform.tfstate"
    region         = "us-east-2"
    assume_role = {
      role_arn = "arn:aws:iam::533267038789:role/tf-dev-state"
    }
  }
}

# Standup App for egress
module "egress-app" {
  source      = "../../modules/env-app"
  environment = "test"
  region      = "us-east-2"
  vpc_name    = "egress"
  providers = {
    kubernetes = kubernetes.egress
    helm       = helm.egress
  }
}

provider "kubernetes" {
  host                   = "https://6A3B0F2954D..."
  alias                  = "egress"
  cluster_ca_certificate = <<CONTENT
-----BEGIN CERTIFICATE-----
MIIDBTCCAe2gAwIBAgIIAdjeuiJdrq8wDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
AxMKa3ViZXJuZXRlczAeFw0yNDAyMTEwNDMwMDFaFw0zNDAyMDgwNDM1MDFaMBUx
EzARBgNVBAMTCmt1YmVybmV0ZXMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQC3F4eJ++woVqMUxSZ59MjzxAKZicyz/jgYxhh9pIRhjkOWaFKGZNXC22Yd
PguUEEwYMwQ6/CJq7k2eoOSnkzqRlcucetMpy0jQdbootqG2OWJ5ZnZOqblInjzA
Y/sytYzy4t/DCX9CHEdtV83P1oeOnZvyJg4W7XKtYMccWB7G4bLRc7KHjYq+q83K
xQYJqb8aqT1xt1l7+aYlKMK0iH6Y6jxO49+hxyLGAh5apzXrZda9G/9EC9IlifHf
...
-----END CERTIFICATE-----
CONTENT
  token                  = "k8s-aws-v1.aHR0cHM6Ly9zdHMudXMtZWF..."
}

provider "helm" {
  alias = "egress"
  kubernetes {
    host                   = "https://6A3B0F2954DEF86..."
    cluster_ca_certificate = <<CONTENT
-----BEGIN CERTIFICATE-----
MIIDBTCCAe2gAwIBAgIIAdjeuiJdrq8wDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
AxMKa3ViZXJuZXRlczAeFw0yNDAyMTEwNDMwMDFaFw0zNDAyMDgwNDM1MDFaMBUx
EzARBgNVBAMTCmt1YmVybmV0ZXMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
...
-----END CERTIFICATE-----
CONTENT
    token                  = "k8s-aws-v1.aHR0cHM6Ly9zdHMudXMtZWFzdC0yLmFtYXpvbmF3cy5jb20vP0FjdGlvbj1HZXRDYWxsZXJJZGVudGl0eSZWZXJzaW9uPTIwMTEtMDYtMTUmWC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBWFlLSlJJSkNRRk5XQ0FLQ..."
  }
}

provider "aws" {
  region = "us-east-2"
}
I've removed a lot of the certificate and token information for obvious reasons, but one thing to keep in mind is that these credentials do expire, so you will need to run the INFRA module again to regenerate the provider.tf file. You could separate the credential and file generation into its own module so that you can target just that part without touching any other parts of the INFRA module. Well, hope this helps!
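If you keep the file generation in the INFRA module, a targeted apply is usually enough to refresh the tokens and regenerate the APP files; something along these lines (the resource names match the local_file resources above, and the test-app path assumes the layout sketched earlier):
# From the INFRA root module: refresh tokens and regenerate the APP files
terraform apply -target=local_file.provider_file -target=local_file.main_file
# Then run the APP root module as usual
cd ../test-app
terraform init
terraform apply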