Setting up a Kubernetes Cluster in Amazon EKS using Terraform
If you check the AWS documentation, they use eksctl to create the EKS cluster. eksctl uses CloudFormation, and even though I could fetch the generated template in the end, eksctl feels like an imperative way of creating an EKS cluster. I prefer to keep track of all of my infrastructure as code, and using eksctl leaves an essential part of the infrastructure out of the codebase: the cluster itself.
I’ll describe how to create a Kubernetes cluster in Amazon EKS using Terraform in this article.
The aws_eks_cluster resource is the one that creates the EKS cluster. It is a simple resource. Its required fields are name, role_arn and vpc_config.
name is the name of the cluster
role_arn is the ARN of the IAM role that the cluster will use
vpc_config includes the VPC ID and the subnets that the cluster will use
With that information, we should be able to create the cluster. In reality, once we set up the aws_eks_cluster resource we'll have a Kubernetes cluster, but we still need to set up the infrastructure for the worker nodes. EKS manages the control plane, so once we've created the cluster, AWS handles that part for us; we remain responsible for the worker nodes, which we'll set up later. Let's start by exploring what each of the aws_eks_cluster fields requires (except name, because naming is the hardest part of programming and is out of this article's scope).
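To make that concrete, here is roughly the shape we are working toward. The resource labels and references below are placeholders of my own choosing; each piece gets defined throughout the rest of the article, and we assemble the real thing in the "Creating the EKS cluster" section:

```hcl
# Preview of the resource's shape. The role, subnets, and security group
# referenced here are defined in the sections that follow.
resource "aws_eks_cluster" "main" {
  name     = "eks-cluster"
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids         = [for s in aws_subnet.private : s.id]
    security_group_ids = [aws_security_group.eks_cluster.id]
  }
}
```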
IAM role
A Kubernetes cluster needs to run on Nodes. Those nodes are EC2 instances, and they need to have permission to run the Kubernetes components. Those permissions are defined in an IAM role, and that role is attached to the EC2 instances that run the Kubernetes components.
The IAM role should allow the cluster to create those nodes (EC2 instances), create load balancers (Ingresses in k8s), manage auto scaling groups, etcetera. So we need to grant permissions to the EKS cluster to create, modify and update all the resources it needs. The good thing is that Amazon already made a policy we can use that handles the basics, arn:aws:iam::aws:policy/AmazonEKSClusterPolicy. You can see how the policy is defined if you log into your AWS console and search for it in the IAM section or by using the following command using the aws cli:
```sh
aws iam get-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
```
From there, you’ll be able to see the DefaultVersionId. To view the default version of the policy, you can use the following command:
```sh
aws iam get-policy-version --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy --version-id v5
```
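In Terraform, the cluster role and the policy attachment can be sketched like this. The role name and resource labels are my own choices, not something AWS requires; the trust policy simply lets the EKS service assume the role:

```hcl
# IAM role that the EKS control plane assumes.
resource "aws_iam_role" "cluster" {
  name = "eks-cluster-role"

  # Only the EKS service can assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "sts:AssumeRole"
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })
}

# Attach the AWS-managed policy that covers the basics.
resource "aws_iam_role_policy_attachment" "cluster_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.cluster.name
}
```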
With the IAM role out of the way, let’s now look at the vpc_config.
VPC configuration and subnets
The VPC configuration has more moving parts. There are specific requirements that the VPC needs to meet for the cluster to work properly. The requirements are specified in the AWS Documentation - Amazon EKS VPC and subnet requirements and considerations. I won’t rewrite what is already explained in the documentation, but there are a few items worth noting:
The VPC must have at least two subnets that are in different availability zones.
The VPC must have DNS hostname and DNS resolution support.
Those are the main ones. Other considerations include having enough IP addresses, etcetera, so check the documentation.
Let’s define the VPC and the subnets. We’ll use the aws_vpc and aws_subnet resources.
I won’t go into detail about how to plan the CIDR blocks and how to do your subnets, but I’ll show you the schema I’ll use:
```
VPC CIDR: 10.0.0.0/18

public_subnets = {
  a = "10.0.0.0/22"
  b = "10.0.4.0/22"
}

private_subnets = {
  a = "10.0.8.0/22"
  b = "10.0.12.0/22"
}
```
If you are looking for a subnet calculator, you can use mine: https://rdicidr.rderik.com/. The observant reader will notice that I like palindromes, and RDICIDR is one! It stands for RDerik Interactive CIDR.
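Incidentally, these blocks line up neatly with Terraform's built-in cidrsubnet() function, so the schema above could also be derived instead of hard-coded. This is purely a stylistic alternative; nothing in the rest of the article depends on it:

```hcl
locals {
  vpc_cidr = "10.0.0.0/18"

  # cidrsubnet(prefix, newbits, netnum): adding 4 bits to a /18 yields /22s,
  # and netnum picks which /22 block within the VPC range.
  public_subnets = {
    a = cidrsubnet(local.vpc_cidr, 4, 0) # 10.0.0.0/22
    b = cidrsubnet(local.vpc_cidr, 4, 1) # 10.0.4.0/22
  }
  private_subnets = {
    a = cidrsubnet(local.vpc_cidr, 4, 2) # 10.0.8.0/22
    b = cidrsubnet(local.vpc_cidr, 4, 3) # 10.0.12.0/22
  }
}
```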
```hcl
data "aws_region" "current" {}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/18"

  # EKS requirement: the VPC must have DNS hostname and DNS resolution
  # support. Otherwise, nodes can't register to your cluster.
  # https://docs.aws.amazon.com/eks/latest/userguide/network_reqs.html
  enable_dns_hostnames = true
  enable_dns_support   = true
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_default_route_table" "public" {
  default_route_table_id = aws_vpc.main.default_route_table_id

  route = [
    {
      cidr_block = "0.0.0.0/0"
      gateway_id = aws_internet_gateway.main.id

      # These empty attributes seem to be required due to an AWS provider bug
      carrier_gateway_id         = ""
      destination_prefix_list_id = ""
      egress_only_gateway_id     = ""
      instance_id                = ""
      ipv6_cidr_block            = ""
      local_gateway_id           = ""
      nat_gateway_id             = ""
      network_interface_id       = ""
      transit_gateway_id         = ""
      vpc_endpoint_id            = ""
      vpc_peering_connection_id  = ""
    }
  ]
}

resource "aws_subnet" "public" {
  for_each = {
    a = "10.0.0.0/22"
    b = "10.0.4.0/22"
  }

  vpc_id                  = aws_vpc.main.id
  availability_zone       = "${data.aws_region.current.name}${each.key}"
  cidr_block              = each.value
  map_public_ip_on_launch = true
}

resource "aws_route_table_association" "public" {
  for_each = aws_subnet.public

  subnet_id      = each.value.id
  route_table_id = aws_vpc.main.default_route_table_id
}

resource "aws_eip" "main" {
  for_each = aws_subnet.public

  vpc = true
}

resource "aws_nat_gateway" "main" {
  for_each = aws_subnet.public

  allocation_id = aws_eip.main[each.key].id
  subnet_id     = each.value.id
}

resource "aws_subnet" "private" {
  for_each = {
    a = "10.0.8.0/22"
    b = "10.0.12.0/22"
  }

  vpc_id            = aws_vpc.main.id
  availability_zone = "${data.aws_region.current.name}${each.key}"
  cidr_block        = each.value
}

resource "aws_route_table" "private" {
  for_each = aws_nat_gateway.main

  vpc_id = aws_vpc.main.id

  route = [
    {
      cidr_block     = "0.0.0.0/0"
      nat_gateway_id = each.value.id

      # These empty attributes seem to be required due to an AWS provider bug
      carrier_gateway_id         = ""
      destination_prefix_list_id = ""
      egress_only_gateway_id     = ""
      gateway_id                 = ""
      instance_id                = ""
      ipv6_cidr_block            = ""
      local_gateway_id           = ""
      network_interface_id       = ""
      transit_gateway_id         = ""
      vpc_endpoint_id            = ""
      vpc_peering_connection_id  = ""
    }
  ]
}

resource "aws_route_table_association" "private" {
  for_each = aws_subnet.private

  subnet_id      = each.value.id
  route_table_id = aws_route_table.private[each.key].id
}
```
We now have a VPC with two public and two private subnets. The public subnets have a route to the internet through an internet gateway, and the private subnets have a route to the internet through a NAT gateway.
EKS Cluster security group
For the VPC configuration, we will need to provide the security groups for the EKS cluster. We'll use the aws_security_group resource. You will notice that the following security group definition references aws_security_group.eks_nodes.id, which hasn't been created yet. We'll create it later when we define the nodes; for the moment, assume that it'll be defined.
```hcl
# EKS Cluster Security Group
resource "aws_security_group" "eks_cluster" {
  name        = "eks-cluster-sg"
  description = "Cluster communication with worker nodes"
  vpc_id      = aws_vpc.main.id
}

resource "aws_security_group_rule" "cluster_inbound" {
  description              = "Allow worker nodes to communicate with the cluster API Server"
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_cluster.id
  source_security_group_id = aws_security_group.eks_nodes.id
}

resource "aws_security_group_rule" "cluster_outbound" {
  description              = "Allow cluster API Server to communicate with the worker nodes"
  type                     = "egress"
  from_port                = 1024
  to_port                  = 65535
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_cluster.id
  source_security_group_id = aws_security_group.eks_nodes.id
}
```
Now let’s look at creating the nodes where the Kubernetes cluster will run.
Creating the worker nodes
There are two main approaches to setting up the worker nodes:
Self-managed nodes - We would need to manage the nodes ourselves using Auto Scaling Groups and all that it entails.
Managed node groups - “Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS Kubernetes clusters.” Amazon EKS nodes
We'll use managed node groups, because they are easier to manage and are the recommended approach.
Let’s start by creating the IAM role that the nodes will use. We’ll call it eks-node-role and attach the AmazonEKSWorkerNodePolicy and AmazonEKS_CNI_Policy policies to it.
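In Terraform, that might look like the following sketch. The trust policy allowing EC2 to assume the role is the standard one for worker nodes; the resource labels are my own:

```hcl
# IAM role that the worker nodes (EC2 instances) assume.
resource "aws_iam_role" "node" {
  name = "eks-node-role"

  # Only EC2 instances can assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = "sts:AssumeRole"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "node_worker_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node.name
}

resource "aws_iam_role_policy_attachment" "node_cni_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node.name
}
```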
If you plan to use the AWS Load Balancer Controller and want to assign its policy directly to the nodes, we can do that now:
```hcl
resource "aws_iam_policy" "alb_controller_policy" {
  name = "AlbControllerPolicy"

  # We are going to use the ALB controller implementation from the
  # Kubernetes SIGs, so the following policy is needed.
  # Source: `curl -o alb_controller_policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.4.4/docs/install/iam_policy.json`
  policy = file("${path.module}/alb_controller_policy.json")
}

resource "aws_iam_role_policy_attachment" "node_alb_controller_policy" {
  policy_arn = aws_iam_policy.alb_controller_policy.arn
  role       = aws_iam_role.node.name
}
```
Now we’ll create the security group for the nodes. We’ll call it eks-node-sg, and we’ll allow traffic from the security group of the control plane.
```hcl
# EKS Node Security Group
resource "aws_security_group" "eks_nodes" {
  name        = "eks-node-sg"
  description = "Security group for all nodes in the cluster"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group_rule" "nodes_internal" {
  description              = "Allow nodes to communicate with each other"
  type                     = "ingress"
  from_port                = 0
  to_port                  = 65535
  protocol                 = "-1"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_nodes.id
}

resource "aws_security_group_rule" "nodes_cluster_inbound" {
  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"
  type                     = "ingress"
  from_port                = 1025
  to_port                  = 65535
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_cluster.id
}

resource "aws_security_group_rule" "nodes_cluster_outbound" {
  description              = "Allow worker Kubelets and pods to send communication to the cluster control plane"
  type                     = "egress"
  from_port                = 1025
  to_port                  = 65535
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_cluster.id
}
```
And we finally have everything to create our EKS cluster.
Creating the EKS cluster
We'll create the cluster using the aws_eks_cluster resource. We'll call it eks-cluster and use everything we created before. We also want to log the cluster's events in CloudWatch, so let's create the log group first:
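As a sketch (the retention period, the set of log types, and the resource labels here are my own choices; adjust them to your needs):

```hcl
# Log group where EKS ships the control plane logs. The name must follow
# the /aws/eks/<cluster-name>/cluster convention for EKS to find it.
resource "aws_cloudwatch_log_group" "eks_cluster" {
  name              = "/aws/eks/eks-cluster/cluster"
  retention_in_days = 30
}

resource "aws_eks_cluster" "main" {
  name     = "eks-cluster"
  role_arn = aws_iam_role.cluster.arn

  # Which control plane components log to CloudWatch.
  enabled_cluster_log_types = ["api", "audit", "authenticator"]

  vpc_config {
    subnet_ids = concat(
      [for s in aws_subnet.public : s.id],
      [for s in aws_subnet.private : s.id],
    )
    security_group_ids = [aws_security_group.eks_cluster.id]
  }

  # Make sure the log group exists before the cluster starts logging.
  depends_on = [aws_cloudwatch_log_group.eks_cluster]
}
```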
Modify the region and cluster name if you choose different ones.
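And since we opted for managed node groups earlier, the node group itself can be sketched like this. The instance type and scaling sizes are placeholder values, and the resource references assume the cluster and node role labels used above:

```hcl
resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "main"
  node_role_arn   = aws_iam_role.node.arn

  # Nodes live in the private subnets; they reach the internet via NAT.
  subnet_ids = [for s in aws_subnet.private : s.id]

  instance_types = ["t3.medium"]

  scaling_config {
    desired_size = 2
    max_size     = 3
    min_size     = 1
  }
}
```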
Final thoughts
Using Terraform to create the EKS cluster seems more convoluted than just typing a couple of lines with eksctl, but it has the benefit of describing what was deployed and having it as part of our Infrastructure as Code.
I hope this article was helpful and shed some light on how to set up your initial EKS cluster.
If you want to check what else I'm currently doing, be sure to follow me on Twitter @rderik or subscribe to the newsletter. If you want to send me a direct message, you can send it to derik@rderik.com.