08 Oct 2020

Building Cloud Infra Using Terraform - Part 1 (Concept of Modules)

Mohamed Muhannad

I’ve been using Terraform to deploy machines to our VMware clusters for a while now. But that has been just straightforward VM creation from a template. Nothing too fancy.

I wanted to learn in more depth how to really leverage Terraform to create the necessary infrastructure components. I came across the concept of using modules: a generic definition for one or more resources that others can reuse to create the same type of resources by passing in a few parameters.

The concept comes from software development practices where you create a module that does one thing and have other parts of the application call that module to perform a certain function. Similarly, by using Terraform modules, we can have a configuration call a module to create a group of resources with the parameters that we provide.

I wanted to give an overview of the concept and usage of modules in this post. There will definitely be more coming soon, where I go into detail with the actual implementation.

Scenario

To help understand the concept of modules better, let’s look at an example of deploying a simple web application to AWS. The diagram below should give the overall idea of what we are trying to build out. (Yes, it’s not a production-grade setup, but for simplicity let’s go with this.)

Explanation

For this web application to run we need the following resources:

An Application Load Balancer (ALB)
An Auto Scaling Group (ASG)
A Relational Database Service (MySQL RDS)
An S3 bucket (S3)

Approaches using Infra-as-Code with Terraform

Configuration in a single file

$:~/tf-infra-demo/single-config$ tree -a
.
├── main.tf
├── outputs.tf
└── variables.tf

0 directories, 3 files

At first glance, this looks like a simple configuration. A few hundred lines in a single main.tf file should have us up and running in no time, right? Um, yeah.. it would get the job done. But…

What if you need to modify a parameter? Would you be able to make that change everywhere the parameter is used? Would you remember to update every reference that depends on it? The chance of mistakes and breakage is too high when you have a huge codebase.

After all, who would want to be the one that breaks the service because of a typo or misconfiguration? Right, no one. So, we should minimize the chances of making mistakes.

Configuration File per Type of Resource

Alright then. Let’s try to break the configuration down into manageable pieces. Based on the type of services required, we can probably have a grouped structure like this:

An ALB configuration
An ASG configuration
A MySQL RDS configuration
An S3 Bucket

For a simple project this is a manageable approach. You can ensure each configuration is maintained separately and dependencies are handled.

But this approach quickly becomes a problem if you have different environments (dev, stage, prod) of your infrastructure. You would need a separate copy of the configuration for each environment, and the copies drift apart as soon as you forget to apply a change to every one of them.

$:~/tf-infra-demo/resource-type-config$ tree -a
.
└── live
    ├── prod
    │   ├── alb.tf
    │   ├── mysql-rds.tf
    │   ├── outputs.tf
    │   ├── s3.tf
    │   ├── variables.tf
    │   └── web-app-asg.tf
    └── stage
        ├── alb.tf
        ├── mysql-rds.tf
        ├── outputs.tf
        ├── s3.tf
        ├── variables.tf
        └── web-app-asg.tf

3 directories, 12 files

How could we make a configuration such that we pass in a few parameters and immediately get all the resources we want? Yes, using modules.

Configuration Files using Modules

We are finally at the real topic of this post. You might be wondering how using a module would make things better. Let’s try this approach and see how it changes the way we deploy infrastructure. We can think of each service as a module. In this example, we have 2 services:

A web app service (an ASG cluster with ALB and S3 bucket for storage)
A database service (a MySQL RDS)

Web App Module

The web-app service module would deploy the required web application into an ASG cluster with an ALB in front of it so that we can access it. It would also deploy the necessary S3 bucket to use for object storage.

To deploy this service module, we would only ask for a few variables:

Environment
Cluster configurations (min, max, desired number and type of instances, custom tags)
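
As a rough sketch, the input variables of the web-app module could be declared like this. The variable names match the arguments used in the deploy configurations later in this post, except custom_tags, which is a hypothetical name for the custom tags input. The types and descriptions are my assumptions, not the final implementation.

```hcl
# modules/web-app/variables.tf (sketch; types and descriptions are assumptions)

variable "environment" {
  description = "Name of the deploy environment, for example stage or prod"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for the instances in the ASG"
  type        = string
}

variable "min_size" {
  description = "Minimum number of instances in the ASG"
  type        = number
}

variable "max_size" {
  description = "Maximum number of instances in the ASG"
  type        = number
}

variable "desired_capacity" {
  description = "Number of instances the ASG should try to keep running"
  type        = number
}

# Hypothetical name for the "custom tags" input; optional, defaults to no extra tags.
variable "custom_tags" {
  description = "Extra tags to apply to the instances"
  type        = map(string)
  default     = {}
}
```

Variables without a default are required, so anyone calling the module has to set them explicitly for their environment.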

MySQL Database Module

The mysql-db service module would deploy an RDS MySQL instance that the “web-app” can use as a database.

To deploy this service module, we would only ask for variables like:

Environment
Database configuration (DB name, user, password, service port)
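
A similar sketch for the mysql-db module. All of the variable names here (db_name, db_username, db_password, db_port) are hypothetical, picked only to mirror the list above.

```hcl
# modules/mysql-db/variables.tf (sketch; all names below are hypothetical)

variable "environment" {
  description = "Name of the deploy environment, for example stage or prod"
  type        = string
}

variable "db_name" {
  description = "Name of the database to create"
  type        = string
}

variable "db_username" {
  description = "Master username for the database"
  type        = string
}

# On Terraform 0.14 and later, sensitive = true can be added here
# so that the value is redacted in plan output.
variable "db_password" {
  description = "Master password for the database"
  type        = string
}

variable "db_port" {
  description = "Port the database listens on"
  type        = number
  default     = 3306 # default MySQL port
}
```

The password should come from outside the configuration (an environment variable such as TF_VAR_db_password, or a secrets store) rather than being committed to the repository.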

Example of Deploying a “web-app” Module

It may not be clear as to how this would work out practically, so looking at an example should help. The directory structure for the project is as follows:

$:~/tf-infra-demo/modules-config$ tree -a
.
├── live
│   ├── prod
│   │   ├── main.tf
│   │   └── outputs.tf
│   └── stage
│       ├── main.tf
│       └── outputs.tf
└── modules
    ├── mysql-db
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    └── web-app
        ├── main.tf
        ├── outputs.tf
        └── variables.tf

6 directories, 10 files

The live folder contains the different deploy environments.
The modules folder contains the service modules.

Let’s take a look at how we used a module in the deploy configurations.

For our Web App service in the stage environment, we have the following configuration.

terraform {
    required_version = ">= 0.13"
}

provider "aws" {
    region = "us-east-1"
}

module "web_app" {
    source = "../../modules/web-app"

    environment = "stage"

    instance_type = "t2.micro"
    min_size = 1
    max_size = 1
    desired_capacity = 1

    enable_autoscaling = false

    server_text = "Hello World Stage"

    # pass all outputs from the stage database remote state to mysql_config
    mysql_config = data.terraform_remote_state.db_stage.outputs
}

For the same Web App service in the prod environment, we have the following configuration.

terraform {
    required_version = ">= 0.13"
}

provider "aws" {
    region = "us-east-1"
}

module "web_app" {
    source = "../../modules/web-app"

    environment = "prod"

    instance_type = "m4.large"
    min_size = 2
    max_size = 6
    desired_capacity = 4

    enable_autoscaling = true

    server_text = "Hello World Prod"

    # pass all outputs from the prod database remote state to mysql_config
    mysql_config = data.terraform_remote_state.db_prod.outputs
}

With only a single module block and a couple of parameters, we can deploy the web application service however we want.
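
Both configurations read mysql_config from a terraform_remote_state data source that is not shown in the snippets above. Here is a minimal sketch of what it could look like for stage, assuming the database state is stored in an S3 backend. The bucket name and key are hypothetical placeholders, and the backend could just as well be a different one.

```hcl
# live/stage/main.tf (sketch; backend type, bucket and key are assumptions)

data "terraform_remote_state" "db_stage" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"          # hypothetical bucket name
    key    = "stage/mysql-db/terraform.tfstate" # hypothetical state key
    region = "us-east-1"
  }
}
```

With this in place, data.terraform_remote_state.db_stage.outputs is an object holding the root-level outputs of the database configuration, which is what gets passed to the module as mysql_config.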

Advantages

This type of abstraction helps us keep the deploy configurations simple and flexible. The developers wouldn’t have to think of how the actual implementation is done. They can independently maintain the deploy configurations for each environment.

Another advantage is that we can version the modules and publish releases, so each environment can stay pinned to a known version and isn’t affected by changes until we choose to upgrade. This lets us add new features to the modules (for example, better monitoring or autoscaling rules for the Web App module) and test them separately without affecting the live environments. Once they have been tested, we can upgrade the live environments as needed.
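
One way to pin a version is to load the module from a Git repository and reference a tag. This is only a sketch: the repository URL and the v0.1.0 tag are made up, and it assumes the repository has the same modules/web-app layout as the tree above.

```hcl
# live/stage/main.tf (sketch; repository URL and tag are hypothetical)

module "web_app" {
  # "//modules/web-app" selects the sub-directory inside the repository,
  # "?ref=v0.1.0" pins the module to that Git tag.
  source = "git::https://example.com/infra/tf-infra-demo.git//modules/web-app?ref=v0.1.0"

  environment = "stage"

  instance_type    = "t2.micro"
  min_size         = 1
  max_size         = 1
  desired_capacity = 1

  enable_autoscaling = false

  server_text = "Hello World Stage"

  mysql_config = data.terraform_remote_state.db_stage.outputs
}
```

Stage could then move to a newer tag first while prod stays on the older one until the change has been verified. After changing the ref, terraform init has to be run again to download that version of the module.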

Caveats

I’ve found that there is a lot of planning involved when developing modules. Some of the questions that come up are:

How do we handle dependencies and grouping of resources into a single module?
What are the configurable variables that need to be exposed?
How do we reference resources made by other modules?
How do we test a module to make sure it’s production ready?
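
For the question of referencing resources made by other modules, the usual building block is an output. Below is a sketch of how the mysql-db module could expose its connection details. The resource name aws_db_instance.mysql and the output names are assumptions, since the module implementation isn’t covered in this post.

```hcl
# modules/mysql-db/outputs.tf (sketch; resource and output names are assumptions)

output "address" {
  description = "Hostname of the RDS instance"
  value       = aws_db_instance.mysql.address
}

output "port" {
  description = "Port the RDS instance listens on"
  value       = aws_db_instance.mysql.port
}
```

A module’s outputs are only visible to the configuration that calls it. For another configuration to read them through terraform_remote_state, the root configuration that deploys the database has to re-export them as its own outputs, for example with value = module.mysql_db.address (where mysql_db is again a hypothetical module name).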

Now, this might be a bit overkill for a simple project like this. But in larger projects, this approach would help keep things simple in the long term.

Conclusion

In summary, we have seen a few approaches on how we can develop Terraform configurations for a small project. Using modules enables us to have a reusable and configurable piece of the infrastructure that can be deployed to various environments.

The use of such abstractions can help keep things simple and less error-prone. It also provides a faster way to get the infrastructure resources up and running.

In the next post (in this series), I plan to go into detail about the actual implementation of this concept.

References

This example architecture and concept were borrowed from the book Terraform: Up and Running by Yevgeniy Brikman. If you need a resource to understand and practice Terraform concepts and best practices, this is the one I recommend getting.

If you’d like to talk more about Terraform or any other tech, say hello on Twitter.