Terraform

March 21, 2025
in AWS, Terraform, DevOps, Career Development, Cloud Resume Challenge
6 min read

Cloud Resume Challenge with Terraform: Final Reflections & Future Directions 🎯

Journey Complete: What We've Built 🏗️

Over the course of this blog series, we've successfully completed the Cloud Resume Challenge using Terraform as our infrastructure-as-code tool. Let's recap what we've accomplished:

Set up our development environment with Terraform and AWS credentials
Deployed a static website using S3, CloudFront, Route 53, and ACM
Built a serverless backend API with API Gateway, Lambda, and DynamoDB
Implemented CI/CD pipelines with GitHub Actions for automated deployments
Added security enhancements like OIDC authentication and least-privilege IAM policies

The final architecture we've created looks like this:

Basic Project Diagram

The most valuable aspect of this project is that we've built a completely automated, production-quality cloud solution. Every component is defined as code, enabling us to track changes, rollback if needed, and redeploy the entire infrastructure with minimal effort.

Key Learnings from the Challenge 🧠

Technical Skills Gained 💻

Throughout this challenge, I've gained significant technical skills:

Terraform expertise: I've moved from basic understanding to writing modular, reusable infrastructure code
AWS service integration: Learned how multiple AWS services work together to create a cohesive system
CI/CD implementation: Set up professional GitHub Actions workflows for continuous deployment
Security best practices: Implemented OIDC, least privilege, encryption, and more
Serverless architecture: Built and connected serverless components for a scalable, cost-effective solution

Unexpected Challenges & Solutions 🔄

The journey wasn't without obstacles. Here are some challenges I faced and how I overcame them:

1. State Management Complexity

Challenge: As the project grew, managing Terraform state became more complex, especially when working across different environments.

Solution: I restructured the project to use workspaces and remote state with careful output references between modules. This improved state organization and made multi-environment deployments more manageable.

2. CloudFront Cache Invalidation

Challenge: Updates to the website weren't immediately visible due to CloudFront caching.

Solution: Implemented proper cache invalidation in the CI/CD pipeline and set appropriate cache behaviors for different file types.

3. CORS Configuration

Challenge: The frontend JavaScript couldn't connect to the API due to CORS issues.

Solution: Added comprehensive CORS handling at both the API Gateway and Lambda levels, ensuring proper headers were returned.

4. CI/CD Authentication Security

Challenge: Initially used long-lived AWS credentials in GitHub Secrets, which posed security risks.

Solution: Replaced with OIDC for keyless authentication between GitHub Actions and AWS, eliminating credential management concerns.

Real-World Applications of This Project 🌐

The skills demonstrated in this challenge directly translate to real-world cloud engineering roles:

1. Infrastructure as Code Expertise

The ability to define, version, and automate infrastructure is increasingly essential in modern IT environments. This project showcases expertise with Terraform that can be applied to any cloud provider or on-premises infrastructure.

2. DevOps Pipeline Creation

Setting up CI/CD workflows that automate testing and deployment demonstrates key DevOps skills that organizations need to accelerate their development cycles.

3. Serverless Architecture Design

The backend API implementation shows understanding of event-driven, serverless architecture patterns that are becoming standard for new cloud applications.

4. Security Implementation

The security considerations throughout the project - from IAM roles to OIDC authentication - demonstrate the ability to build secure systems from the ground up.

Maintaining Your Cloud Resume 🔧

Now that your resume is live, here are some tips for maintaining it:

1. Regular Updates

Set a schedule to update both your resume content and the underlying infrastructure. I recommend:

Monthly content refreshes to keep your experience and skills current
Quarterly infrastructure reviews to apply security patches and update dependencies
Annual architecture reviews to consider new AWS services or features

2. Cost Management

While this solution is relatively inexpensive, it's good practice to set up AWS Budgets and alerts to monitor costs. My current monthly costs are approximately:

S3: ~$0.10 for storage
CloudFront: ~$0.50 for data transfer
Route 53: $0.50 for hosted zone
Lambda: Free tier covers typical usage
DynamoDB: Free tier covers typical usage
API Gateway: ~$1.00 for API calls
Total: ~$2.10/month

3. Monitoring and Alerting

I've set up CloudWatch alarms for:

API errors exceeding normal thresholds
Unusual traffic patterns that might indicate abuse
Lambda function failures

Consider adding application performance monitoring tools like AWS X-Ray for deeper insights.

Future Enhancements 🚀

There are many ways to extend this project further:

1. Content Management System Integration

Add a headless CMS like Contentful or Sanity to make resume updates easier without needing to edit HTML directly:

module "contentful_integration" {
  source = "./modules/contentful"

  api_key     = var.contentful_api_key
  space_id    = var.contentful_space_id
  environment = var.environment
}

resource "aws_lambda_function" "content_sync" {
  function_name = "resume-content-sync-${var.environment}"
  handler       = "index.handler"
  runtime       = "nodejs14.x"
  role          = aws_iam_role.content_sync_role.arn

  environment {
    variables = {
      CONTENTFUL_API_KEY = var.contentful_api_key
      CONTENTFUL_SPACE_ID = var.contentful_space_id
      S3_BUCKET = module.frontend.website_bucket_name
    }
  }
}

2. Advanced Analytics

Implement sophisticated visitor analytics beyond simple counting:

resource "aws_kinesis_firehose_delivery_stream" "visitor_analytics" {
  name        = "resume-visitor-analytics-${var.environment}"
  destination = "extended_s3"

  extended_s3_configuration {
    role_arn   = aws_iam_role.firehose_role.arn
    bucket_arn = aws_s3_bucket.analytics.arn

    processing_configuration {
      enabled = "true"

      processors {
        type = "Lambda"

        parameters {
          parameter_name  = "LambdaArn"
          parameter_value = aws_lambda_function.analytics_processor.arn
        }
      }
    }
  }
}

resource "aws_athena_workgroup" "analytics" {
  name = "resume-analytics-${var.environment}"

  configuration {
    result_configuration {
      output_location = "s3://${aws_s3_bucket.analytics_results.bucket}/results/"
    }
  }
}

3. Multi-Region Deployment

Enhance reliability and performance by deploying to multiple AWS regions:

module "frontend_us_east_1" {
  source = "./modules/frontend"

  providers = {
    aws = aws.us_east_1
  }

  # Configuration for US East region
}

module "frontend_eu_west_1" {
  source = "./modules/frontend"

  providers = {
    aws = aws.eu_west_1
  }

  # Configuration for EU West region
}

resource "aws_route53_health_check" "primary_region" {
  fqdn              = module.frontend_us_east_1.cloudfront_domain_name
  port              = 443
  type              = "HTTPS"
  resource_path     = "/"
  failure_threshold = 3
  request_interval  = 30
}

resource "aws_route53_record" "global" {
  zone_id = data.aws_route53_zone.selected.zone_id
  name    = var.domain_name
  type    = "CNAME"

  failover_routing_policy {
    type = "PRIMARY"
  }

  health_check_id = aws_route53_health_check.primary_region.id
  set_identifier  = "primary"
  records         = [module.frontend_us_east_1.cloudfront_domain_name]
  ttl             = 300
}

4. Infrastructure Testing

Add comprehensive testing using Terratest:

package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestResumeFrontend(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../modules/frontend",
        Vars: map[string]interface{}{
            "environment": "test",
            "domain_name": "test.example.com",
        },
    })

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    // Verify outputs
    bucketName := terraform.Output(t, terraformOptions, "website_bucket_name")
    assert.Contains(t, bucketName, "resume-website-test")
}

Career Impact & Personal Growth 📈

Completing this challenge has had a significant impact on my career development:

Technical Growth

I've moved from basic cloud knowledge to being able to architect and implement complex, multi-service solutions. The hands-on experience with Terraform has been particularly valuable, as it's a highly sought-after skill in the job market.

Portfolio Enhancement

This project now serves as both my resume and a demonstration of my cloud engineering capabilities. I've included the GitHub repository links on my resume, allowing potential employers to see the code behind the deployment.

Community Engagement

Sharing this project through blog posts has connected me with the broader cloud community. The feedback and discussions have been invaluable for refining my approach and learning from others.

Final Thoughts 💭

The Cloud Resume Challenge has been an invaluable learning experience. By implementing it with Terraform, I've gained practical experience with both AWS services and infrastructure as code - skills that are directly applicable to professional cloud engineering roles.

What makes this challenge particularly powerful is how it combines so many aspects of modern cloud development:

Front-end web development
Back-end serverless APIs
Infrastructure as code
CI/CD automation
Security implementation
DNS configuration
Content delivery networks

If you're following along with this series, I encourage you to customize and extend the project to showcase your unique skills and interests. The foundational architecture we've built provides a flexible platform that can evolve with your career.

For those just starting their cloud journey, this challenge offers a perfect blend of practical skills in a realistic project that demonstrates end-to-end capabilities. It's far more valuable than isolated tutorials or theoretical knowledge alone.

The cloud engineering field continues to evolve rapidly, but the principles we've applied throughout this project - automation, security, scalability, and operational excellence - remain constants regardless of which specific technologies are in favor.

What's Next? 🔮

While this concludes our Cloud Resume Challenge series, my cloud learning journey continues. Some areas I'm exploring next include:

Kubernetes and container orchestration
Infrastructure testing frameworks
Cloud cost optimization
Multi-cloud deployments
Infrastructure security scanning
Service mesh implementations

I hope this series has been helpful in your own cloud journey. Feel free to reach out with questions or to share your own implementations of the challenge!

This post concludes our Cloud Resume Challenge with Terraform series. Thanks for following along!

Want to see the Cloud Resume Challenge in action? Visit my resume website and check out the GitHub repositories for the complete code.

Share on Share on

March 20, 2025
in AWS, Terraform, DevOps, CI/CD, Cloud Resume Challenge
14 min read

Cloud Resume Challenge with Terraform: Automating Deployments with GitHub Actions ⚡

In our previous posts, we built the frontend and backend components of our cloud resume project. Now it's time to take our implementation to the next level by implementing continuous integration and deployment (CI/CD) with GitHub Actions.

Why CI/CD Is Critical for Cloud Engineers 🛠️

When I first started this challenge, I manually ran terraform apply every time I made a change. This quickly became tedious and error-prone. As a cloud engineer, I wanted to demonstrate a professional approach to infrastructure management by implementing proper CI/CD pipelines.

Automating deployments offers several key benefits:

Consistency: Every deployment follows the same process
Efficiency: No more manual steps or waiting around
Safety: Automated tests catch issues before they reach production
Auditability: Each change is tracked with a commit and workflow run

This approach mirrors how professional cloud teams work and is a crucial skill for any cloud engineer.

CI/CD Architecture Overview 🏗️

Here's a visual representation of our CI/CD pipelines:

┌─────────────┐          ┌─────────────────┐          ┌─────────────┐
│             │          │                 │          │             │
│  Developer  ├─────────►│  GitHub Actions ├─────────►│  AWS Cloud  │
│  Workstation│          │                 │          │             │
└─────────────┘          └─────────────────┘          └─────────────┘
       │                          │                          ▲
       │                          │                          │
       ▼                          ▼                          │
┌─────────────┐          ┌─────────────────┐                 │
│             │          │                 │                 │
│   GitHub    │          │  Terraform      │                 │
│ Repositories│          │  Plan & Apply   ├─────────────────┘
│             │          │                 │
└─────────────┘          └─────────────────┘

We'll set up separate workflows for:

Frontend deployment: Updates the S3 website content and invalidates CloudFront
Backend deployment: Runs Terraform to update our API infrastructure
Smoke tests: Verifies that both components are working correctly after deployment

Setting Up GitHub Repositories 📁

For this challenge, I've created two repositories:

cloud-resume-frontend: Contains HTML, CSS, JavaScript, and frontend deployment workflows
cloud-resume-backend: Contains Terraform configuration, Lambda code, and backend deployment workflows

Repository Structure

Here's how I've organized my repositories:

Frontend Repository:

cloud-resume-frontend/
├── .github/
│   └── workflows/
│       └── deploy.yml
├── website/
│   ├── index.html
│   ├── styles.css
│   ├── counter.js
│   └── error.html
├── tests/
│   └── cypress/
│       └── integration/
│           └── counter.spec.js
└── README.md

Backend Repository:

cloud-resume-backend/
├── .github/
│   └── workflows/
│       └── deploy.yml
├── lambda/
│   └── visitor_counter.py
├── terraform/
│   ├── modules/
│   │   ├── backend/
│   │   │   ├── api_gateway.tf
│   │   │   ├── dynamodb.tf
│   │   │   ├── lambda.tf
│   │   │   ├── variables.tf
│   │   │   └── outputs.tf
│   ├── environments/
│   │   ├── dev/
│   │   │   └── main.tf
│   │   └── prod/
│   │       └── main.tf
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── tests/
│   └── test_visitor_counter.py
└── README.md

Securing AWS Authentication in GitHub Actions 🔒

Before setting up our workflows, we need to address a critical security concern: how to securely authenticate GitHub Actions with AWS.

In the past, many tutorials recommended storing AWS access keys as GitHub Secrets. This approach works but has significant security drawbacks:

Long-lived credentials are a security risk
Credential rotation is manual and error-prone
Access is typically overly permissive

Instead, I'll implement a more secure approach using OpenID Connect (OIDC) for keyless authentication between GitHub Actions and AWS.

Setting Up OIDC Authentication

First, create an IAM OIDC provider for GitHub in your AWS account:

# oidc-provider.tf
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

Then, create an IAM role that GitHub Actions can assume:

# oidc-role.tf
resource "aws_iam_role" "github_actions" {
  name = "github-actions-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRoleWithWebIdentity"
        Effect = "Allow"
        Principal = {
          Federated = aws_iam_openid_connect_provider.github.arn
        }
        Condition = {
          StringEquals = {
            "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          }
          StringLike = {
            "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:*"
          }
        }
      }
    ]
  })
}

# Attach policies to the role
resource "aws_iam_role_policy_attachment" "terraform_permissions" {
  role       = aws_iam_role.github_actions.name
  policy_arn = aws_iam_policy.terraform_permissions.arn
}

resource "aws_iam_policy" "terraform_permissions" {
  name        = "terraform-deployment-policy"
  description = "Policy for Terraform deployments via GitHub Actions"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "s3:*",
          "cloudfront:*",
          "route53:*",
          "acm:*",
          "lambda:*",
          "apigateway:*",
          "dynamodb:*",
          "logs:*",
          "iam:GetRole",
          "iam:PassRole",
          "iam:CreateRole",
          "iam:DeleteRole",
          "iam:PutRolePolicy",
          "iam:DeleteRolePolicy",
          "iam:AttachRolePolicy",
          "iam:DetachRolePolicy"
        ]
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}

For a production environment, I would use more fine-grained permissions, but this policy works for our demonstration.

Implementing Frontend CI/CD Workflow 🔄

Let's create a GitHub Actions workflow for our frontend repository. Create a file at .github/workflows/deploy.yml:

name: Deploy Frontend

on:
  push:
    branches:
      - main
    paths:
      - 'website/**'
      - '.github/workflows/deploy.yml'

  workflow_dispatch:

permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    name: 'Deploy to S3 and Invalidate CloudFront'
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Deploy to S3
        run: |
          aws s3 sync website/ s3://${{ secrets.S3_BUCKET_NAME }} --delete

      - name: Invalidate CloudFront Cache
        run: |
          aws cloudfront create-invalidation --distribution-id ${{ secrets.CLOUDFRONT_DISTRIBUTION_ID }} --paths "/*"

  test:
    name: 'Run Smoke Tests'
    needs: deploy
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Install Cypress
        uses: cypress-io/github-action@v5
        with:
          install-command: npm install

      - name: Run Cypress Tests
        uses: cypress-io/github-action@v5
        with:
          command: npx cypress run
          config: baseUrl=${{ secrets.WEBSITE_URL }}

This workflow:

Authenticates using OIDC
Syncs website files to the S3 bucket
Invalidates the CloudFront cache
Runs Cypress tests to verify the site is working

Creating a Cypress Test for the Frontend

Let's create a simple Cypress test to verify that our visitor counter is working. First, create a package.json file in the root of your frontend repository:

{
  "name": "cloud-resume-frontend",
  "version": "1.0.0",
  "description": "Frontend for Cloud Resume Challenge",
  "scripts": {
    "test": "cypress open",
    "test:ci": "cypress run"
  },
  "devDependencies": {
    "cypress": "^12.0.0"
  }
}

Then create a Cypress test at tests/cypress/integration/counter.spec.js:

describe('Resume Website Tests', () => {
  beforeEach(() => {
    // Visit the home page before each test
    cy.visit('/');
  });

  it('should load the resume page', () => {
    // Check that we have a title
    cy.get('h1').should('be.visible');

    // Check that key sections exist
    cy.contains('Experience').should('be.visible');
    cy.contains('Education').should('be.visible');
    cy.contains('Skills').should('be.visible');
  });

  it('should load and display the visitor counter', () => {
    // Check that the counter element exists
    cy.get('#count').should('exist');

    // Wait for the counter to update (should not remain at 0)
    cy.get('#count', { timeout: 10000 })
      .should('not.contain', '0')
      .should('not.contain', 'Loading');

    // Verify the counter shows a number
    cy.get('#count').invoke('text').then(parseFloat)
      .should('be.gt', 0);
  });
});

Implementing Backend CI/CD Workflow 🔄

Now, let's create a GitHub Actions workflow for our backend repository. Create a file at .github/workflows/deploy.yml:

name: Deploy Backend

on:
  push:
    branches:
      - main
    paths:
      - 'lambda/**'
      - 'terraform/**'
      - '.github/workflows/deploy.yml'

  pull_request:
    branches:
      - main

  workflow_dispatch:

permissions:
  id-token: write
  contents: read
  pull-requests: write

jobs:
  test:
    name: 'Run Python Tests'
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pytest boto3 moto

      - name: Run Tests
        run: |
          python -m pytest tests/

  validate:
    name: 'Validate Terraform'
    runs-on: ubuntu-latest
    needs: test

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.2.0

      - name: Terraform Format
        working-directory: ./terraform
        run: terraform fmt -check

      - name: Terraform Init
        working-directory: ./terraform
        run: terraform init -backend=false

      - name: Terraform Validate
        working-directory: ./terraform
        run: terraform validate

  plan:
    name: 'Terraform Plan'
    runs-on: ubuntu-latest
    needs: validate
    if: github.event_name == 'pull_request' || github.event_name == 'push' || github.event_name == 'workflow_dispatch'
    environment: dev

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.2.0

      - name: Terraform Init
        working-directory: ./terraform
        run: terraform init -backend-config="bucket=${{ secrets.TF_STATE_BUCKET }}" -backend-config="key=${{ secrets.TF_STATE_KEY }}" -backend-config="region=us-east-1"

      - name: Terraform Plan
        working-directory: ./terraform
        run: terraform plan -var="environment=dev" -var="domain_name=${{ secrets.DOMAIN_NAME }}" -out=tfplan

      - name: Comment Plan on PR
        uses: actions/github-script@v6
        if: github.event_name == 'pull_request'
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            const output = `#### Terraform Format and Style 🖌\`${{ steps.fmt.outcome }}\`
            #### Terraform Plan 📖\`${{ steps.plan.outcome }}\`

            <details><summary>Show Plan</summary>

            \`\`\`terraform
            ${{ steps.plan.outputs.stdout }}
            \`\`\`

            </details>`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

      - name: Upload Plan Artifact
        uses: actions/upload-artifact@v3
        with:
          name: tfplan
          path: ./terraform/tfplan

  apply:
    name: 'Terraform Apply'
    runs-on: ubuntu-latest
    needs: plan
    if: github.event_name == 'push' && github.ref == 'refs/heads/main' || github.event_name == 'workflow_dispatch'
    environment: dev

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.2.0

      - name: Terraform Init
        working-directory: ./terraform
        run: terraform init -backend-config="bucket=${{ secrets.TF_STATE_BUCKET }}" -backend-config="key=${{ secrets.TF_STATE_KEY }}" -backend-config="region=us-east-1"

      - name: Download Plan Artifact
        uses: actions/download-artifact@v3
        with:
          name: tfplan
          path: ./terraform

      - name: Terraform Apply
        working-directory: ./terraform
        run: terraform apply -auto-approve tfplan

  test-api:
    name: 'Test API Deployment'
    runs-on: ubuntu-latest
    needs: apply
    environment: dev

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Fetch API Endpoint
        run: |
          API_ENDPOINT=$(aws cloudformation describe-stacks --stack-name resume-backend-dev --query "Stacks[0].Outputs[?OutputKey=='ApiEndpoint'].OutputValue" --output text)
          echo "API_ENDPOINT=$API_ENDPOINT" >> $GITHUB_ENV

      - name: Test API Response
        run: |
          response=$(curl -s "$API_ENDPOINT/count")
          echo "API Response: $response"

          # Check if the response contains a count field
          echo $response | grep -q '"count":'
          if [ $? -eq 0 ]; then
            echo "API test successful"
          else
            echo "API test failed"
            exit 1
          fi

This workflow is more complex and includes:

Running Python tests for the Lambda function
Validating Terraform syntax and formatting
Planning Terraform changes (with PR comments for review)
Applying Terraform changes to the environment
Testing the deployed API to ensure it's functioning

Implementing Multi-Environment Deployments 🌍

One of the most valuable CI/CD patterns is deploying to multiple environments. Let's modify our backend workflow to support both development and production environments:

# Additional job for production deployment after dev is successful
  promote-to-prod:
    name: 'Promote to Production'
    runs-on: ubuntu-latest
    needs: test-api
    environment: production
    if: github.event_name == 'workflow_dispatch'

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.2.0

      - name: Terraform Init
        working-directory: ./terraform/environments/prod
        run: terraform init -backend-config="bucket=${{ secrets.TF_STATE_BUCKET }}" -backend-config="key=${{ secrets.TF_STATE_KEY_PROD }}" -backend-config="region=us-east-1"

      - name: Terraform Plan
        working-directory: ./terraform/environments/prod
        run: terraform plan -var="environment=prod" -var="domain_name=${{ secrets.DOMAIN_NAME_PROD }}" -out=tfplan

      - name: Terraform Apply
        working-directory: ./terraform/environments/prod
        run: terraform apply -auto-approve tfplan

      - name: Test Production API
        run: |
          API_ENDPOINT=$(aws cloudformation describe-stacks --stack-name resume-backend-prod --query "Stacks[0].Outputs[?OutputKey=='ApiEndpoint'].OutputValue" --output text)
          response=$(curl -s "$API_ENDPOINT/count")
          echo "API Response: $response"

          # Check if the response contains a count field
          echo $response | grep -q '"count":'
          if [ $? -eq 0 ]; then
            echo "Production API test successful"
          else
            echo "Production API test failed"
            exit 1
          fi

Terraform Structure for Multiple Environments

To support multiple environments, I've reorganized my Terraform configuration:

terraform/
├── modules/
│   ├── backend/
│   │   ├── api_gateway.tf
│   │   ├── dynamodb.tf
│   │   ├── lambda.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf

Each environment directory contains its own Terraform configuration that references the shared modules.

Implementing GitHub Security Best Practices 🔒

To enhance the security of our CI/CD pipelines, I've implemented several additional measures:

1. Supply Chain Security with Dependabot

Create a file at .github/dependabot.yml in both repositories:

version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  # For frontend
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

  # For backend
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 10

This configuration automatically updates dependencies and identifies security vulnerabilities.

2. Code Scanning with CodeQL

Create a file at .github/workflows/codeql.yml in the backend repository:

name: "CodeQL"

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '0 0 * * 0'  # Run weekly

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      fail-fast: false
      matrix:
        language: [ 'python', 'javascript' ]

    steps:
    - name: Checkout repository
      uses: actions/checkout@v3

    - name: Initialize CodeQL
      uses: github/codeql-action/init@v2
      with:
        languages: ${{ matrix.language }}

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v2

This workflow scans our code for security vulnerabilities and coding problems.

3. Branch Protection Rules

I've set up branch protection rules for the main branch in both repositories:

Require pull request reviews before merging
Require status checks to pass before merging
Require signed commits
Do not allow bypassing the above settings

Adding Verification Tests to the Workflow 🧪

In addition to unit tests, I've added end-to-end integration tests to verify that the frontend and backend work together correctly:

1. Frontend-Backend Integration Test

Create a file at tests/integration-test.js in the frontend repository:

const axios = require('axios');
const assert = require('assert');

// URLs to test - these should be passed as environment variables
const WEBSITE_URL = process.env.WEBSITE_URL || 'https://resume.yourdomain.com';
const API_URL = process.env.API_URL || 'https://api.yourdomain.com/count';

// Test that the API returns a valid response
async function testAPI() {
  try {
    console.log(`Testing API at ${API_URL}`);
    const response = await axios.get(API_URL);

    // Verify the API response contains a count
    assert(response.status === 200, `API returned status ${response.status}`);
    assert(response.data.count !== undefined, 'API response missing count field');
    assert(typeof response.data.count === 'number', 'Count is not a number');

    console.log(`API test successful. Count: ${response.data.count}`);
    return true;
  } catch (error) {
    console.error('API test failed:', error.message);
    return false;
  }
}

// Test that the website loads and contains necessary elements
async function testWebsite() {
  try {
    console.log(`Testing website at ${WEBSITE_URL}`);
    const response = await axios.get(WEBSITE_URL);

    // Verify the website loads
    assert(response.status === 200, `Website returned status ${response.status}`);

    // Check that the page contains some expected content
    assert(response.data.includes('<html'), 'Response is not HTML');
    assert(response.data.includes('id="count"'), 'Counter element not found');

    console.log('Website test successful');
    return true;
  } catch (error) {
    console.error('Website test failed:', error.message);
    return false;
  }
}

// Run all tests
async function runTests() {
  const apiResult = await testAPI();
  const websiteResult = await testWebsite();

  if (apiResult && websiteResult) {
    console.log('All integration tests passed!');
    process.exit(0);
  } else {
    console.error('Some integration tests failed');
    process.exit(1);
  }
}

// Run the tests
runTests();

Then add a step to the workflow:

- name: Run Integration Tests
  run: |
    npm install axios
    node tests/integration-test.js
  env:
    WEBSITE_URL: ${{ secrets.WEBSITE_URL }}
    API_URL: ${{ secrets.API_URL }}

Implementing Secure GitHub Action Secrets 🔐

For our GitHub Actions workflows, I've set up the following repository secrets:

AWS_ACCOUNT_ID: The AWS account ID used for OIDC authentication
S3_BUCKET_NAME: The name of the S3 bucket for the website
CLOUDFRONT_DISTRIBUTION_ID: The ID of the CloudFront distribution
WEBSITE_URL: The URL of the deployed website
API_URL: The URL of the deployed API
TF_STATE_BUCKET: The bucket for Terraform state
TF_STATE_KEY: The key for Terraform state (dev)
TF_STATE_KEY_PROD: The key for Terraform state (prod)
DOMAIN_NAME: The domain name for the dev environment
DOMAIN_NAME_PROD: The domain name for the prod environment

These secrets are protected by GitHub and only exposed to authorized workflow runs.

Managing Manual Approvals for Production Deployments 🚦

For production deployments, I've added a manual approval step using GitHub Environments:

Go to your repository settings
Navigate to Environments
Create a new environment called "production"
Enable "Required reviewers" and add yourself
Configure "Deployment branches" to limit deployments to specific branches

Now, production deployments will require explicit approval from an authorized reviewer.

Monitoring Deployment Status and Notifications 📊

To stay informed about deployment status, I've added notifications to the workflow:

- name: Notify Deployment Success
  if: success()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
    SLACK_TITLE: Deployment Successful
    SLACK_MESSAGE: "✅ Deployment to ${{ github.workflow }} was successful!"
    SLACK_COLOR: good

- name: Notify Deployment Failure
  if: failure()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
    SLACK_TITLE: Deployment Failed
    SLACK_MESSAGE: "❌ Deployment to ${{ github.workflow }} failed!"
    SLACK_COLOR: danger

This sends notifications to a Slack channel when deployments succeed or fail.

Implementing Additional Security for AWS CloudFront 🔒

To enhance the security of our CloudFront distribution, I've added a custom response headers policy:

resource "aws_cloudfront_response_headers_policy" "security_headers" {
  name = "security-headers-policy"

  security_headers_config {
    content_security_policy {
      content_security_policy = "default-src 'self'; img-src 'self'; script-src 'self'; style-src 'self'; object-src 'none';"
      override = true
    }

    content_type_options {
      override = true
    }

    frame_options {
      frame_option = "DENY"
      override = true
    }

    referrer_policy {
      referrer_policy = "same-origin"
      override = true
    }

    strict_transport_security {
      access_control_max_age_sec = 31536000
      include_subdomains = true
      preload = true
      override = true
    }

    xss_protection {
      mode_block = true
      protection = true
      override = true
    }
  }
}

Then reference this policy in the CloudFront distribution:

resource "aws_cloudfront_distribution" "website" {
  # ... other configuration ...

  default_cache_behavior {
    # ... other configuration ...
    response_headers_policy_id = aws_cloudfront_response_headers_policy.security_headers.id
  }
}

Lessons Learned 💡

Implementing CI/CD for this project taught me several valuable lessons:

Start Simple, Then Iterate: My first workflow was basic - just syncing files to S3. As I gained confidence, I added testing, multiple environments, and security features.
Security Is Non-Negotiable: Using OIDC for authentication instead of long-lived credentials was a game-changer for security. This approach follows AWS best practices and eliminates credential management headaches.
Test Everything: Automated tests at every level (unit, integration, end-to-end) catch issues early. The time invested in writing tests paid off with more reliable deployments.
Environment Separation: Keeping development and production environments separate allowed me to test changes safely before affecting the live site.
Infrastructure as Code Works: Using Terraform to define all infrastructure components made the CI/CD process much more reliable. Everything is tracked, versioned, and repeatable.

My Integration Challenges and Solutions 🧩

During implementation, I encountered several challenges:

CORS Issues: The API and website needed proper CORS configuration to work together. Adding the correct headers in both Lambda and API Gateway fixed this.
Environment Variables: Managing different configurations for dev and prod was tricky. I solved this by using GitHub environment variables and separate Terraform workspaces.
Cache Invalidation Delays: Changes to the website sometimes weren't visible immediately due to CloudFront caching. Adding proper cache invalidation to the workflow fixed this.
State Locking: When multiple workflow runs executed simultaneously, they occasionally conflicted on Terraform state. Using DynamoDB for state locking resolved this issue.

DevOps Mod: Multi-Stage Pipeline with Pull Request Environments 🚀

To extend this challenge further, I implemented a feature that creates temporary preview environments for pull requests:

  create_preview:
    name: 'Create Preview Environment'
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-role
          aws-region: us-east-1

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.2.0

      - name: Generate Unique Environment Name
        run: |
          PR_NUMBER=${{ github.event.pull_request.number }}
          BRANCH_NAME=$(echo ${{ github.head_ref }} | tr -cd '[:alnum:]' | tr '[:upper:]' '[:lower:]')
          ENV_NAME="pr-${PR_NUMBER}-${BRANCH_NAME}"
          echo "ENV_NAME=${ENV_NAME}" >> $GITHUB_ENV

      - name: Terraform Init
        working-directory: ./terraform
        run: terraform init -backend-config="bucket=${{ secrets.TF_STATE_BUCKET }}" -backend-config="key=preview/${{ env.ENV_NAME }}/terraform.tfstate" -backend-config="region=us-east-1"

      - name: Terraform Apply
        working-directory: ./terraform
        run: |
          terraform apply -auto-approve \
            -var="environment=${{ env.ENV_NAME }}" \
            -var="domain_name=pr-${{ github.event.pull_request.number }}.${{ secrets.DOMAIN_NAME }}"

      - name: Comment Preview URL
        uses: actions/github-script@v6
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            const output = `## 🚀 Preview Environment Deployed

            Preview URL: https://pr-${{ github.event.pull_request.number }}.${{ secrets.DOMAIN_NAME }}

            API Endpoint: https://api-pr-${{ github.event.pull_request.number }}.${{ secrets.DOMAIN_NAME }}/count

            This environment will be automatically deleted when the PR is closed.`;

            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

And add a cleanup job to delete the preview environment when the PR is closed:

  cleanup_preview:
    name: 'Cleanup Preview Environment'
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request' && github.event.action == 'closed'

    steps:
      # Similar to create_preview but with terraform destroy

Security Mod: Implementing AWS Secrets Manager for API Keys 🔐

To enhance the security of our API, I added API key authentication using AWS Secrets Manager:

# Create a secret to store the API key
resource "aws_secretsmanager_secret" "api_key" {
  name        = "resume-api-key-${var.environment}"
  description = "API key for the Resume API"
}

# Generate a random API key
resource "random_password" "api_key" {
  length  = 32
  special = false
}

# Store the API key in Secrets Manager
resource "aws_secretsmanager_secret_version" "api_key" {
  secret_id     = aws_secretsmanager_secret.api_key.id
  secret_string = random_password.api_key.result
}

# Add API key to API Gateway
resource "aws_api_gateway_api_key" "visitor_counter" {
  name = "visitor-counter-key-${var.environment}"
}

resource "aws_api_gateway_usage_plan" "visitor_counter" {
  name = "visitor-counter-usage-plan-${var.environment}"

  api_stages {
    api_id = aws_api_gateway_rest_api.visitor_counter.id
    stage  = aws_api_gateway_deployment.visitor_counter.stage_name
  }

  quota_settings {
    limit  = 1000
    period = "DAY"
  }

  throttle_settings {
    burst_limit = 10
    rate_limit  = 5
  }
}

resource "aws_api_gateway_usage_plan_key" "visitor_counter" {
  key_id        = aws_api_gateway_api_key.visitor_counter.id
  key_type      = "API_KEY"
  usage_plan_id = aws_api_gateway_usage_plan.visitor_counter.id
}

# Update the Lambda function to verify the API key
resource "aws_lambda_function" "visitor_counter" {
  # ... existing configuration ...

  environment {
    variables = {
      DYNAMODB_TABLE = aws_dynamodb_table.visitor_counter.name
      ALLOWED_ORIGIN = var.website_domain
      API_KEY_SECRET = aws_secretsmanager_secret.api_key.name
    }
  }
}

Then, modify the Lambda function to retrieve and validate the API key:

import boto3
import json
import os

# Initialize Secrets Manager client
secretsmanager = boto3.client('secretsmanager')

def get_api_key():
    """Retrieve the API key from Secrets Manager"""
    secret_name = os.environ['API_KEY_SECRET']
    response = secretsmanager.get_secret_value(SecretId=secret_name)
    return response['SecretString']

def lambda_handler(event, context):
    # Verify API key
    api_key = event.get('headers', {}).get('x-api-key')
    expected_api_key = get_api_key()

    if api_key != expected_api_key:
        return {
            'statusCode': 403,
            'headers': {
                'Content-Type': 'application/json'
            },
            'body': json.dumps({
                'error': 'Forbidden',
                'message': 'Invalid API key'
            })
        }

    # Rest of the function...

Next Steps ⏭️

With our CI/CD pipelines in place, our Cloud Resume Challenge implementation is complete! In the final post, we'll reflect on the project as a whole, discuss lessons learned, and explore potential future enhancements.

Up Next: [Cloud Resume Challenge with Terraform: Final Thoughts & Lessons Learned] 🔗

Share on Share on

March 19, 2025
in AWS, Terraform, Serverless, Cloud Resume Challenge
14 min read

Cloud Resume Challenge with Terraform: Building the Backend API 🚀

In our previous posts, we set up the frontend infrastructure for our resume website using Terraform. Now it's time to build the backend API that will power our visitor counter.

Backend Architecture Overview 🏗️

Let's take a look at the serverless architecture we'll be implementing:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│             │     │             │     │             │
│ API Gateway ├─────► Lambda      ├─────► DynamoDB    │
│             │     │             │     │             │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       │                   │                   │
       ▼                   ▼                   ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│             │     │             │     │             │
│ CloudWatch  │     │ CloudWatch  │     │ CloudWatch  │
│   Logs      │     │   Logs      │     │   Logs      │
│             │     │             │     │             │
└─────────────┘     └─────────────┘     └─────────────┘

This architecture includes:

API Gateway: Exposes our Lambda function as a REST API
Lambda Function: Contains the Python code to increment and return the visitor count
DynamoDB: Stores the visitor count data
CloudWatch: Monitors and logs activity across all services

My Approach to DynamoDB Design 💾

Before diving into the Terraform code, I want to share my thought process on DynamoDB table design. When I initially approached this challenge, I had to decide between two approaches:

Single-counter approach: A simple table with just one item for the counter
Visitor log approach: A more detailed table that logs each visit with timestamps

I chose the second approach for a few reasons:

It allows for more detailed analytics in the future
It provides a history of visits that can be queried
It demonstrates a more realistic use case for DynamoDB

Here's my table design:

Attribute	Type	Description
visit_id	String	Primary key (UUID)
timestamp	String	ISO8601 timestamp of the visit
visitor_ip	String	Hashed IP address for privacy
user_agent	String	Browser/device information
path	String	Page path visited

This approach gives us flexibility while keeping the solution serverless and cost-effective.

Implementing the Backend API with Terraform 🛠️

Now, let's start implementing our backend infrastructure using Terraform. We'll create modules for each component, starting with DynamoDB.

1. DynamoDB Table for Visitor Counting 📊

Create a file at modules/backend/dynamodb.tf:

resource "aws_dynamodb_table" "visitor_counter" {
  name           = "ResumeVisitorCounter-${var.environment}"
  billing_mode   = "PAY_PER_REQUEST"  # On-demand capacity for cost savings
  hash_key       = "visit_id"

  attribute {
    name = "visit_id"
    type = "S"
  }

  # Add TTL for automatic data cleanup after 90 days
  ttl {
    attribute_name = "expiration_time"
    enabled        = true
  }

  point_in_time_recovery {
    enabled = true  # Enable PITR for recovery options
  }

  # Use server-side encryption
  server_side_encryption {
    enabled = true
  }

  tags = {
    Name        = "Resume Visitor Counter"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

# Create a GSI for timestamp-based queries
resource "aws_dynamodb_table_item" "counter_init" {
  table_name = aws_dynamodb_table.visitor_counter.name
  hash_key   = aws_dynamodb_table.visitor_counter.hash_key

  # Initialize the counter with a value of 0
  item = jsonencode({
    "visit_id": {"S": "total"},
    "count": {"N": "0"}
  })

  # Only create this item on initial deployment
  lifecycle {
    ignore_changes = [item]
  }
}

I've implemented several enhancements:

Point-in-time recovery for data protection
TTL for automatic cleanup of old records
Server-side encryption for security
An initial counter item to ensure we don't have "cold start" issues

2. Lambda Function for the API Logic 🏗️

Now, let's create our Lambda function. First, we'll need the Python code. Create a file at modules/backend/lambda/visitor_counter.py:

import boto3
import json
import os
import uuid
import logging
from datetime import datetime, timedelta
import hashlib
from botocore.exceptions import ClientError

# Set up logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Initialize DynamoDB client
dynamodb = boto3.resource('dynamodb')
table_name = os.environ['DYNAMODB_TABLE']
table = dynamodb.Table(table_name)

def lambda_handler(event, context):
    """
    Lambda handler to process API Gateway requests for visitor counting.
    Increments the visitor counter and returns the updated count.
    """
    logger.info(f"Processing event: {json.dumps(event)}")

    try:
        # Extract request information
        request_context = event.get('requestContext', {})
        http_method = event.get('httpMethod', '')
        path = event.get('path', '')
        headers = event.get('headers', {})
        ip_address = request_context.get('identity', {}).get('sourceIp', 'unknown')
        user_agent = headers.get('User-Agent', 'unknown')

        # Generate a unique visit ID
        visit_id = str(uuid.uuid4())

        # Hash the IP address for privacy
        hashed_ip = hashlib.sha256(ip_address.encode()).hexdigest()

        # Get current timestamp
        timestamp = datetime.utcnow().isoformat()

        # Calculate expiration time (90 days from now)
        expiration_time = int((datetime.utcnow() + timedelta(days=90)).timestamp())

        # Log the visit
        table.put_item(
            Item={
                'visit_id': visit_id,
                'timestamp': timestamp,
                'visitor_ip': hashed_ip,
                'user_agent': user_agent,
                'path': path,
                'expiration_time': expiration_time
            }
        )

        # Update the total counter
        response = table.update_item(
            Key={'visit_id': 'total'},
            UpdateExpression='ADD #count :incr',
            ExpressionAttributeNames={'#count': 'count'},
            ExpressionAttributeValues={':incr': 1},
            ReturnValues='UPDATED_NEW'
        )

        count = int(response['Attributes']['count'])

        # Return the response
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': os.environ['ALLOWED_ORIGIN'],
                'Access-Control-Allow-Methods': 'GET, OPTIONS',
                'Access-Control-Allow-Headers': 'Content-Type'
            },
            'body': json.dumps({
                'count': count,
                'message': 'Visitor count updated successfully'
            })
        }

    except ClientError as e:
        logger.error(f"DynamoDB error: {e}")
        return {
            'statusCode': 500,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': os.environ.get('ALLOWED_ORIGIN', '*')
            },
            'body': json.dumps({
                'error': 'Database error',
                'message': str(e)
            })
        }
    except Exception as e:
        logger.error(f"General error: {e}")
        return {
            'statusCode': 500,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': os.environ.get('ALLOWED_ORIGIN', '*')
            },
            'body': json.dumps({
                'error': 'Server error',
                'message': str(e)
            })
        }

def options_handler(event, context):
    """
    Handler for OPTIONS requests to support CORS
    """
    return {
        'statusCode': 200,
        'headers': {
            'Access-Control-Allow-Origin': os.environ.get('ALLOWED_ORIGIN', '*'),
            'Access-Control-Allow-Methods': 'GET, OPTIONS',
            'Access-Control-Allow-Headers': 'Content-Type'
        },
        'body': ''
    }

Now, let's create the Lambda function using Terraform. Create a file at modules/backend/lambda.tf:

# Archive the Lambda function code
data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/lambda/visitor_counter.py"
  output_path = "${path.module}/lambda/visitor_counter.zip"
}

# Create the Lambda function
resource "aws_lambda_function" "visitor_counter" {
  filename         = data.archive_file.lambda_zip.output_path
  function_name    = "resume-visitor-counter-${var.environment}"
  role             = aws_iam_role.lambda_role.arn
  handler          = "visitor_counter.lambda_handler"
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  runtime          = "python3.9"
  timeout          = 10  # Increased timeout for better error handling
  memory_size      = 128

  environment {
    variables = {
      DYNAMODB_TABLE = aws_dynamodb_table.visitor_counter.name
      ALLOWED_ORIGIN = var.website_domain
    }
  }

  tracing_config {
    mode = "Active"  # Enable X-Ray tracing
  }

  tags = {
    Name        = "Resume Visitor Counter Lambda"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

# Create an IAM role for the Lambda function
resource "aws_iam_role" "lambda_role" {
  name = "resume-visitor-counter-lambda-role-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

# Create a custom policy for the Lambda function with least privilege
resource "aws_iam_policy" "lambda_policy" {
  name        = "resume-visitor-counter-lambda-policy-${var.environment}"
  description = "IAM policy for the visitor counter Lambda function"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "dynamodb:GetItem",
          "dynamodb:PutItem",
          "dynamodb:UpdateItem"
        ]
        Effect   = "Allow"
        Resource = aws_dynamodb_table.visitor_counter.arn
      },
      {
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Effect   = "Allow"
        Resource = "arn:aws:logs:*:*:*"
      },
      {
        Action = [
          "xray:PutTraceSegments",
          "xray:PutTelemetryRecords"
        ]
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}

# Attach the policy to the IAM role
resource "aws_iam_role_policy_attachment" "lambda_policy_attachment" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_policy.arn
}

# Create a CloudWatch log group for the Lambda function
resource "aws_cloudwatch_log_group" "lambda_log_group" {
  name              = "/aws/lambda/${aws_lambda_function.visitor_counter.function_name}"
  retention_in_days = 30

  tags = {
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

# Create a Lambda function for handling OPTIONS requests (CORS)
resource "aws_lambda_function" "options_handler" {
  filename         = data.archive_file.lambda_zip.output_path
  function_name    = "resume-visitor-counter-options-${var.environment}"
  role             = aws_iam_role.lambda_role.arn
  handler          = "visitor_counter.options_handler"
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  runtime          = "python3.9"
  timeout          = 10
  memory_size      = 128

  environment {
    variables = {
      ALLOWED_ORIGIN = var.website_domain
    }
  }

  tags = {
    Name        = "Resume Options Handler Lambda"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

I've implemented several security and operational improvements:

Least privilege IAM policies
X-Ray tracing for performance monitoring
Proper CORS handling with a dedicated OPTIONS handler
CloudWatch log group with retention policy
Privacy-enhancing IP address hashing

3. API Gateway for Exposing the Lambda Function 🔗

Create a file at modules/backend/api_gateway.tf:

# Create the API Gateway REST API
resource "aws_api_gateway_rest_api" "visitor_counter" {
  name        = "resume-visitor-counter-${var.environment}"
  description = "API for the resume visitor counter"

  endpoint_configuration {
    types = ["REGIONAL"]
  }

  tags = {
    Name        = "Resume Visitor Counter API"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

# Create a resource for the API
resource "aws_api_gateway_resource" "visitor_counter" {
  rest_api_id = aws_api_gateway_rest_api.visitor_counter.id
  parent_id   = aws_api_gateway_rest_api.visitor_counter.root_resource_id
  path_part   = "count"
}

# Create a GET method for the API
resource "aws_api_gateway_method" "get" {
  rest_api_id   = aws_api_gateway_rest_api.visitor_counter.id
  resource_id   = aws_api_gateway_resource.visitor_counter.id
  http_method   = "GET"
  authorization_type = "NONE"

  # Add API key requirement if needed
  # api_key_required = true
}

# Create an OPTIONS method for the API (for CORS)
resource "aws_api_gateway_method" "options" {
  rest_api_id   = aws_api_gateway_rest_api.visitor_counter.id
  resource_id   = aws_api_gateway_resource.visitor_counter.id
  http_method   = "OPTIONS"
  authorization_type = "NONE"
}

# Set up the GET method integration with Lambda
resource "aws_api_gateway_integration" "lambda_get" {
  rest_api_id = aws_api_gateway_rest_api.visitor_counter.id
  resource_id = aws_api_gateway_resource.visitor_counter.id
  http_method = aws_api_gateway_method.get.http_method

  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = aws_lambda_function.visitor_counter.invoke_arn
}

# Set up the OPTIONS method integration with Lambda
resource "aws_api_gateway_integration" "lambda_options" {
  rest_api_id = aws_api_gateway_rest_api.visitor_counter.id
  resource_id = aws_api_gateway_resource.visitor_counter.id
  http_method = aws_api_gateway_method.options.http_method

  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = aws_lambda_function.options_handler.invoke_arn
}

# Create a deployment for the API
resource "aws_api_gateway_deployment" "visitor_counter" {
  depends_on = [
    aws_api_gateway_integration.lambda_get,
    aws_api_gateway_integration.lambda_options
  ]

  rest_api_id = aws_api_gateway_rest_api.visitor_counter.id
  stage_name  = var.environment

  lifecycle {
    create_before_destroy = true
  }
}

# Add permission for API Gateway to invoke the Lambda function
resource "aws_lambda_permission" "api_gateway_lambda" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.visitor_counter.function_name
  principal     = "apigateway.amazonaws.com"

  # The /* part allows invocation from any stage, method and resource path
  # within API Gateway
  source_arn = "${aws_api_gateway_rest_api.visitor_counter.execution_arn}/*/${aws_api_gateway_method.get.http_method}${aws_api_gateway_resource.visitor_counter.path}"
}

# Add permission for API Gateway to invoke the OPTIONS Lambda function
resource "aws_lambda_permission" "api_gateway_options_lambda" {
  statement_id  = "AllowAPIGatewayInvokeOptions"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.options_handler.function_name
  principal     = "apigateway.amazonaws.com"

  source_arn = "${aws_api_gateway_rest_api.visitor_counter.execution_arn}/*/${aws_api_gateway_method.options.http_method}${aws_api_gateway_resource.visitor_counter.path}"
}

# Enable CloudWatch logging for API Gateway
resource "aws_api_gateway_account" "main" {
  cloudwatch_role_arn = aws_iam_role.api_gateway_cloudwatch.arn
}

resource "aws_iam_role" "api_gateway_cloudwatch" {
  name = "api-gateway-cloudwatch-role-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "apigateway.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "api_gateway_cloudwatch" {
  role       = aws_iam_role.api_gateway_cloudwatch.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs"
}

# Set up method settings for logging and throttling
resource "aws_api_gateway_method_settings" "settings" {
  rest_api_id = aws_api_gateway_rest_api.visitor_counter.id
  stage_name  = aws_api_gateway_deployment.visitor_counter.stage_name
  method_path = "*/*"

  settings {
    metrics_enabled        = true
    logging_level          = "INFO"
    data_trace_enabled     = true
    throttling_rate_limit  = 100
    throttling_burst_limit = 50
  }
}

# Create a custom domain for the API
resource "aws_api_gateway_domain_name" "api" {
  domain_name              = "api.${var.domain_name}"
  regional_certificate_arn = var.certificate_arn

  endpoint_configuration {
    types = ["REGIONAL"]
  }

  tags = {
    Name        = "Resume API Domain"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

# Create a base path mapping for the custom domain
resource "aws_api_gateway_base_path_mapping" "api" {
  api_id      = aws_api_gateway_rest_api.visitor_counter.id
  stage_name  = aws_api_gateway_deployment.visitor_counter.stage_name
  domain_name = aws_api_gateway_domain_name.api.domain_name
}

# Create a Route 53 record for the API domain
resource "aws_route53_record" "api" {
  name    = aws_api_gateway_domain_name.api.domain_name
  type    = "A"
  zone_id = var.hosted_zone_id

  alias {
    name                   = aws_api_gateway_domain_name.api.regional_domain_name
    zone_id                = aws_api_gateway_domain_name.api.regional_zone_id
    evaluate_target_health = false
  }
}

The API Gateway configuration includes several enhancements:

CloudWatch logging and metrics
Rate limiting and throttling to prevent abuse
Custom domain for a professional API endpoint
Proper Route 53 DNS configuration

4. Variables and Outputs 📝

Create files at modules/backend/variables.tf and modules/backend/outputs.tf:

variables.tf:

variable "environment" {
  description = "Deployment environment (e.g., dev, prod)"
  type        = string
  default     = "dev"
}

variable "website_domain" {
  description = "Domain of the resume website (for CORS)"
  type        = string
}

variable "domain_name" {
  description = "Base domain name for custom API endpoint"
  type        = string
}

variable "hosted_zone_id" {
  description = "Route 53 hosted zone ID"
  type        = string
}

variable "certificate_arn" {
  description = "ARN of the ACM certificate for the API domain"
  type        = string
}

outputs.tf:

output "api_endpoint" {
  description = "Endpoint URL of the API Gateway"
  value       = aws_api_gateway_deployment.visitor_counter.invoke_url
}

output "api_custom_domain" {
  description = "Custom domain for the API"
  value       = aws_api_gateway_domain_name.api.domain_name
}

output "dynamodb_table_name" {
  description = "Name of the DynamoDB table"
  value       = aws_dynamodb_table.visitor_counter.name
}

5. Source Control for Backend Code 📚

An important aspect of the Cloud Resume Challenge is using source control. We'll create a GitHub repository for our backend code. Here's how I organize my repository:

resume-backend/
├── .github/
│   └── workflows/
│       └── deploy.yml  # GitHub Actions workflow (we'll create this in next post)
├── lambda/
│   └── visitor_counter.py
├── terraform/
│   ├── modules/
│   │   ├── backend/
│   │   │   ├── api_gateway.tf
│   │   │   ├── dynamodb.tf
│   │   │   ├── lambda.tf
│   │   │   ├── variables.tf
│   │   │   └── outputs.tf
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── tests/
│   └── test_visitor_counter.py  # Python unit tests
└── README.md

Implementing Python Tests 🧪

For step 11 of the Cloud Resume Challenge, we need to include tests for our Python code. Create a file at tests/test_visitor_counter.py:

import unittest
import json
import os
import sys
from unittest.mock import patch, MagicMock

# Add lambda directory to the path so we can import the function
sys.path.append(os.path.join(os.path.dirname(__file__), '..', 'lambda'))

import visitor_counter

class TestVisitorCounter(unittest.TestCase):
    """Test cases for the visitor counter Lambda function."""

    @patch('visitor_counter.table')
    def test_lambda_handler_success(self, mock_table):
        """Test successful execution of the lambda_handler function."""
        # Mock the DynamoDB responses
        mock_put_response = MagicMock()
        mock_update_response = {
            'Attributes': {
                'count': 42
            }
        }
        mock_table.put_item.return_value = mock_put_response
        mock_table.update_item.return_value = mock_update_response

        # Set required environment variables
        os.environ['DYNAMODB_TABLE'] = 'test-table'
        os.environ['ALLOWED_ORIGIN'] = 'https://example.com'

        # Create a test event
        event = {
            'httpMethod': 'GET',
            'path': '/count',
            'headers': {
                'User-Agent': 'test-agent'
            },
            'requestContext': {
                'identity': {
                    'sourceIp': '127.0.0.1'
                }
            }
        }

        # Call the function
        response = visitor_counter.lambda_handler(event, {})

        # Assert response is correct
        self.assertEqual(response['statusCode'], 200)
        self.assertEqual(response['headers']['Content-Type'], 'application/json')
        self.assertEqual(response['headers']['Access-Control-Allow-Origin'], 'https://example.com')

        # Parse the body and check the count
        body = json.loads(response['body'])
        self.assertEqual(body['count'], 42)
        self.assertEqual(body['message'], 'Visitor count updated successfully')

        # Verify that DynamoDB was called correctly
        mock_table.put_item.assert_called_once()
        mock_table.update_item.assert_called_once_with(
            Key={'visit_id': 'total'},
            UpdateExpression='ADD #count :incr',
            ExpressionAttributeNames={'#count': 'count'},
            ExpressionAttributeValues={':incr': 1},
            ReturnValues='UPDATED_NEW'
        )

    @patch('visitor_counter.table')
    def test_lambda_handler_error(self, mock_table):
        """Test error handling in the lambda_handler function."""
        # Simulate a DynamoDB error
        mock_table.update_item.side_effect = Exception("Test error")

        # Set required environment variables
        os.environ['DYNAMODB_TABLE'] = 'test-table'
        os.environ['ALLOWED_ORIGIN'] = 'https://example.com'

        # Create a test event
        event = {
            'httpMethod': 'GET',
            'path': '/count',
            'headers': {
                'User-Agent': 'test-agent'
            },
            'requestContext': {
                'identity': {
                    'sourceIp': '127.0.0.1'
                }
            }
        }

        # Call the function
        response = visitor_counter.lambda_handler(event, {})

        # Assert response indicates an error
        self.assertEqual(response['statusCode'], 500)
        self.assertEqual(response['headers']['Content-Type'], 'application/json')

        # Parse the body and check the error message
        body = json.loads(response['body'])
        self.assertIn('error', body)
        self.assertIn('message', body)

    def test_options_handler(self):
        """Test the OPTIONS handler for CORS support."""
        # Set required environment variables
        os.environ['ALLOWED_ORIGIN'] = 'https://example.com'

        # Create a test event
        event = {
            'httpMethod': 'OPTIONS',
            'path': '/count',
            'headers': {
                'Origin': 'https://example.com'
            }
        }

        # Call the function
        response = visitor_counter.options_handler(event, {})

        # Assert response is correct for OPTIONS
        self.assertEqual(response['statusCode'], 200)
        self.assertEqual(response['headers']['Access-Control-Allow-Origin'], 'https://example.com')
        self.assertEqual(response['headers']['Access-Control-Allow-Methods'], 'GET, OPTIONS')
        self.assertEqual(response['headers']['Access-Control-Allow-Headers'], 'Content-Type')

if __name__ == '__main__':
    unittest.main()

This test suite covers:

Successful API calls
Error handling
CORS OPTIONS request handling

To run these tests, you would use the following command:

python -m unittest tests/test_visitor_counter.py

Testing the API Manually 🧪

Once you've deployed the API, you can test it manually using tools like cURL or Postman. Here's how to test with cURL:

# Get the current visitor count
curl -X GET https://api.yourdomain.com/count

# Test CORS pre-flight request
curl -X OPTIONS https://api.yourdomain.com/count \
  -H "Origin: https://yourdomain.com" \
  -H "Access-Control-Request-Method: GET" \
  -H "Access-Control-Request-Headers: Content-Type"

For Postman:

Create a new GET request to your API endpoint (https://api.yourdomain.com/count)
Send the request and verify you get a 200 response with a JSON body
Create a new OPTIONS request to test CORS
Add headers: Origin: https://yourdomain.com, Access-Control-Request-Method: GET
Send the request and verify you get a 200 response with the correct CORS headers

Setting Up CloudWatch Monitoring and Alarms ⚠️

Adding monitoring and alerting is a critical part of any production-grade API. Let's add CloudWatch alarms to notify us if something goes wrong:

# Add to modules/backend/monitoring.tf

# Alarm for Lambda errors
resource "aws_cloudwatch_metric_alarm" "lambda_errors" {
  alarm_name          = "lambda-visitor-counter-errors-${var.environment}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "Errors"
  namespace           = "AWS/Lambda"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "This alarm monitors for errors in the visitor counter Lambda function"

  dimensions = {
    FunctionName = aws_lambda_function.visitor_counter.function_name
  }

  # Add SNS topic ARN if you want notifications
  # alarm_actions     = [aws_sns_topic.alerts.arn]
  # ok_actions        = [aws_sns_topic.alerts.arn]
}

# Alarm for API Gateway 5XX errors
resource "aws_cloudwatch_metric_alarm" "api_5xx_errors" {
  alarm_name          = "api-visitor-counter-5xx-errors-${var.environment}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "5XXError"
  namespace           = "AWS/ApiGateway"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_description   = "This alarm monitors for 5XX errors in the visitor counter API"

  dimensions = {
    ApiName = aws_api_gateway_rest_api.visitor_counter.name
    Stage   = aws_api_gateway_deployment.visitor_counter.stage_name
  }

  # Add SNS topic ARN if you want notifications
  # alarm_actions     = [aws_sns_topic.alerts.arn]
  # ok_actions        = [aws_sns_topic.alerts.arn]
}

# Dashboard for monitoring the API
resource "aws_cloudwatch_dashboard" "api_dashboard" {
  dashboard_name = "visitor-counter-dashboard-${var.environment}"

  dashboard_body = jsonencode({
    widgets = [
      {
        type   = "metric"
        x      = 0
        y      = 0
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/ApiGateway", "Count", "ApiName", aws_api_gateway_rest_api.visitor_counter.name, "Stage", aws_api_gateway_deployment.visitor_counter.stage_name]
          ]
          period = 300
          stat   = "Sum"
          region = "us-east-1"
          title  = "API Requests"
        }
      },
      {
        type   = "metric"
        x      = 12
        y      = 0
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/ApiGateway", "4XXError", "ApiName", aws_api_gateway_rest_api.visitor_counter.name, "Stage", aws_api_gateway_deployment.visitor_counter.stage_name],
            ["AWS/ApiGateway", "5XXError", "ApiName", aws_api_gateway_rest_api.visitor_counter.name, "Stage", aws_api_gateway_deployment.visitor_counter.stage_name]
          ]
          period = 300
          stat   = "Sum"
          region = "us-east-1"
          title  = "API Errors"
        }
      },
      {
        type   = "metric"
        x      = 0
        y      = 6
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/Lambda", "Invocations", "FunctionName", aws_lambda_function.visitor_counter.function_name],
            ["AWS/Lambda", "Errors", "FunctionName", aws_lambda_function.visitor_counter.function_name]
          ]
          period = 300
          stat   = "Sum"
          region = "us-east-1"
          title  = "Lambda Invocations and Errors"
        }
      },
      {
        type   = "metric"
        x      = 12
        y      = 6
        width  = 12
        height = 6
        properties = {
          metrics = [
            ["AWS/Lambda", "Duration", "FunctionName", aws_lambda_function.visitor_counter.function_name]
          ]
          period = 300
          stat   = "Average"
          region = "us-east-1"
          title  = "Lambda Duration"
        }
      }
    ]
  })
}

Debugging Common API Issues 🐛

During my implementation, I encountered several challenges:

CORS Issues: The most common problem was with CORS configuration. Make sure your API Gateway and Lambda function both return the proper CORS headers.
IAM Permission Errors: Initially, I gave my Lambda function too many permissions, then too few. The policy shown above represents the minimal set of permissions needed.
DynamoDB Initialization: The counter needs to be initialized with a value. I solved this by adding an item to the table during deployment.
API Gateway Integration: Make sure your Lambda function and API Gateway are correctly integrated. Check for proper resource paths and method settings.

Lessons Learned 💡

DynamoDB Design: My initial design was too simple. Adding more fields like timestamp and user-agent provides valuable analytics data.
Error Handling: Robust error handling is critical for serverless applications. Without proper logging, debugging becomes nearly impossible.
Testing Strategy: Writing tests before implementing the Lambda function (test-driven development) helped me think through edge cases and error scenarios.
Security Considerations: Privacy is important. Hashing IP addresses and implementing proper IAM policies ensures we protect user data.

API Security Considerations 🔒

Security was a primary concern when building this API. Here are the key security measures I implemented:

Least Privilege IAM Policies: The Lambda function has only the minimal permissions needed.
Input Validation: The Lambda function validates and sanitizes all input.
Rate Limiting: API Gateway is configured with throttling to prevent abuse.
HTTPS Only: All API endpoints use HTTPS with modern TLS settings.
CORS Configuration: Only the resume website domain is allowed to make cross-origin requests.
Privacy Protection: IP addresses are hashed to protect visitor privacy.

These measures help protect against common API vulnerabilities like injection attacks, denial of service, and data exposure.

Enhancements and Mods 🚀

Here are some ways to extend this part of the challenge:

Developer Mod: Schemas and Dreamers

Instead of using DynamoDB, consider implementing a relational database approach:

resource "aws_db_subnet_group" "database" {
  name       = "resume-database-subnet-group"
  subnet_ids = var.private_subnet_ids
}

resource "aws_security_group" "database" {
  name        = "resume-database-sg"
  description = "Security group for the resume database"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.lambda.id]
  }
}

resource "aws_db_instance" "postgresql" {
  allocated_storage      = 20
  storage_type           = "gp2"
  engine                 = "postgres"
  engine_version         = "13.4"
  instance_class         = "db.t3.micro"
  db_name                = "resumedb"
  username               = "postgres"
  password               = var.db_password
  parameter_group_name   = "default.postgres13"
  db_subnet_group_name   = aws_db_subnet_group.database.name
  vpc_security_group_ids = [aws_security_group.database.id]
  skip_final_snapshot    = true
  multi_az               = false

  tags = {
    Name        = "Resume Database"
    Environment = var.environment
  }
}

This approach introduces interesting networking challenges and requires modifications to your Lambda function to connect to PostgreSQL.

DevOps Mod: Monitor Lizard

Enhance monitoring with X-Ray traces and custom CloudWatch metrics:

# Add to Lambda function configuration
tracing_config {
  mode = "Active"
}

# Add X-Ray policy
resource "aws_iam_policy" "lambda_xray" {
  name        = "lambda-xray-policy-${var.environment}"
  description = "IAM policy for X-Ray tracing"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "xray:PutTraceSegments",
          "xray:PutTelemetryRecords"
        ]
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_xray" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.lambda_xray.arn
}

Then modify your Lambda function to emit custom metrics:

import boto3
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

# Patch all supported libraries for X-Ray
patch_all()

cloudwatch = boto3.client('cloudwatch')

# Inside lambda_handler
cloudwatch.put_metric_data(
    Namespace='ResumeMetrics',
    MetricData=[
        {
            'MetricName': 'VisitorCount',
            'Value': count,
            'Unit': 'Count'
        }
    ]
)

Security Mod: Check Your Privilege

Implement AWS WAF to protect your API from common web attacks:

resource "aws_wafv2_web_acl" "api" {
  name        = "api-waf-${var.environment}"
  description = "WAF for the resume API"
  scope       = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 0

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSManagedRulesCommonRuleSetMetric"
      sampled_requests_enabled   = true
    }
  }

  rule {
    name     = "RateLimit"
    priority = 1

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 100
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitMetric"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "APIWebACLMetric"
    sampled_requests_enabled   = true
  }
}

resource "aws_wafv2_web_acl_association" "api" {
  resource_arn = aws_api_gateway_stage.visitor_counter.arn
  web_acl_arn  = aws_wafv2_web_acl.api.arn
}

Next Steps ⏭️

With our backend API completed, we're ready to connect it to our frontend in the next post. We'll integrate the JavaScript visitor counter with our API and then automate the deployment process using GitHub Actions.

Stay tuned to see how we bring the full stack together!

Up Next: [Cloud Resume Challenge with Terraform: Automating Deployments with GitHub Actions] 🔗

Share on Share on

March 18, 2025
in AWS, Terraform, Web Hosting, Cloud Resume Challenge
8 min read

Cloud Resume Challenge with Terraform: Deploying the Static Website 🚀

Introduction 🌍

In the previous post, we set up our Terraform environment and outlined the architecture for our Cloud Resume Challenge project. Now it's time to start building! In this post, we'll focus on deploying the first component: the static website that will host our resume.

Frontend Architecture Overview 🏗️

Let's look at the specific architecture we'll implement for our frontend:

┌───────────┐     ┌────────────┐     ┌──────────┐     ┌────────────┐
│           │     │            │     │          │     │            │
│  Route 53 ├─────► CloudFront ├─────►    S3    │     │    ACM     │
│           │     │            │     │          │     │ Certificate│
└───────────┘     └────────────┘     └──────────┘     └────────────┘
      ▲                                    ▲                 │
      │                                    │                 │
      └────────────────────────────────────┴─────────────────┘
           DNS & Certificate Validation

The frontend consists of:

S3 Bucket: Hosts our HTML, CSS, and JavaScript files
CloudFront: Provides CDN capabilities for global distribution and HTTPS
Route 53: Manages our custom domain's DNS
ACM: Provides SSL/TLS certificate for HTTPS

My HTML/CSS Resume Design Approach 🎨

Before diving into Terraform, I spent some time creating my resume in HTML and CSS. Rather than starting from scratch, I decided to use a minimalist approach with a focus on readability.

Here's a snippet of my HTML structure:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Matthew's Cloud Resume</title>
    <link rel="stylesheet" href="styles.css">
</head>
<body>
    <header>
        <h1>Matthew Johnson</h1>
        <p>Cloud Engineer</p>
    </header>

    <section id="contact">
        <!-- Contact information -->
    </section>

    <section id="skills">
        <!-- Skills list -->
    </section>

    <section id="experience">
        <!-- Work experience -->
    </section>

    <section id="education">
        <!-- Education history -->
    </section>

    <section id="certifications">
        <!-- AWS certifications -->
    </section>

    <section id="projects">
        <!-- Project descriptions including this challenge -->
    </section>

    <section id="counter">
        <p>This page has been viewed <span id="count">0</span> times.</p>
    </section>

    <footer>
        <!-- Footer content -->
    </footer>

    <script src="counter.js"></script>
</body>
</html>

For CSS, I went with a responsive design that works well on both desktop and mobile devices:

:root {
    --primary-color: #0066cc;
    --secondary-color: #f4f4f4;
    --text-color: #333;
    --heading-color: #222;
}

body {
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
    line-height: 1.6;
    color: var(--text-color);
    max-width: 800px;
    margin: 0 auto;
    padding: 1rem;
}

header {
    text-align: center;
    margin-bottom: 2rem;
}

h1, h2, h3 {
    color: var(--heading-color);
}

section {
    margin-bottom: 2rem;
}

/* Responsive design */
@media (max-width: 600px) {
    body {
        padding: 0.5rem;
    }
}

These files will be uploaded to our S3 bucket once we've provisioned it with Terraform.

Deploying the Static Website with Terraform 🌐

Now, let's implement the Terraform code for our frontend infrastructure. We'll create modules for each component, starting with S3.

1. S3 Module for Website Hosting 📂

Create a file at modules/frontend/s3.tf:

resource "aws_s3_bucket" "website" {
  bucket = var.website_bucket_name

  tags = {
    Name        = "Resume Website"
    Environment = var.environment
    Project     = "Cloud Resume Challenge"
  }
}

resource "aws_s3_bucket_website_configuration" "website" {
  bucket = aws_s3_bucket.website.id

  index_document {
    suffix = "index.html"
  }

  error_document {
    key = "error.html"
  }
}

resource "aws_s3_bucket_cors_configuration" "website" {
  bucket = aws_s3_bucket.website.id

  cors_rule {
    allowed_headers = ["*"]
    allowed_methods = ["GET", "HEAD"]
    allowed_origins = ["*"]  # In production, restrict to your domain
    expose_headers  = ["ETag"]
    max_age_seconds = 3000
  }
}

resource "aws_s3_bucket_policy" "website" {
  bucket = aws_s3_bucket.website.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "PublicReadGetObject"
        Effect    = "Allow"
        Principal = "*"
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.website.arn}/*"
      }
    ]
  })
}

# Enable versioning for rollback capability
resource "aws_s3_bucket_versioning" "website" {
  bucket = aws_s3_bucket.website.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Add encryption for security
resource "aws_s3_bucket_server_side_encryption_configuration" "website" {
  bucket = aws_s3_bucket.website.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

Notice that I've included CORS configuration, which will be essential later when we integrate with our API. I also added encryption and versioning for better security and disaster recovery.

2. ACM Certificate Module 🔒

Create a file at modules/frontend/acm.tf:

resource "aws_acm_certificate" "website" {
  domain_name       = var.domain_name
  validation_method = "DNS"

  subject_alternative_names = ["www.${var.domain_name}"]

  lifecycle {
    create_before_destroy = true
  }

  tags = {
    Name        = "Resume Website Certificate"
    Environment = var.environment
  }
}

resource "aws_acm_certificate_validation" "website" {
  certificate_arn         = aws_acm_certificate.website.arn
  validation_record_fqdns = [for record in aws_route53_record.certificate_validation : record.fqdn]

  # Wait for DNS propagation
  timeouts {
    create = "30m"
  }
}

3. Route 53 for DNS Configuration 📡

Create a file at modules/frontend/route53.tf:

data "aws_route53_zone" "selected" {
  name         = var.root_domain_name
  private_zone = false
}

resource "aws_route53_record" "website" {
  zone_id = data.aws_route53_zone.selected.zone_id
  name    = var.domain_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.website.domain_name
    zone_id                = aws_cloudfront_distribution.website.hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "www" {
  zone_id = data.aws_route53_zone.selected.zone_id
  name    = "www.${var.domain_name}"
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.website.domain_name
    zone_id                = aws_cloudfront_distribution.website.hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "certificate_validation" {
  for_each = {
    for dvo in aws_acm_certificate.website.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = data.aws_route53_zone.selected.zone_id
}

4. CloudFront Distribution for CDN and HTTPS 🌍

Create a file at modules/frontend/cloudfront.tf:

resource "aws_cloudfront_distribution" "website" {
  origin {
    domain_name = aws_s3_bucket.website.bucket_regional_domain_name
    origin_id   = "S3-${var.website_bucket_name}"

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.website.cloudfront_access_identity_path
    }
  }

  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  aliases             = [var.domain_name, "www.${var.domain_name}"]
  price_class         = "PriceClass_100"

  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "S3-${var.website_bucket_name}"

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
    compress               = true
  }

  # Cache behaviors for specific patterns
  ordered_cache_behavior {
    path_pattern     = "*.js"
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "S3-${var.website_bucket_name}"

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000
    compress               = true
  }

  ordered_cache_behavior {
    path_pattern     = "*.css"
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "S3-${var.website_bucket_name}"

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl                = 0
    default_ttl            = 86400
    max_ttl                = 31536000
    compress               = true
  }

  # Restrict access to North America and Europe
  restrictions {
    geo_restriction {
      restriction_type = "whitelist"
      locations        = ["US", "CA", "GB", "DE", "FR", "ES", "IT"]
    }
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.website.arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  # Add custom error response
  custom_error_response {
    error_code            = 404
    response_code         = 404
    response_page_path    = "/error.html"
    error_caching_min_ttl = 10
  }

  tags = {
    Name        = "Resume Website CloudFront"
    Environment = var.environment
  }

  depends_on = [aws_acm_certificate_validation.website]
}

resource "aws_cloudfront_origin_access_identity" "website" {
  comment = "Access identity for Resume Website CloudFront"
}

# Update S3 bucket policy to allow access from CloudFront
resource "aws_s3_bucket_policy" "cloudfront_access" {
  bucket = aws_s3_bucket.website.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowCloudFrontServicePrincipal"
        Effect    = "Allow"
        Principal = {
          Service = "cloudfront.amazonaws.com"
        }
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.website.arn}/*"
        Condition = {
          StringEquals = {
            "AWS:SourceArn" = aws_cloudfront_distribution.website.arn
          }
        }
      }
    ]
  })
}

I've implemented several security enhancements:

Using origin access control for CloudFront
Restricting the content to specific geographic regions
Setting TLS to more modern protocols
Creating custom error pages
Adding better cache controls for different file types

5. Variables and Outputs 📝

Create files at modules/frontend/variables.tf and modules/frontend/outputs.tf:

variables.tf:

variable "website_bucket_name" {
  description = "Name of the S3 bucket to store website content"
  type        = string
}

variable "domain_name" {
  description = "Domain name for the website"
  type        = string
}

variable "root_domain_name" {
  description = "Root domain name to find Route 53 hosted zone"
  type        = string
}

variable "environment" {
  description = "Deployment environment (e.g., dev, prod)"
  type        = string
  default     = "dev"
}

outputs.tf:

output "website_bucket_name" {
  description = "Name of the S3 bucket hosting the website"
  value       = aws_s3_bucket.website.id
}

output "cloudfront_distribution_id" {
  description = "ID of the CloudFront distribution"
  value       = aws_cloudfront_distribution.website.id
}

output "website_domain" {
  description = "Domain name of the website"
  value       = var.domain_name
}

output "cloudfront_domain_name" {
  description = "CloudFront domain name"
  value       = aws_cloudfront_distribution.website.domain_name
}

6. Main Module Configuration 🔄

Now, let's create the main configuration in main.tf that uses our frontend module:

provider "aws" {
  region = "us-east-1"
}

module "frontend" {
  source = "./modules/frontend"

  website_bucket_name = "my-resume-website-${var.environment}"
  domain_name         = var.domain_name
  root_domain_name    = var.root_domain_name
  environment         = var.environment
}

In variables.tf at the root level:

variable "environment" {
  description = "Deployment environment (e.g., dev, prod)"
  type        = string
  default     = "dev"
}

variable "domain_name" {
  description = "Domain name for the website"
  type        = string
}

variable "root_domain_name" {
  description = "Root domain name to find Route 53 hosted zone"
  type        = string
}

7. Uploading Content to S3 📤

We can use Terraform to upload our website files to S3:

# Add to modules/frontend/s3.tf
resource "aws_s3_object" "html" {
  bucket       = aws_s3_bucket.website.id
  key          = "index.html"
  source       = "${path.module}/../../website/index.html"
  content_type = "text/html"
  etag         = filemd5("${path.module}/../../website/index.html")
}

resource "aws_s3_object" "css" {
  bucket       = aws_s3_bucket.website.id
  key          = "styles.css"
  source       = "${path.module}/../../website/styles.css"
  content_type = "text/css"
  etag         = filemd5("${path.module}/../../website/styles.css")
}

resource "aws_s3_object" "js" {
  bucket       = aws_s3_bucket.website.id
  key          = "counter.js"
  source       = "${path.module}/../../website/counter.js"
  content_type = "application/javascript"
  etag         = filemd5("${path.module}/../../website/counter.js")
}

resource "aws_s3_object" "error_page" {
  bucket       = aws_s3_bucket.website.id
  key          = "error.html"
  source       = "${path.module}/../../website/error.html"
  content_type = "text/html"
  etag         = filemd5("${path.module}/../../website/error.html")
}

Testing Your Deployment 🧪

After applying these Terraform configurations, you'll want to test that everything is working correctly:

# Initialize Terraform
terraform init

# Plan the deployment
terraform plan -var="domain_name=resume.yourdomain.com" -var="root_domain_name=yourdomain.com" -var="environment=dev"

# Apply the changes
terraform apply -var="domain_name=resume.yourdomain.com" -var="root_domain_name=yourdomain.com" -var="environment=dev"

Once deployment is complete, verify:

Your domain resolves to your CloudFront distribution
HTTPS is working correctly
Your resume appears as expected
The website is accessible from different locations

Troubleshooting Common Issues ⚠️

During my implementation, I encountered several challenges:

ACM Certificate Validation Delays: It can take up to 30 minutes for certificate validation to complete. Be patient or use the AWS console to monitor progress.
CloudFront Distribution Propagation: CloudFront changes can take 15-20 minutes to propagate globally. If your site isn't loading correctly, wait and try again.
S3 Bucket Policy Conflicts: If you receive errors about conflicting bucket policies, ensure that you're not applying multiple policies to the same bucket.
CORS Configuration: Without proper CORS headers, your JavaScript won't be able to communicate with your API when we build it in the next post.

CORS Configuration for API Integration 🔄

The Cloud Resume Challenge requires a JavaScript visitor counter that communicates with an API. To prepare for this, I've added CORS configuration to our S3 bucket. When we implement the API in the next post, we'll need to ensure it allows requests from our domain.

Here's the JavaScript snippet we'll use for the counter (to be implemented fully in the next post):

// counter.js
document.addEventListener('DOMContentLoaded', function() {
  // We'll need to fetch from our API
  // Example: https://api.yourdomain.com/visitor-count

  // For now, just a placeholder
  document.getElementById('count').innerText = 'Loading...';

  // This will be implemented fully when we create our API
  // fetch('https://api.yourdomain.com/visitor-count')
  //   .then(response => response.json())
  //   .then(data => {
  //     document.getElementById('count').innerText = data.count;
  //   })
  //   .catch(error => console.error('Error fetching visitor count:', error));
});

Lessons Learned 💡

Domain Verification: I initially struggled with ACM certificate validation. The key lesson was to ensure that the Route 53 hosted zone existed before attempting to create validation records.
Terraform State Management: When modifying existing resources, it's important to understand how Terraform tracks state. A single typo can lead to resource recreation rather than updates.
Performance Optimization: Adding specific cache behaviors for CSS and JS files significantly improved page load times. It's worth taking the time to optimize these settings.
Security Considerations: Setting up proper bucket policies and CloudFront origin access identity is critical to prevent direct access to your S3 bucket while still allowing CloudFront to serve content.

Enhancements and Mods 🚀

Here are some ways to extend this part of the challenge:

Developer Mod: Static Site Generator

Instead of writing plain HTML/CSS, consider using a static site generator like Hugo or Jekyll:

Install Hugo: brew install hugo (on macOS) or equivalent for your OS
Create a new site: hugo new site resume-site
Choose a theme or create your own
Generate the site: hugo -D
Modify your Terraform to upload the public directory contents to S3

This approach gives you templating capabilities, making it easier to update and maintain your resume.

DevOps Mod: Content Invalidation Lambda

Create a Lambda function that automatically invalidates CloudFront cache when new content is uploaded to S3:

resource "aws_lambda_function" "invalidation" {
  filename      = "lambda_function.zip"
  function_name = "cloudfront-invalidation"
  role          = aws_iam_role.lambda_role.arn
  handler       = "index.handler"
  runtime       = "nodejs14.x"

  environment {
    variables = {
      DISTRIBUTION_ID = aws_cloudfront_distribution.website.id
    }
  }
}

resource "aws_s3_bucket_notification" "bucket_notification" {
  bucket = aws_s3_bucket.website.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.invalidation.arn
    events              = ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"]
  }
}

Security Mod: Implement DNSSEC

To prevent DNS spoofing attacks, implement DNSSEC for your domain:

resource "aws_route53_key_signing_key" "example" {
  hosted_zone_id             = data.aws_route53_zone.selected.id
  key_management_service_arn = aws_kms_key.dnssec.arn
  name                       = "example"
}

resource "aws_route53_hosted_zone_dnssec" "example" {
  hosted_zone_id = aws_route53_key_signing_key.example.hosted_zone_id
}

resource "aws_kms_key" "dnssec" {
  customer_master_key_spec = "ECC_NIST_P256"
  deletion_window_in_days  = 7
  key_usage                = "SIGN_VERIFY"
  policy = jsonencode({
    Statement = [
      {
        Action = [
          "kms:DescribeKey",
          "kms:GetPublicKey",
          "kms:Sign",
        ],
        Effect = "Allow",
        Principal = {
          Service = "dnssec-route53.amazonaws.com"
        },
        Resource = "*"
      },
      {
        Action = "kms:*",
        Effect = "Allow",
        Principal = {
          AWS = "*"
        },
        Resource = "*"
      }
    ]
    Version = "2012-10-17"
  })
}

Next Steps ⏭️

With our static website infrastructure in place, we now have a live resume hosted on AWS with a custom domain and HTTPS. In the next post, we'll build the backend API using API Gateway, Lambda, and DynamoDB to track visitor counts.

Stay tuned to see how we implement the serverless backend and connect it to our frontend!

Up Next: [Cloud Resume Challenge with Terraform: Building the Backend API] 🔗

Share on Share on

March 17, 2025
in AWS, Terraform, DevOps, Cloud Resume Challenge
5 min read

Cloud Resume Challenge with Terraform: Introduction & Setup 🚀

Introduction 🌍

The Cloud Resume Challenge is a hands-on project designed to build a real-world cloud application while showcasing your skills in AWS, serverless architecture, and automation. Many implementations of this challenge use AWS SAM or manual setup via the AWS console, but in this series, I will demonstrate how to build the entire infrastructure using Terraform. 💡

My Journey to Terraform 🧰

When I first discovered the Cloud Resume Challenge, I was immediately intrigued by the hands-on approach to learning cloud technologies. Having some experience with traditional IT but wanting to transition to a more cloud-focused role, I saw this challenge as the perfect opportunity to showcase my skills.

I chose Terraform over AWS SAM or CloudFormation because:

Multi-cloud flexibility - While this challenge focuses on AWS, Terraform skills transfer to Azure, GCP, and other providers
Declarative approach - I find the HCL syntax more intuitive than YAML for defining infrastructure
Industry adoption - In my research, I found that Terraform was highly sought after in job postings
Strong community - The extensive module registry and community support made learning easier

This series reflects my personal journey through the challenge, including the obstacles I overcame and the lessons I learned along the way.

Why Terraform? 🛠️

Terraform allows for Infrastructure as Code (IaC), which:

Automates resource provisioning 🤖
Ensures consistency across environments ✅
Improves security by managing configurations centrally 🔒
Enables version control for infrastructure changes 📝

This series assumes basic knowledge of Terraform and will focus on highlighting key Terraform code snippets rather than full configuration files.

Project Overview 🏗️

Let's visualize the architecture we'll be building throughout this series:

Basic Project Diagram

AWS Services Used ☁️

The project consists of the following AWS components:

Frontend: Static website hosted on S3 and delivered via CloudFront.
Backend API: API Gateway, Lambda, and DynamoDB to track visitor counts.
Security: IAM roles, API Gateway security, and AWS Certificate Manager (ACM) for HTTPS 🔐.
Automation: CI/CD with GitHub Actions to deploy infrastructure and update website content ⚡.

Terraform Module Breakdown 🧩

To keep the infrastructure modular and maintainable, we will define Terraform modules for each major component:

S3 Module 📂: Manages the static website hosting.
CloudFront Module 🌍: Ensures fast delivery and HTTPS encryption.
Route 53 Module 📡: Handles DNS configuration.
DynamoDB Module 📊: Stores visitor count data.
Lambda Module 🏗️: Defines the backend API logic.
API Gateway Module 🔗: Exposes the Lambda function via a REST API.
ACM Module 🔒: Provides SSL/TLS certificates for secure communication.

Setting Up Terraform ⚙️

Before deploying any resources, we need to set up Terraform and backend state management to store infrastructure changes securely.

1. Install Terraform & AWS CLI 🖥️

Ensure you have the necessary tools installed:

# Install Terraform
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform

# Install AWS CLI
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

2. Configure AWS Credentials Securely 🔑

Terraform interacts with AWS via credentials. Setting these up securely is crucial to avoid exposing sensitive information.

Setting up AWS Account Structure

Following cloud security best practices, I recommend creating a proper AWS account structure:

Create a management AWS account for your organization
Enable Multi-Factor Authentication (MFA) on the root account
Create separate AWS accounts for development and production environments
Set up AWS IAM Identity Center (formerly SSO) for secure access

If you're just getting started, you can begin with a simpler setup:

# Configure AWS CLI with a dedicated IAM user (not root account)
aws configure

# Test your configuration
aws sts get-caller-identity

Set up IAM permissions for Terraform by ensuring your IAM user has the necessary policies for provisioning resources. Start with a least privilege approach and add permissions as needed.

3. Set Up Remote Backend for Terraform State 🏢

Using a remote backend (such as an S3 bucket) prevents local state loss and enables collaboration.

Project Directory Structure

Here's how I've organized my Terraform project:

cloud-resume-challenge/
├── modules/
│   ├── frontend/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── backend/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── networking/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── environments/
│   ├── dev/
│   │   └── main.tf
│   └── prod/
│       └── main.tf
├── terraform.tf (backend config)
├── variables.tf
├── outputs.tf
└── main.tf

Define the backend in `terraform.tf`

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "cloud-resume/state.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.16"
    }
  }

  required_version = ">= 1.2.0"
}

Create S3 Bucket and DynamoDB Table for Backend

Before you can use an S3 backend, you need to create the bucket and DynamoDB table. I prefer to do this via Terraform as well, using a separate configuration:

# backend-setup/main.tf
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state-bucket"
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

Run these commands to set up your backend:

cd backend-setup
terraform init
terraform apply
cd ..
terraform init  # Initialize with the S3 backend

A Note on Security 🔒

Throughout this series, I'll be emphasizing security best practices. Some key principles to keep in mind:

Never commit AWS credentials to your repository
Use IAM roles with least privilege for all resources
Enable encryption for sensitive data
Implement proper security groups and network ACLs
Regularly rotate credentials and keys

These principles will be applied to our infrastructure as we build it in the upcoming posts.

Lessons Learned 💡

In my initial attempts at setting up the Terraform environment, I encountered several challenges:

State file management: I initially stored state locally, which caused problems when working from different computers. Switching to S3 backend solved this issue.
Module organization: I tried several directory structures before settling on the current one. Organizing by component type rather than AWS service made the most sense for this project.
Version constraints: Not specifying version constraints for providers led to unexpected behavior when Terraform updated. Always specify your provider versions!

Next Steps ⏭️

In the next post, we'll build the static website infrastructure with S3, CloudFront, Route 53, and ACM. We'll create Terraform modules for each component and deploy them together to host our resume.

Developer Mod: Advanced Terraform Techniques 🚀

If you're familiar with Terraform and want to take this challenge further, consider implementing these enhancements:

Terraform Cloud Integration: Connect your repository to Terraform Cloud for enhanced collaboration and run history.
Terratest: Add infrastructure tests using the Terratest framework to validate your configurations.
Custom Terraform Modules: Create reusable modules and publish them to the Terraform Registry.
Terraform Workspaces: Use workspaces to manage multiple environments (dev, staging, prod) within the same Terraform configuration.

Up Next: [Cloud Resume Challenge with Terraform: Deploying the Static Website] 🔗

Share on Share on

March 6, 2025
in Terraform, IaC, GitOps
14 min read

GitOps Project: Automating AWS infrastructure deployment with Terraform and GitHub Actions

Overview

As infrastructure becomes more complex, ensuring automation, security, and compliance is crucial. This blog post walks through setting up a GitOps pipeline that simplifies AWS infrastructure management using Terraform and GitHub Actions. Whether you're an engineer looking to automate deployments or a security-conscious DevOps professional, this guide provides a structured approach to implementing Infrastructure as Code (IaC) best practices.

This blog post documents my experience with the MoreThanCertified GitOps Minicamp, where students set up a fully automated GitOps pipeline to manage AWS infrastructure using Terraform and GitHub Actions. The goal is to implement Infrastructure as Code (IaC) best practices, enforce security policies, and automate infrastructure deployment.

This tutorial is meant as an overview "at-a-glance" summary of the steps I took to implement the pipeline from project start, to resource deployment. The MoreThanCertified GitOps Minicamp goes into a lot more detail and covers all the topics in this post in more depth. I would highly recommend checking the course out if you haven't already.

In this tutorial, I will guide you through:

Setting up GitHub Codespaces for development
Configuring a Terraform backend in AWS with CloudFormation
Deploying AWS resources with Terraform
Running security, cost, and policy checks using GitHub Actions
Implementing a CI/CD pipeline with GitHub Actions

Visualizing the Pipeline Architecture

The diagram below illustrates the end-to-end GitOps workflow:

GitHub Actions automates security scans, cost checks, and Terraform execution.
Terraform manages AWS infrastructure, storing state in an S3 backend.
OpenID Connect (OIDC) ensures secure authentication to AWS.

Basic Project Diagram

Setting Up GitHub Codespaces for Development

Prerequisites

Previous knowledge and use of Github is assumed.
A GitHub repository needs to be created where the Terraform code will be stored.
GitHub Codespaces enabled in your repository settings.

Enable Codespaces in GitHub

Navigate to Settings > Codespaces in your repository.
Ensure that Codespaces is enabled for your repository.
Create a new Codespace and open it.
The Codespace should open the browser based IDE, in the root of the repo you chose.

Configure Development Environment

Install Terraform in Codespaces

Copy, paste, and then run the following in your terminal.

Terraform install script

# Update package list and install dependencies
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common

# Add HashiCorp’s GPG key
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

# Add the Terraform repository
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list

# Update and install Terraform
sudo apt-get update && sudo apt-get install -y terraform

# Verify installation
terraform version

Verify Terraform Installation

terraform -version

Configuring a Terraform Backend in AWS Using CloudFormation

Deploying an OIDC Role for GitHub Actions

To allow GitHub Actions to authenticate securely with AWS, we use an OIDC (OpenID Connect) role. This CloudFormation template sets up the necessary IAM role and OIDC provider.

Add the following code to a new template file cfn > oidc-role.yaml

oidc-role.yaml

Parameters:
  Repo:
    Description: The GitHub organization/repo for which the OIDC provider is set up
    Type: String 
Resources:
  MyOIDCProvider:
    Type: 'AWS::IAM::OIDCProvider'
    Properties:
      Url: 'https://token.actions.githubusercontent.com'
      ClientIdList:
        - sts.amazonaws.com
      ThumbprintList:
        - 6938fd4d98bab03faadb97b34396831e3780aea1
        - 1c58a3a8518e8759bf075b76b750d4f2df264fcd
  gitops2024Role:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Federated: !Sub >-
                arn:aws:iam::${AWS::AccountId}:oidc-provider/token.actions.githubusercontent.com
            Action: 'sts:AssumeRoleWithWebIdentity'
            Condition:
              StringLike:
                'token.actions.githubusercontent.com:sub': !Sub 'repo:${Repo}:*'
              StringEquals:
                'token.actions.githubusercontent.com:aud': sts.amazonaws.com
      ManagedPolicyArns:
        - 'arn:aws:iam::aws:policy/PowerUserAccess'
Outputs:
  RoleName:
    Description: 'The name of the IAM role for GitHub Actions'
    Value:
      Ref: gitops2024Role
    Export:
      Name:
        Fn::Sub: '${AWS::StackName}-RoleName'

Prerequisites

You must have an AWS account set up in advance.
An IAM user with sufficient permissions to create S3 buckets, DynamoDB tables, and IAM roles.
A remote Terraform backend ensures state consistency and allows multiple users to collaborate.

Why CloudFormation?

CloudFormation is used to create the backend infrastructure as a best practice. This ensures automation and easy re-deployment.

Steps (CodeSpaces)

Add the following code to a new template file cfn > backend-resources.yaml

backend-resources.yml

AWSTemplateFormatVersion: '2010-09-09'
Description: CloudFormation template to create S3 and DynamoDB for Terraform Backend

Parameters:
  S3BucketName:
    Type: String
    Description: The name of the S3 bucket to be created for storing Terraform state files.
    Default: gitops-tf-backend

  DynamoDBTableName:
    Type: String
    Description: The name of the DynamoDB table to be created for Terraform state locking.
    Default: GitopsTerraformLocks

Resources:
  TerraformBackendBucket:
    Type: 'AWS::S3::Bucket'
    Properties:
      BucketName: !Ref S3BucketName
      VersioningConfiguration:
        Status: Enabled

  TerraformBackendDynamoDBTable:
    Type: 'AWS::DynamoDB::Table'
    Properties:
      TableName: !Ref DynamoDBTableName
      AttributeDefinitions:
        - AttributeName: LockID
          AttributeType: S
      KeySchema:
        - AttributeName: LockID
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      SSESpecification:
        SSEEnabled: true

Outputs:
  TerraformBackendBucketName:
    Description: "S3 bucket name for the Terraform backend."
    Value: !Ref TerraformBackendBucket
    Export:
      Name: !Sub "${AWS::StackName}-TerraformBackendBucketName"

  TerraformBackendDynamoDBTableName:
    Description: "DynamoDB table name for the Terraform backend."
    Value: !Ref TerraformBackendDynamoDBTable
    Export:
      Name: !Sub "${AWS::StackName}-TerraformBackendDynamoDBTableName"

Steps (AWS Console)

Log in to the AWS Management Console and navigate to the CloudFormation service.
Click on Create Stack and select With new resources (standard).
In the Specify template section, select Upload a template file and upload the backend-resources.yml file.
Click Next, enter a Stack name (e.g., TerraformBackend), and proceed.
Click Next through the stack options, ensuring the correct IAM permissions are set.
Click Create stack and wait for the deployment to complete.

Once the stack is successfully created, go to Resources in the CloudFormation console to confirm that the S3 bucket and DynamoDB table have been provisioned.

Once the stack is deployed, navigate to the AWS CloudFormation console (https://console.aws.amazon.com/cloudformation) and check that:

Navigate to the AWS IAM Console.
The S3 bucket is listed under AWS S3.
The DynamoDB table is available in AWS DynamoDB.
Check that a new IAM Role (gitops2024Role) has been created.
Confirm that an OIDC Provider exists under IAM.

GitHub Actions Workflows: Automating CI/CD for Terraform

Example Mermaid Diagram:

graph TD;
    Developer -->|Push to GitHub| GitHub_Actions;
    GitHub_Actions -->|Run Formatting Checks| TFLint;
    TFLint -->|Run Security Checks| Trivy;
    GitHub_Actions -->|Run Terraform Plan| Terraform_Plan;
    GitHub_Actions -->|Run Cost Analysis| Infracost;
    Terraform_Plan -->|Approval Needed| Manual_Approval;
    Manual_Approval -->|Deploy Resources| Terraform_Apply;
    Terraform_Apply -->|Provision AWS Resources| AWS;
    AWS -->|Destroy if needed| Terraform_Destroy;

📌 GitHub Actions Workflow Execution Order:

1️⃣ TFLint & Trivy Security Scan – Ensures best practices & security compliance
2️⃣ Terraform Plan – Generates a preview of infrastructure changes
3️⃣ OPA Policy Checks & Infracost Analysis – Ensures compliance & cost awareness
4️⃣ Terraform Apply (manual trigger) – Deploys the infrastructure
5️⃣ Terraform Destroy (manual trigger) – Cleans up resources when no longer needed

Trivy Security Scan

This workflow scans the Terraform configuration for security vulnerabilities using Trivy.

TFLint Code Linter

Lints Terraform code for syntax errors and best practices.

Terraform Plan

Generates a Terraform execution plan and performs OPA policy checks.

Infracost Cost Analysis

Estimates the cost impact of Terraform changes.

Terraform Apply

Deploys infrastructure changes to AWS.

Terraform Destroy

Destroys all infrastructure deployed by Terraform.

Each of these workflows is triggered based on specific events and plays a key role in the CI/CD pipeline. Below is a breakdown of each workflow file, its purpose, how it works, and how it is triggered:

Trivy Security Scan

name: Trivy Security Scan

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
      - feature
  workflow_dispatch:

permissions:
  contents: read
  pull-requests: write

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Install Trivy
        uses: aquasecurity/setup-trivy@v0.2.2
      - name: Run Trivy Terraform Security Scan
        run: |
          trivy fs --scanners misconfig --severity HIGH,CRITICAL --format table --exit-code 1 --ignorefile .trivyignore ./terraform | tee trivy-report.txt
      - name: Display Scan Report
        if: always()
        run: cat trivy-report.txt
      - name: Upload Scan Report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: tfsec-report
          path: trivy-report.txt
      - name: Post Scan Results as PR Comment
        if: always()
        uses: mshick/add-pr-comment@v2
        with:
          message: "🚨 **Terraform Security Scan Results** 🚨

``
$(cat trivy-report.txt)
``

📌 **Severity Levels:** `HIGH`, `CRITICAL`
🔍 **Ignored Findings:** Defined in `.trivyignore`
📄 **Full Report:** Check [tfsec-report](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})"
          repo-token: ${{ secrets.GITHUB_TOKEN }}

Purpose: Scans Terraform configuration for security vulnerabilities.

Triggers: Runs on push to main, pull_request to main or feature branches, and can be triggered manually via workflow_dispatch.

Key Steps:

Checks out the repository.
Installs Trivy security scanner.
Runs a scan for HIGH and CRITICAL misconfigurations.
Uploads scan results as an artifact and comments on PRs if issues are found.

TFLint Code Linter

name: Lint
on:
  push:
    branches: [ main ]
  pull_request:

jobs:
  tflint:
    runs-on: ${{ matrix.os }}
    defaults:
        run:
            working-directory: ./terraform

    strategy:
      matrix:
        os: [ubuntu-latest]

    steps:
    - uses: actions/checkout@v4
      name: Checkout source code

    - uses: actions/cache@v4
      name: Cache plugin dir
      with:
        path: ~/.tflint.d/plugins
        key: ${{ matrix.os }}-tflint-${{ hashFiles('.tflint.hcl') }}

    - uses: terraform-linters/setup-tflint@v4
      name: Setup TFLint
      with:
        tflint_version: v0.52.0
    - name: Show version
      run: tflint --version

    - name: Init TFLint
      run: tflint --init
      env:
        # https://github.com/terraform-linters/tflint/blob/master/docs/user-guide/plugins.md#avoiding-rate-limiting
        GITHUB_TOKEN: ${{ github.token }}

    - name: Run TFLint
      run: tflint -f compact

Purpose: Ensures Terraform code follows best practices and is formatted correctly.

Triggers: Runs on push to main and all pull_request events.

Key Steps:

Checks out the repository.
Caches TFLint plugins to optimize runs.
Initializes and runs TFLint to detect formatting and best-practice issues.

Terraform Plan

name: 'Plan'

on:
  push:
    branches: [ 'main' ]
  pull_request:
  workflow_dispatch:

permissions:
  contents: read
  id-token: write

jobs:

  terraform:
    name: 'Terraform'
    runs-on: ubuntu-latest
    environment: production
    defaults:
      run:
        shell: bash
        working-directory: ./terraform
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}


    steps:
    # Checkout the repository to the GitHub Actions runner
    - name: Checkout
      uses: actions/checkout@v4

    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: ${{ secrets.ROLE_TO_ASSUME }}
        aws-region: eu-west-2

    # Install the latest version of Terraform CLI and configure the Terraform CLI configuration file with a Terraform Cloud user API token
    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3

    # Initialize a new or existing Terraform working directory by creating initial files, loading any remote state, downloading modules, etc.
    - name: Terraform Init
      run: terraform init

    # Checks that all Terraform configuration files adhere to a canonical format
    - name: Terraform Format
      run: terraform fmt -check

    # Terraform Plan
    - name: Terraform Plan
      id: plan
      run: |
        terraform plan -out=plan.tfplan
        terraform show -json plan.tfplan > /tmp/plan.json
        cat /tmp/plan.json

    - name: Setup OPA
      uses: open-policy-agent/setup-opa@v2
      with:
        version: latest

    - name: Run OPA Tests
      run: |
        opaout=$(opa eval --data ../policies/instance-policy.rego --input /tmp/plan.json "data.terraform.deny" | jq -r '.result[].expressions[].value[]')
        [ -z "$opaout" ] && exit 0 || echo "$opaout" && gh pr comment ${{ github.event.pull_request.number }} --body "### $opaout" && exit 1

Purpose: Generates and evaluates a Terraform execution plan before applying changes.

Triggers: Runs on push to main, pull_request, and manually via workflow_dispatch.

Key Steps:

Checks out the repository.
Configures AWS credentials using OIDC.
Initializes Terraform and runs terraform plan, storing the output for later review.
Runs OPA (Open Policy Agent) tests against the Terraform plan to enforce security policies.

Infracost Cost Analysis

name: 'Run Infracost'
on:
  pull_request:
    types: [opened, synchronize, closed]
jobs:
  infracost-pull-request-checks:
    name: Infracost Pull Request Checks
    if: github.event_name == 'pull_request' && (github.event.action == 'opened' || github.event.action == 'synchronize')
    runs-on: ubuntu-latest
    environment: production
    permissions:
      contents: read
      pull-requests: write # Required to post comments
    steps:
      - name: Setup Infracost
        uses: infracost/actions/setup@v3
        with:
          api-key: ${{ secrets.INFRACOST_API_KEY }}
      - name: Checkout base branch
        uses: actions/checkout@v4
        with:
          ref: '${{ github.event.pull_request.base.ref }}'
      - name: Generate Infracost cost estimate baseline
        run: |
          infracost breakdown --path=. \
                              --format=json \
                              --out-file=/tmp/infracost-base.json
      - name: Checkout PR branch
        uses: actions/checkout@v4
      - name: Generate Infracost diff
        run: |
          infracost diff --path=. \
                          --format=json \
                          --compare-to=/tmp/infracost-base.json \
                          --out-file=/tmp/infracost.json
      - name: Post Infracost comment
        run: |
            infracost comment github --path=/tmp/infracost.json \
                                     --repo=$GITHUB_REPOSITORY \
                                     --github-token=${{ github.token }} \
                                     --pull-request=${{ github.event.pull_request.number }} \
                                     --behavior=update \
                                     --policy-path ./policies/cost.rego

Purpose: Estimates the cost impact of Terraform changes before they are applied.

Triggers: Runs on pull_request when a PR is opened, updated, or closed.

Key Steps:

Sets up Infracost with an API key.
Runs cost analysis for the current branch and compares it with the base branch.
Posts a cost breakdown as a comment on the PR.

Example Output:

Name	Quantity	Unit Cost	Monthly Cost
aws_instance.grafana	1	$8.32	$8.32
aws_s3_bucket.gitops-tf	1	$0.03	$0.03

Terraform Apply

name: 'Apply'

on: workflow_dispatch

permissions:
  contents: read
  id-token: write

jobs:

  terraform:
    name: 'Terraform'
    runs-on: ubuntu-latest

    defaults:
      run:
        shell: bash

    environment: production

    steps:
    # Checkout the repository to the GitHub Actions runner
    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: ${{ secrets.ROLE_TO_ASSUME }}
        aws-region: eu-west-2
    - name: Checkout
      uses: actions/checkout@v4

    # Install the latest version of Terraform CLI and configure the Terraform CLI configuration file with a Terraform Cloud user API token
    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3

    # Initialize a new or existing Terraform working directory by creating initial files, loading any remote state, downloading modules, etc.
    - name: Terraform Init
      run: terraform -chdir="./terraform" init

    # Checks that all Terraform configuration files adhere to a canonical format
    - name: Terraform Format
      run: terraform -chdir="./terraform" fmt -check

    # Generates an execution plan for Terraform
    - name: Terraform Plan
      run: terraform -chdir="./terraform" plan -input=false

    # Apply the Configuration
    - name: Terraform Apply
      run: terraform -chdir="./terraform" apply -input=false -auto-approve

Purpose: Applies Terraform changes to deploy the infrastructure.

Triggers: Runs only when manually triggered via workflow_dispatch.

Key Steps:

Checks out the repository.
Configures AWS credentials.
Initializes Terraform.
Runs terraform apply to deploy resources.

Terraform Destroy

name: 'Destroy'

on: workflow_dispatch

permissions:
  contents: read
  id-token: write

jobs:

  terraform:
    name: 'Terraform'
    runs-on: ubuntu-latest

    defaults:
      run:
        shell: bash

    environment: production

    steps:
    # Checkout the repository to the GitHub Actions runner
    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: ${{ secrets.ROLE_TO_ASSUME }}
        aws-region: eu-west-2
    - name: Checkout
      uses: actions/checkout@v4

    # Install the latest version of Terraform CLI and configure the Terraform CLI configuration file with a Terraform Cloud user API token
    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3

    # Initialize a new or existing Terraform working directory by creating initial files, loading any remote state, downloading modules, etc.
    - name: Terraform Init
      run: terraform -chdir="./terraform" init

    # Checks that all Terraform configuration files adhere to a canonical format
    - name: Terraform Format
      run: terraform -chdir="./terraform" fmt -check

    # Generates an execution plan for Terraform
    - name: Terraform Plan
      run: terraform -chdir="./terraform" plan -input=false

    # Apply the Configuration
    - name: Terraform Destroy
      run: terraform -chdir="./terraform" destroy -input=false -auto-approve

Purpose: Destroys deployed infrastructure when it's no longer needed.

Triggers: Runs only when manually triggered via workflow_dispatch.

Key Steps:

Checks out the repository.
Configures AWS credentials.
Initializes Terraform.
Runs terraform destroy to remove all resources.

Push Changes to a Feature Branch (in GitHub Codespaces)

Run the following commands inside your cloned GitHub repository:

# Create and switch to a new feature branch
git checkout -b feature-branch 

# Stage all modified files
git add . 

# Commit the changes with a meaningful message
git commit -m "Testing CI/CD"

# Push the feature branch to GitHub
git push origin feature-branch

Deploying AWS Resources with Terraform

Terraform Configuration Breakdown

The following Terraform files define the infrastructure to be deployed. Below, we explain each file and its role in the deployment process.

versions.tf

Defines the required Terraform version and provider constraints to ensure compatibility.

terraform {
  required_version = ">= 1.3.0"

  backend "s3" {
    bucket         = "gitops-tf-backend-mpcloudlab"
    key            = "terraform.tfstate"
    region         = "eu-west-2"
    dynamodb_table = "GitopsTerraformLocks"
  }
}

providers.tf

Configures the AWS provider and region settings.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.69.0"
    }
    http = {
      source  = "hashicorp/http"
      version = "3.4.5"
    }
  }
}

provider "aws" {
  region = var.region
}

variables.tf

Declares input variables used throughout the Terraform configuration.

variable "region" {
  description = "AWS region where resources will be deployed"
  type        = string
  default     = "eu-west-2"
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

terraform.tfvars

Defines default values for input variables.

region        = "eu-west-2"
instance_type = "t3.micro"

main.tf

Defines the main infrastructure resources to be deployed.

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

resource "aws_vpc" "gitops_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "gitops-vpc"
  }
}

resource "aws_internet_gateway" "gitops_igw" {
  vpc_id = aws_vpc.gitops_vpc.id

  tags = {
    Name = "gitops-igw"
  }
}

resource "aws_route_table" "gitops_rt" {
  vpc_id = aws_vpc.gitops_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gitops_igw.id
  }

  tags = {
    Name = "gitops-rt"
  }
}

resource "aws_subnet" "gitops_subnet" {
  vpc_id                  = aws_vpc.gitops_vpc.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true

  tags = {
    Name = "gitops-subnet"
  }
}

resource "aws_route_table_association" "gitops_rta" {
  subnet_id      = aws_subnet.gitops_subnet.id
  route_table_id = aws_route_table.gitops_rt.id
}

resource "aws_security_group" "gitops_sg" {
  name        = "gitops_sg"
  description = "Allow port 3000"
  vpc_id      = aws_vpc.gitops_vpc.id

  ingress {
    from_port   = 3000
    to_port     = 3000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "gitops-sg"
  }
}

resource "aws_instance" "grafana_server" {
  ami                    = data.aws_ami.ubuntu.id
  instance_type          = var.instance_type
  subnet_id              = aws_subnet.gitops_subnet.id
  vpc_security_group_ids = [aws_security_group.gitops_sg.id]
  user_data              = file("userdata.tftpl")

  root_block_device {
    encrypted = true
  }

  metadata_options {
    http_tokens = "required"
  }

  tags = {
    Name = "grafana-server"
  }
}

check "grafana_health_check" {
  data "http" "test" {
    url = "http://${aws_instance.grafana_server.public_ip}:3000"
    retry {
      attempts = 10
    }
  }
  assert {
    condition     = data.http.test.status_code == 200
    error_message = "Grafana is inaccessible on port 3000."
  }
}

userdata.tftpl

Contains startup scripts that run when the EC2 instance is launched.

#!/bin/bash
sudo apt-get install -y apt-transport-https software-properties-common wget &&
sudo mkdir -p /etc/apt/keyrings/ &&
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null &&
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list &&
sudo apt-get update &&
sudo apt-get install -y grafana &&
sudo systemctl start grafana-server &&
sudo systemctl enable grafana-server

outputs.tf

Defines output values to retrieve important details after deployment.

output "grafana_ip" {
  value = "http://${aws_instance.grafana_server.public_ip}:3000"
}

Steps to Deploy Terraform Resources

Initialize Terraform

terraform init

Validate the configuration

terraform validate

Generate an execution plan

terraform plan

Apply the configuration

terraform apply -auto-approve

Retrieve outputs

terraform output

Verifying the Deployment

Check the AWS Console to confirm that resources have been created.
Use SSH or a browser to access the deployed EC2 instance.

Destroying the Infrastructure

When no longer needed, remove all resources:

terraform destroy -auto-approve

or use the Destroy GitHub Actions workflow

Running and Testing the Pipeline

How to Trigger Workflows

Automatic Triggers

Pushes and pull requests to main trigger the Plan, Security Scan, and Linter.
Pull requests trigger Infracost Cost Analysis.

Manual Triggers

terraform apply and terraform destroy require workflow_dispatch (manual execution via GitHub UI).
Manually trigger Apply workflow to deploy
Manually trigger Destroy workflow to clean up resources

Step-by-Step Execution

Push Changes to a New Feature Branch

git checkout -b feature-branch
git add .
git commit -m "Testing CI/CD"
git push origin feature-branch

Open a Pull Request

This will trigger security scans, cost analysis, and Terraform plan.
Review GitHub Actions Results
Check logs for security, cost, and linting errors.
Merge to branch when approved

Project Folder Structure

├── .github                  # GitHub Actions workflows
│   └── workflows
│       ├── apply.yml       # GitHub Actions workflow for applying Terraform changes
│       ├── destroy.yml     # GitHub Actions workflow for destroying Terraform resources
│       ├── infracost.yml   # Workflow for running Infracost to estimate Terraform costs
│       ├── plan.yml        # Workflow for running Terraform Plan
│       ├── tflint.yml      # Workflow for running TFLint to check Terraform syntax
│       └── tfsec.yml       # Workflow for running tfsec to check Terraform security
├── .gitignore              # Specifies files and directories to ignore in Git
├── .trivyignore            # Ignore file for Trivy security scanning
├── README.md               # Project documentation and setup instructions
├── cfn                     # CloudFormation templates for infrastructure
│   ├── backend-resources.yaml  # Defines S3 and DynamoDB resources for Terraform backend
│   └── oidc-role.yaml      # CloudFormation template to create OIDC role for GitHub Actions
├── install terraform.txt   # Instructions for installing Terraform
├── policies                # Policies for security and cost analysis
│   ├── cost.rego           # OPA policy for Infracost cost enforcement
│   ├── instance-policy.rego # OPA policy for Terraform instance compliance
│   └── plan.json           # JSON representation of Terraform Plan for policy validation
└── terraform               # Terraform configuration files
    ├── main.tf             # Defines core infrastructure (VPC, EC2, Security Groups, etc.)
    ├── outputs.tf          # Specifies Terraform output values
    ├── providers.tf        # Configures Terraform providers (AWS, HTTP, etc.)
    ├── terraform.tfvars    # Defines Terraform input variable values
    ├── userdata.tftpl      # Cloud-init script for configuring EC2 instances
    ├── variables.tf        # Declares Terraform input variables
    └── versions.tf         # Specifies required Terraform and provider versions

Conclusion

By implementing this GitOps pipeline, we achieve:

🚀 Automation – Eliminates manual deployments 🔒 Security – Enforces compliance using OPA & Trivy 💰 Cost Awareness – Monitors infrastructure costs via Infracost

This approach provides scalability, consistency, and security for managing AWS infrastructure.

If you have questions, feedback, or suggestions, feel free to reach out!

Share on Share on