AWS is the most popular of the major cloud providers, and getting started with its free tier is quite easy. My own experience is mainly with Azure, so here is the setup I have come up with to streamline using the AWS free tier to test and run services.
There are multiple ways of running code in AWS. The most cost effective is probably Lambda functions, if your runtime and memory requirements fit the function-as-a-service paradigm. In my case I wanted something more generic, where I could install and run anything, since I am planning on creating a mini data platform and will be testing multiple services. For that, EC2 seems the best fit: a general-purpose VM in AWS that you can use for anything. The trick is in making it easy to stand up and tear down the VM and its services, so we don't forget about them and get billed.
The goal here is to have an EC2 instance running whatever you want, with the setup as seamless as possible. In my case, for example, I am using it to run a series of pipelines every morning.
I am going to assume some basic knowledge of Terraform and at least an aws-cli setup; there are tons of guides for that. I would also recommend not jumping into this guide directly, or at least keeping in mind that you should learn more about AWS first (create budget alerts if you haven't!).
Infrastructure
As much as possible, we are going to leverage the AWS free tier to keep costs at a minimum. These are the free tier allowances used (as of December 2024):
- 1 year of 750 hours per month of EC2
- Parameter Store standard tier
- EventBridge Scheduler, up to 14,000,000 invocations per month
- Lambda, up to 1,000,000 requests per month
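As a quick back-of-the-envelope check, 750 hours per month is just enough for one instance running around the clock:

```python
# Does 750 free-tier hours cover one instance running 24/7?
hours_in_longest_month = 24 * 31  # 744
free_tier_hours = 750

print(free_tier_hours >= hours_in_longest_month)  # True: one always-on instance fits
print(free_tier_hours - hours_in_longest_month)   # 6 hours of slack, not enough for a second instance
```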
This may look a bit complicated for just running a VM, and you would be right: as mentioned, I am using this to schedule a workflow that runs daily, so if you don't need that you can drop those parts. This setup has the following advantages:
- IaC: easy to set up new services, just change the configuration and apply again
- Scheduled start of the service
- Pessimistic stop of the instance using a scheduled Lambda (in case the instance hangs)
- Configuration in Parameter Store, decoupled from the infrastructure
- Flexibility in the services or code to run (for example, I am downloading a repo as part of the instance setup to run my pipelines)
EC2 initialisation
Now to the main part: creating the EC2 instance. The Terraform code is pretty straightforward. It uses cloud-init to install dependencies and set up the instance, which lets us use a YAML configuration file to define which modules will run, a script, and even a shutdown delay. There are more options to customise EC2 instance initialisation; you can find all the information in the cloud-init documentation.
There are a few parts to customise the behaviour. First, to change what the instance does on startup, modify cloud-init.yml.
The main parts to change here are:
- packages: to set up the dependencies
- write_files: the script written here is the main code that will prepare and run the service
- power_state: to shut down after everything has run, or to set a delay so the instance automatically turns off after some time
The script has a few extra steps aside from starting the service that you can use as an example:
- It sets up a swap file. The reason is that free tier instances are very limited in memory and will hang if they run out. Adding swap is a cheap way to make sure the instance does not hang, at the cost of some performance
- It reads a configuration file from Parameter Store, which decouples the instance from its configuration
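For illustration, here is a minimal sketch of how a pipeline could consume that downloaded .env file (parse_dotenv is a hypothetical helper; in practice you might use a library like python-dotenv instead):

```python
def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and comments.
    (Hypothetical helper, only for illustration.)"""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

sample = "# pipeline settings\nDB_PATH=/run/workdir/data.db\nLOG_LEVEL=info\n"
print(parse_dotenv(sample))
# {'DB_PATH': '/run/workdir/data.db', 'LOG_LEVEL': 'info'}
```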
The other part to adjust is in the Terraform. This example only creates the policy resource.aws_iam_role_policy.policy_ssm, but if you want the EC2 instance to have access to more services (for example an S3 bucket), you will need to create more policies to grant those permissions.
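For example, a read-only S3 policy could look something like this (a sketch; my-pipeline-bucket is a placeholder name):

```hcl
resource "aws_iam_role_policy" "policy_s3" {
  name   = "policy_s3"
  role   = aws_iam_role.role.id
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-pipeline-bucket",
        "arn:aws:s3:::my-pipeline-bucket/*"
      ]
    }
  ]
}
EOF
}
```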
Now, given this, here are the 2 main files:
ec2.tf:
//
// IAM role
//
resource "aws_iam_role" "role" {
  name               = "role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "profile" {
  name = "profile"
  role = aws_iam_role.role.name
}

resource "aws_iam_role_policy" "policy_ssm" {
  name = "policy_ssm"
  role = aws_iam_role.role.id
  // Note: ssm:* is broad; consider scoping it down to the parameters you need
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ssm:*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
EOF
}

//
// EC2 Instance
//
data "cloudinit_config" "server_config" {
  gzip          = true
  base64_encode = true
  part {
    content_type = "text/cloud-config"
    content      = file("${path.module}/cloud-init.yml")
  }
}

resource "aws_instance" "ec2_instance" {
  ami                         = "ami-00385a401487aefa4"
  instance_type               = "t2.micro"
  key_name                    = "health_data_load"
  iam_instance_profile        = aws_iam_instance_profile.profile.name
  vpc_security_group_ids      = [aws_security_group.ec2_instance_security_group.id]
  user_data                   = data.cloudinit_config.server_config.rendered
  user_data_replace_on_change = true
  tags = {
    Name = "ec2_instance"
  }
}

//
// Security group
//
resource "aws_security_group" "ec2_instance_security_group" {
  name        = "pbzhdl-sg"
  description = "allow ssh port"
  vpc_id      = "vpc-0eb182d38a23b1b64"
  ingress {
    description = "allow ssh"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "pbzhdl-sg"
  }
}

//
// Parameter store
//
resource "aws_ssm_parameter" "instance_id" {
  name  = "instance_id"
  type  = "String"
  value = aws_instance.ec2_instance.id
}
cloud-init.yml:
#cloud-config
# The modules that run in the 'final' stage
cloud_final_modules:
  - package-update-upgrade-install
  - write-files-deferred
  - puppet
  - chef
  - mcollective
  - salt-minion
  - reset_rmc
  - refresh_rmc_and_interface
  - rightscale_userdata
  - scripts-vendor
  - scripts-per-once
  - scripts-per-boot
  - scripts-per-instance
  - scripts-user
  - ssh-authkey-fingerprints
  - keys-to-console
  - install-hotplug
  - phone-home
  - final-message
  - [power_state_change, always]
write_files:
  - content: |
      #!/bin/bash
      # Enable swapfile
      dd if=/dev/zero of=/swapfile bs=128M count=16 # 128M * 16 = 2 GB
      chmod 600 /swapfile
      mkswap /swapfile
      swapon /swapfile
      # Set up pipeline environment
      mkdir -p /run/workdir
      # Parameters
      aws ssm get-parameter --name workflow_dotenv --with-decryption --query "Parameter.Value" --output text > /run/workdir/.env
      cd /run/workdir
      # Any other commands to run the service
    path: /var/lib/cloud/scripts/per-boot/myScript.sh
    permissions: "0755"
packages:
  - python3.11
  - python3.11-pip
  - git
power_state:
  delay: "now" # Or a number of minutes, e.g. 30 for 30 minutes
  mode: poweroff
  message: Bye Bye
  timeout: 120
  condition: /bin/true
With these files, after running terraform apply, you should have an instance running that will shut down after running the service. With only this you can start playing, but in the next step we will create Lambdas to start and stop it.
The stop Lambda is important here: since the instances are so small, it is very easy to run out of memory. If that happens the instance will hang and might never stop itself, so the Lambda makes sure that, in the worst case, it is stopped externally.
Lambda for management
The Lambda functions for starting and stopping are standard from the AWS documentation, but they use Parameter Store to pick up the instance id, which is automatically stored there during the Terraform deployment.
You can adjust the schedule trigger in the Terraform. Right now it is set up to start the instance at 10:00 and force stop it at 11:00 if it has not shut down by itself. Here is the code:
stop_lambda.py:
import boto3
import os

region = os.environ["AWS_REGION"]
ssm = boto3.client("ssm", region_name=region)
ec2 = boto3.client("ec2", region_name=region)

def lambda_handler(event, context):
    get_response = ssm.get_parameter(Name="instance_id")
    instances = [get_response["Parameter"]["Value"]]
    ec2.stop_instances(InstanceIds=instances)
    print("stopped your instances: " + str(instances))
start_lambda.py:
import boto3
import os

region = os.environ["AWS_REGION"]
ssm = boto3.client("ssm", region_name=region)
ec2 = boto3.client("ec2", region_name=region)

def lambda_handler(event, context):
    get_response = ssm.get_parameter(Name="instance_id")
    instances = [get_response["Parameter"]["Value"]]
    ec2.start_instances(InstanceIds=instances)
    print("started your instances: " + str(instances))
start_lambda.tf:
//
// Schedule event
//
resource "aws_cloudwatch_event_rule" "start_lambda" {
  name                = "run_start_lambda"
  description         = "Schedule lambda function"
  schedule_expression = "cron(00 10 * * ? *)"
  tags = {
    "app" = "healthdata"
  }
}

resource "aws_cloudwatch_event_target" "start_lambda_target" {
  target_id = "start_lambda_target"
  rule      = aws_cloudwatch_event_rule.start_lambda.name
  arn       = aws_lambda_function.start_lambda.arn
  input     = "{\"ssm_parameter\": \"instance_id\"}"
}

resource "aws_lambda_permission" "allow_cloudwatch" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.start_lambda.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.start_lambda.arn
}

//
// Lambda
//
data "archive_file" "start_lambda" {
  type        = "zip"
  source_file = "${path.module}/start_lambda.py"
  output_path = "start_lambda_function_payload.zip"
}

resource "aws_lambda_function" "start_lambda" {
  filename         = data.archive_file.start_lambda.output_path
  function_name    = "start_lambda"
  role             = aws_iam_role.iam_for_lambda.arn
  handler          = "start_lambda.lambda_handler"
  source_code_hash = data.archive_file.start_lambda.output_base64sha256
  runtime          = "python3.10"
  environment {
    variables = {
      instance_parameter = "instance_id"
    }
  }
  tags = {
    "app" = "healthdata"
  }
}

//
// IAM role
//
data "aws_iam_policy_document" "assume_role" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

resource "aws_iam_role" "iam_for_lambda" {
  name               = "iam_for_lambda"
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}

resource "aws_iam_role_policy" "iam_for_lambda_policy_ssm" {
  name   = "iam_for_lambda_policy_ssm"
  role   = aws_iam_role.iam_for_lambda.id
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ssm:*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_for_lambda_policy_ec2" {
  name   = "iam_for_lambda_policy_ec2"
  role   = aws_iam_role.iam_for_lambda.id
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:Start*",
        "ec2:Stop*"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
To keep it simple I have only added the start Lambda Terraform here, but the stop Lambda is exactly the same: just change start to stop in the code and adjust the schedule (or, if you want, add parameters and turn it into a module).
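If you go the parameterised route, one option is a single handler that takes the action from the event. This is only a sketch (make_handler is a hypothetical helper, and the clients are injectable so the logic can be tested without touching AWS):

```python
def make_handler(ssm=None, ec2=None):
    """Build a Lambda handler with injectable clients (handy for testing).
    The event decides the action, e.g. {"action": "start"} or {"action": "stop"}."""
    if ssm is None or ec2 is None:
        import boto3  # imported lazily so tests with fake clients don't need it
        ssm = ssm or boto3.client("ssm")
        ec2 = ec2 or boto3.client("ec2")

    def handler(event, context):
        action = event["action"]
        if action not in ("start", "stop"):
            raise ValueError("unknown action: " + action)
        # The instance id is stored in Parameter Store by the Terraform deploy
        instance_id = ssm.get_parameter(Name="instance_id")["Parameter"]["Value"]
        if action == "start":
            ec2.start_instances(InstanceIds=[instance_id])
        else:
            ec2.stop_instances(InstanceIds=[instance_id])
        return {"action": action, "instances": [instance_id]}

    return handler
```

The EventBridge target's input JSON can then carry the action, so one Lambda serves both schedules.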
To wrap up
This is a very technical post with a lot of code. In the future I will add a repository for these examples. I am using this code to run my own mini orchestration daily for my personal data platform and, after a few initial hiccups, it is running smoothly now. Hopefully it will be useful for anyone wanting to get started with AWS. I am running an orchestration, but you can use this to try any kind of service you can install and run on an EC2 instance.
Finally, I am not an expert in AWS, so there might be some best practices or other AWS details missing; please comment about any gap you find! This is a work in progress for me :)