Complete Guide to Serverless Image Processing on AWS Using Lambda, S3, and Python

Welcome to today’s deep‑dive tutorial! We’ll walk through building a fully serverless image‑processing pipeline using AWS Lambda, S3, and Python. By the end of this guide, you’ll have a production‑ready solution that can resize, watermark, and store images on the fly. Grab a coffee, fire up your IDE, and let’s get our hands dirty.

Why Go Serverless for Image Processing?

Traditional image pipelines often rely on dedicated VMs or containers, which means you pay for idle compute time. Serverless eliminates that waste by scaling automatically—only the milliseconds you actually use are billed. Moreover, Lambda’s integration with S3 makes event‑driven workflows trivial to implement.

Beyond cost savings, serverless architectures improve reliability. AWS handles the underlying infrastructure, so you get built‑in redundancy across multiple Availability Zones. This translates to higher uptime for your users, especially when you’re serving media on a global platform.

Prerequisites

Before we dive into code, make sure you have the following ready:

  • A valid AWS account with permissions to create S3 buckets, Lambda functions, and IAM roles.
  • Python 3.11 installed locally (the runtime we’ll target for Lambda).
  • The AWS CLI configured with aws configure.
  • Docker installed if you plan to test the Lambda function locally.

If any of these items are missing, pause the tutorial and set them up now. Skipping this step often leads to confusing “permission denied” errors later on.
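
A quick way to confirm the CLI is wired up correctly is to ask STS who you are; if this returns your account ID, your credentials are in place.

aws sts get-caller-identity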

High‑Level Architecture

The pipeline consists of three core components:

  1. S3 Bucket (Source) – Users upload raw images here.
  2. Lambda Function – Triggered by S3 events, it processes the image (resize, watermark, format conversion).
  3. S3 Bucket (Destination) – Stores the processed assets, optionally with a different storage class for cost optimization.

All communication is event‑driven, which means there’s no need for polling or cron jobs: an upload to the source bucket emits an ObjectCreated event, the event invokes the Lambda function, and the function writes its output to the destination bucket.

Pro tip: Keep source and destination buckets separate. This simplifies lifecycle policies, prevents accidental overwrites, and, most importantly, avoids a recursive loop in which the processed images themselves re‑trigger the Lambda function.

Creating the S3 Buckets

Open your terminal and run the following commands to create two buckets. Replace my‑raw‑images and my‑processed‑images with globally unique names of your own. (The commands below target us-east-1; for any other region you must also pass --create-bucket-configuration LocationConstraint=&lt;region&gt;.)

aws s3api create-bucket --bucket my-raw-images --region us-east-1
aws s3api create-bucket --bucket my-processed-images --region us-east-1

After creation, enable versioning on the source bucket. Versioning protects against accidental deletions and gives you a rollback path.

aws s3api put-bucket-versioning \
  --bucket my-raw-images \
  --versioning-configuration Status=Enabled
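
You can verify the change with get-bucket-versioning; it should report Status: Enabled.

aws s3api get-bucket-versioning --bucket my-raw-images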

Setting Up the IAM Role for Lambda

The Lambda function needs permissions to read from the source bucket, write to the destination bucket, and log to CloudWatch. Create a role with the following policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-raw-images/*",
        "arn:aws:s3:::my-processed-images/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}

Save this JSON as lambda-policy.json. The create-role command below also references a trust-policy.json, shown next, which is what lets the Lambda service assume the role.
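
This is the standard Lambda trust policy; save it as trust-policy.json alongside the permissions policy. With both files in place, create the role and attach the policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}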

aws iam create-role \
  --role-name ImageProcessorRole \
  --assume-role-policy-document file://trust-policy.json

aws iam put-role-policy \
  --role-name ImageProcessorRole \
  --policy-name ImageProcessorPolicy \
  --policy-document file://lambda-policy.json

Writing the Lambda Function – Part 1 (Resize & Watermark)

Now for the fun part: the Python code that actually manipulates images. We’ll use the Pillow library because it’s lightweight and well‑supported in Lambda layers.

import json
import boto3
from io import BytesIO
from urllib.parse import unquote_plus
from PIL import Image, ImageDraw, ImageFont

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Extract bucket and key from the S3 event (object keys arrive URL-encoded)
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    source_key = unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Download the image into memory
    response = s3.get_object(Bucket=source_bucket, Key=source_key)
    raw_image = response['Body'].read()
    img = Image.open(BytesIO(raw_image))
    img = img.convert('RGB')  # JPEG output has no alpha channel, so normalize the mode first

    # Resize while maintaining aspect ratio
    max_size = (800, 800)
    img.thumbnail(max_size, Image.Resampling.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10

    # Add a semi‑transparent watermark
    watermark_text = "© Codeyaan"
    draw = ImageDraw.Draw(img, 'RGBA')  # 'RGBA' mode blends the semi-transparent fill into the image
    font = ImageFont.truetype("/opt/font/Roboto-Bold.ttf", 36)
    # textsize() was removed in Pillow 10; measure the text with textbbox() instead
    left, top, right, bottom = draw.textbbox((0, 0), watermark_text, font=font)
    text_width, text_height = right - left, bottom - top
    position = (img.width - text_width - 10, img.height - text_height - 10)
    draw.text(position, watermark_text, font=font, fill=(255, 255, 255, 128))

    # Save processed image to a BytesIO buffer
    buffer = BytesIO()
    img.save(buffer, format='JPEG', quality=85)
    buffer.seek(0)

    # Define destination bucket and key
    dest_bucket = 'my-processed-images'
    dest_key = f"processed/{source_key.rsplit('/', 1)[-1].rsplit('.', 1)[0]}.jpg"

    # Upload processed image
    s3.put_object(
        Bucket=dest_bucket,
        Key=dest_key,
        Body=buffer,
        ContentType='image/jpeg'
    )

    return {
        'statusCode': 200,
        'body': json.dumps('Image processed successfully!')
    }

Notice the use of /opt/font/Roboto-Bold.ttf. This path points to a custom Lambda layer where we’ll bundle the font file. Including a layer keeps the deployment package small and reusable across multiple functions.
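
Building that layer is straightforward. The sketch below assumes you have Roboto-Bold.ttf locally; the layer zip needs a top-level font/ directory so it unpacks to /opt/font/ inside the runtime (the layer name is just an example).

mkdir -p layer/font
cp Roboto-Bold.ttf layer/font/
cd layer && zip -r ../font-layer.zip . && cd ..

aws lambda publish-layer-version \
  --layer-name image-processor-fonts \
  --zip-file fileb://font-layer.zip

Once the function exists (next section), attach the layer with aws lambda update-function-configuration --function-name ImageProcessor --layers &lt;layer-version-arn&gt;, using the LayerVersionArn returned by publish-layer-version.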

Packaging and Deploying the Lambda Function

First, create a requirements.txt with Pillow.

echo "Pillow==10.2.0" > requirements.txt

Next, build a deployment package that includes the dependencies and our handler.

mkdir -p package
pip install -r requirements.txt -t package/
cp lambda_function.py package/
cd package
zip -r ../lambda_image_processor.zip .
cd ..
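
One caveat: Pillow ships compiled binaries, so a package built on macOS or Windows may fail to import inside Lambda. If you aren’t building on Linux, one workaround is to ask pip for manylinux wheels matching the Lambda environment (shown here for an x86_64, Python 3.11 function):

pip install -r requirements.txt -t package/ \
  --platform manylinux2014_x86_64 \
  --implementation cp \
  --python-version 3.11 \
  --only-binary=:all: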

Upload the zip file to Lambda, attaching the previously created IAM role.

aws lambda create-function \
  --function-name ImageProcessor \
  --runtime python3.11 \
  --role arn:aws:iam::YOUR_ACCOUNT_ID:role/ImageProcessorRole \
  --handler lambda_function.lambda_handler \
  --zip-file fileb://lambda_image_processor.zip \
  --timeout 30 \
  --memory-size 1024

Connecting S3 Events to Lambda

We want the function to fire whenever a new object lands in the source bucket. Use the following command to add an event notification.

aws s3api put-bucket-notification-configuration \
  --bucket my-raw-images \
  --notification-configuration '{
      "LambdaFunctionConfigurations": [
        {
          "Id": "ImageUploadTrigger",
          "LambdaFunctionArn": "arn:aws:lambda:us-east-1:YOUR_ACCOUNT_ID:function:ImageProcessor",
          "Events": ["s3:ObjectCreated:*"]
        }
      ]
    }'

Don’t forget to grant S3 permission to invoke the Lambda function. S3 actually validates this permission when it accepts the notification configuration, so if the previous command failed with a validation error, run the command below first and then re-apply the notification.

aws lambda add-permission \
  --function-name ImageProcessor \
  --principal s3.amazonaws.com \
  --statement-id s3invoke \
  --action "lambda:InvokeFunction" \
  --source-arn arn:aws:s3:::my-raw-images \
  --source-account YOUR_ACCOUNT_ID

Testing the Pipeline Locally – Part 2 (Mock Event)

Running Lambda in the cloud is fast, but debugging is easier locally. AWS publishes an official Lambda runtime image you can run with Docker; mount the package/ directory (which already contains both the handler and Pillow) as the task root.

docker run -v $(pwd)/package:/var/task \
  -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  -e AWS_DEFAULT_REGION=us-east-1 \
  -p 9000:8080 \
  public.ecr.aws/lambda/python:3.11 \
  lambda_function.lambda_handler

In another terminal, simulate an S3 event JSON file (event.json) and invoke the local endpoint.

cat > event.json <<EOF
{
  "Records": [
    {
      "s3": {
        "bucket": {"name": "my-raw-images"},
        "object": {"key": "sample.jpg"}
      }
    }
  ]
}
EOF

curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" \
  -d @event.json

If everything is wired correctly, you’ll see a 200 response and a new file appear in the my-processed-images bucket.

Pro tip: Use aws s3 sync to upload a batch of test images quickly; because versioning is enabled on the source bucket, repeated uploads simply create new object versions instead of clobbering the originals.
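
For example, assuming your test images live in a local folder named test-images:

aws s3 sync ./test-images s3://my-raw-images/ --exclude "*" --include "*.jpg"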

Real‑World Use Case: E‑Commerce Thumbnail Generation

Imagine an online marketplace where sellers upload high‑resolution product photos. Storing those originals is fine, but you need thumbnails for search results, category pages, and mobile views. Our serverless pipeline can generate three sizes automatically at upload time: small (200×200), medium (500×500), and large (800×800).

To extend the Lambda function, add a loop that saves each size under a distinct prefix.

sizes = {
    'small': (200, 200),
    'medium': (500, 500),
    'large': (800, 800)
}

for label, max_dim in sizes.items():
    resized = img.copy()
    resized.thumbnail(max_dim, Image.Resampling.LANCZOS)  # ANTIALIAS was removed in Pillow 10
    buffer = BytesIO()
    resized.save(buffer, format='JPEG', quality=85)
    buffer.seek(0)

    dest_key = f"{label}/{source_key.rsplit('/', 1)[-1].rsplit('.', 1)[0]}.jpg"
    s3.put_object(
        Bucket=dest_bucket,
        Key=dest_key,
        Body=buffer,
        ContentType='image/jpeg',
        CacheControl='max-age=31536000'  # 1 year caching for CDN
    )

Now every upload automatically yields a set of ready‑to‑serve thumbnails, drastically reducing front‑end latency and bandwidth costs.

Performance Optimizations

Lambda’s cold start can add a few hundred milliseconds, especially when loading heavy libraries. To mitigate this:

  • Keep the deployment package small (direct zip uploads are capped at 50 MB compressed and 250 MB unzipped) so the runtime can fetch and extract it quickly.
  • Leverage Lambda layers for shared dependencies like Pillow, so the function code stays lean.
  • Set memory-size to a higher value (e.g., 2048 MB) if you’re processing large images; Lambda allocates CPU in proportion to memory, so more memory often shortens execution time enough to offset the higher per-millisecond rate (see the command after this list).
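
Raising the memory of an existing function is a one-liner:

aws lambda update-function-configuration \
  --function-name ImageProcessor \
  --memory-size 2048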

Another tip: set ImageFile.LOAD_TRUNCATED_IMAGES = True so that partially uploaded or truncated files are processed as far as possible instead of crashing the whole invocation.
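
In the handler, that is two lines near the imports:

from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True  # let Pillow open truncated files instead of raising an error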

Security Best Practices

Never embed AWS credentials in your code. Rely on the IAM role attached to the Lambda function, and enable least‑privilege policies. If you need to restrict who can upload to the source bucket, apply a bucket policy that only allows specific IAM users or roles.
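
As an illustration, a bucket policy like the following denies uploads from everyone except a designated role (the UploaderRole ARN here is hypothetical; substitute the principal you actually use):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUploadsExceptUploaderRole",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-raw-images/*",
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/UploaderRole"
        }
      }
    }
  ]
}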

For compliance‑heavy workloads, enable server‑side encryption (SSE‑S3 or SSE‑KMS) on both buckets. Default encryption is configured with put-bucket-encryption rather than at bucket creation; the following enables SSE‑S3 on the source bucket (repeat it for the destination bucket). Note that S3 now applies SSE‑S3 to new buckets by default, but an explicit configuration makes the intent auditable.

aws s3api put-bucket-encryption \
  --bucket my-raw-images \
  --server-side-encryption-configuration '{
      "Rules": [
        {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
      ]
    }'

Monitoring, Logging, and Alerting

CloudWatch automatically captures Lambda logs, but you’ll want structured metrics for operational insight. Add a few put_metric_data calls to emit custom metrics like ProcessedImageCount and ProcessingLatencyMs.

import time
import boto3

cloudwatch = boto3.client('cloudwatch')

start = time.time()
# ... image processing logic ...
duration_ms = int((time.time() - start) * 1000)

cloudwatch.put_metric_data(
    Namespace='ImageProcessing',
    MetricData=[
        {
            'MetricName': 'ProcessedImageCount',
            'Value': 1,
            'Unit': 'Count'
        },
        {
            'MetricName': 'ProcessingLatencyMs',
            'Value': duration_ms,
            'Unit': 'Milliseconds'
        }
    ]
)

Set up a CloudWatch alarm on ProcessingLatencyMs to trigger an SNS notification if latency exceeds a threshold (e.g., 2000 ms). This gives you early warning before users notice slow image loads.
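
The alarm itself can be created from the CLI; the SNS topic ARN below is a placeholder for a topic you have already created and subscribed to.

aws cloudwatch put-metric-alarm \
  --alarm-name image-processing-latency-high \
  --namespace ImageProcessing \
  --metric-name ProcessingLatencyMs \
  --statistic Average \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 2000 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:YOUR_ACCOUNT_ID:image-pipeline-alerts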

Cost Estimation

At current us-east-1 pricing (roughly $0.0000166667 per GB-second of compute plus $0.20 per million requests), processing one image for 500 ms at 512 MB works out to about $0.0000044 per invocation: 0.5 GB × 0.5 s = 0.25 GB-seconds, plus the per-request charge. Even at 10 k daily uploads (about 300 k invocations per month) that is on the order of $1.30 per month for compute, and the Lambda free tier (400,000 GB-seconds and 1 million requests per month) would cover it entirely. Add S3 storage and request costs, and the total remains well under $5 for most small-to-medium workloads.

Remember to enable S3 lifecycle rules to transition older processed images to Glacier or delete them after a set retention period. This can shave off additional storage fees.
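
A minimal lifecycle rule for the processed bucket might look like this (the 90-day transition window is an arbitrary example):

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-processed-images \
  --lifecycle-configuration '{
      "Rules": [
        {
          "ID": "ArchiveProcessedImages",
          "Status": "Enabled",
          "Filter": {"Prefix": ""},
          "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}]
        }
      ]
    }'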

Extending the Pipeline: Adding AI‑Based Enhancements

If you need more sophisticated processing—like background removal or auto‑enhancement—consider invoking Amazon Rekognition or a SageMaker endpoint from within the same Lambda function. The pattern remains identical: download the image, call the AI service, receive the transformed bytes, and store the result back to S3.

Because Lambda can only run for 15 minutes, keep AI calls lightweight. For heavy models, offload to SageMaker Batch Transform jobs triggered by an SQS queue, while the Lambda function simply enqueues the work.
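
A minimal sketch of the enqueue side, assuming the queue is created separately and its URL is exposed to the function as a QUEUE_URL environment variable:

import json
import os

import boto3

sqs = boto3.client('sqs')

def enqueue_heavy_job(bucket: str, key: str) -> None:
    # Hand the image off to a downstream worker instead of processing it inside Lambda
    sqs.send_message(
        QueueUrl=os.environ['QUEUE_URL'],
        MessageBody=json.dumps({'bucket': bucket, 'key': key}),
    )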

Deploying with Infrastructure as Code

Manual CLI steps are great for learning, but production environments demand repeatable deployments. Use AWS CloudFormation or the CDK to codify the entire stack. Below is a minimal CDK snippet in Python that provisions the buckets, role, and Lambda function.

from aws_cdk import (
    Stack,
    aws_s3 as s3,
    aws_lambda as _lambda,
    aws_iam as iam,
)
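
To round the snippet out, a minimal stack class using those imports might look like the following sketch (construct IDs, the package asset path, and the memory size are assumptions, in aws-cdk-lib v2 style):

from constructs import Construct

class ImagePipelineStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Source and destination buckets
        raw_bucket = s3.Bucket(self, "RawImages", versioned=True)
        processed_bucket = s3.Bucket(self, "ProcessedImages")

        # The image-processing function, packaged from the local ./package directory
        processor = _lambda.Function(
            self, "ImageProcessor",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="lambda_function.lambda_handler",
            code=_lambda.Code.from_asset("package"),
            memory_size=1024,
        )

        # grant_read / grant_write generate the IAM statements on the role for us
        raw_bucket.grant_read(processor)
        processed_bucket.grant_write(processor)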