Serverless Framework Custom Docker Containers For AWS Lambda

Why use custom Docker images in AWS Lambda?

AWS Lambda provides managed runtimes for programming languages such as Python, Java, and Ruby.
At the time of writing, AWS Lambda supports the following runtimes.
Python runtimes
- python3.9
- python3.8
- python3.7
- python3.6
Node.js runtimes
- nodejs16.x
- nodejs14.x
- nodejs12.x
Ruby runtimes
- ruby2.7
Java runtimes
- java11
- java8.al2
- java8
Go runtimes
- go1.x
.NET runtimes
- dotnet6
- dotnetcore3.1
If a function needs both Python and Java in the same environment, none of the runtimes above is enough, because AWS does not currently provide a managed runtime that bundles both. In that case we need to package the function as a custom Docker container image.

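Pre-requisites

- Docker installed and running locally, because the image is built on your machine
- Serverless Framework installed (it provides the sls command)
- AWS credentials configured for the account you deploy to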

Using a custom Docker container with AWS Lambda and the Serverless Framework

We will use the Serverless Framework to deploy an AWS Lambda function that runs from a custom Docker container.
Suppose we want to extract tables from a PDF using tabula-py.
For that we need both Python and Java in the same environment, because tabula-py is a Python wrapper around tabula-java.
Let's create the Serverless configuration and a Dockerfile to solve this problem.

Directory structure

.
├── pdf-table-extract
│   ├── Dockerfile
│   ├── main.py
│   └── requirements.txt
└── serverless.yml

Let's define these files.

serverless.yml

service: custom-service

provider:
  name: aws
  ecr:
    # In this section you can define images that will be built locally and uploaded to ECR
    images:
      extract_pdf_tables:
        path: ./pdf-table-extract

functions:
  pdf_tables_to_json:
    image:
      name: extract_pdf_tables

pdf-table-extract/Dockerfile

FROM public.ecr.aws/lambda/python:3.9-x86_64
RUN yum install -y java-17-amazon-corretto
COPY requirements.txt ${LAMBDA_TASK_ROOT}/requirements.txt
RUN pip3 install -r ${LAMBDA_TASK_ROOT}/requirements.txt
COPY . ${LAMBDA_TASK_ROOT}
CMD [ "main.lambda_handler" ]

pdf-table-extract/requirements.txt

certifi==2022.5.18.1
charset-normalizer==2.0.12
distro==1.7.0
idna==3.3
numpy==1.22.4
pandas==1.4.2
python-dateutil==2.8.2
pytz==2022.1
requests==2.27.1
six==1.16.0
tabula-py==2.3.0
urllib3==1.26.9

pdf-table-extract/main.py

import tabula


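def lambda_handler(event, context):
    # Column names applied to the extracted table; this assumes it has exactly two columns.
    cols = ["col1", "col2"]
    url = "https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf"
    # read_pdf returns a list of DataFrames, one per table it detects.
    pdf_df = tabula.read_pdf(url)
    df = pdf_df[0]
    df.columns = cols
    # Replace missing cells with empty strings so the result serializes cleanly.
    df.fillna('', inplace=True)
    return {"data": df.to_dict("records")}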

if __name__ == "__main__":
    print(lambda_handler({}, {}))
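
The handler above always reads the same sample PDF. As a minimal sketch of how the URL could come from the invocation event instead, the helper below looks up a key and falls back to the sample file. The key name pdf_url and the helper name resolve_pdf_url are assumptions made for this illustration; they are not part of tabula-py or of the original handler.

```python
# Sample PDF used by the handler in this post.
DEFAULT_PDF_URL = (
    "https://github.com/tabulapdf/tabula-java/raw/master/"
    "src/test/resources/technology/tabula/arabic.pdf"
)


def resolve_pdf_url(event):
    """Return the PDF URL to process, falling back to the sample PDF.

    `pdf_url` is a hypothetical event key chosen for this sketch.
    """
    url = (event or {}).get("pdf_url", DEFAULT_PDF_URL)
    # Reject anything that is not an http(s) URL before handing it to tabula.
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError(f"pdf_url must be an http(s) URL, got {url!r}")
    return url
```

Inside lambda_handler, the hard-coded assignment could then become url = resolve_pdf_url(event), so the same deployed function can be tested against different documents.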

Now let's deploy it to AWS Lambda with the command below.

sls deploy

This deploys the Lambda function pdf_tables_to_json to AWS. To verify it, log in to the AWS console, go to Lambda functions, search for pdf_tables_to_json, and test it.
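
As an alternative to testing in the console, the deployed function can also be invoked from a script. The sketch below is one possible way to do that. It takes a Lambda client as a parameter (for example boto3.client("lambda"), which requires boto3 to be installed and AWS credentials to be configured) and calls its invoke method. The helper name invoke_and_parse is hypothetical, and the deployed function name in the usage comment assumes the Serverless default of service-stage-function with the default stage dev; check the sls deploy output for the actual name.

```python
import json


def invoke_and_parse(lambda_client, function_name, payload=None):
    """Invoke a Lambda function synchronously and return its decoded JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(payload or {}).encode("utf-8"),
    )
    # The response payload is a stream that has to be read and decoded.
    body = json.loads(response["Payload"].read())
    # FunctionError is only present when the function itself raised an error.
    if response.get("FunctionError"):
        raise RuntimeError(f"{function_name} failed: {body}")
    return body


# Usage (assumed function name, not run here):
# import boto3
# result = invoke_and_parse(boto3.client("lambda"),
#                           "custom-service-dev-pdf_tables_to_json")
# print(result["data"])
```

Passing the client in keeps the helper free of any AWS setup, so it can be exercised with a stub before being pointed at a real account.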

Note: The deployment fails with an error if Docker or the Serverless Framework is not installed on your computer.
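
A quick pre-flight check can catch this before running the deploy. The following is a minimal sketch that only looks for the executables on PATH; it does not verify that the Docker daemon is actually running. The helper name missing_tools is hypothetical, and the tool names assume the CLIs are installed as docker and serverless (sls is an alias of the latter).

```python
import shutil


def missing_tools(tools=("docker", "serverless")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Usage (not run here):
# missing = missing_tools()
# if missing:
#     raise SystemExit("Install before deploying: " + ", ".join(missing))
```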
