
Serverless LLM on AWS

Quite recently I watched a short course by DeepLearning.AI titled “Serverless LLM apps with Amazon Bedrock”. It is a very good course for beginners: well crafted, with good examples and clear explanations by Mike Chambers. He builds a small app using AWS Lambda, Amazon S3, and Amazon Bedrock that summarises recorded conversations. I wanted to add something to the topic and provide a fully working example you can deploy easily using AWS CDK.

Architecture diagram

Deploy Sagemaker Endpoint using AWS CDK

I was looking for a way to deploy a custom model to Sagemaker. Unfortunately, my online searches failed to find anything that did not use Jupyter notebooks. I like them, but deploying models this way is neither reproducible nor scalable.

After a couple of hours of looking, I decided to do it myself. Here comes a recipe for deploying a custom model to Sagemaker using AWS CDK.

The following steps assume you have knowledge of CDK and Sagemaker. I’ll try to explain as much as I can but if anything is unclear please refer to the docs.


  1. Prepare a containerised application serving your model.
  2. Create a Sagemaker model.
  3. Create a Sagemaker Endpoint configuration.
  4. Deploy the Sagemaker Endpoint.

Unfortunately, AWS CDK does not support higher-level constructs for Sagemaker. You have to use CloudFormation constructs which start with the prefix Cfn. Higher-level constructs for Sagemaker are not on the roadmap as of March 2021.

Dockerfile to serve model

The first thing is to have your app in container form, so it can be deployed in a predictable way. It’s difficult to help with this step, as each model may require different dependencies or actions. What I can recommend is to go over the documentation page that explains the steps required to prepare a container that can serve a model on Sagemaker. It may also be helpful to read the part on how your Docker image will be used.
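As a rough illustration of what such a container has to do, here is a minimal sketch using only the Python standard library. It relies on the documented Sagemaker hosting convention that the container listens on port 8080 and answers GET /ping (health check) and POST /invocations (inference). The `predict` function, the `handle` helper, and the JSON payload shape are hypothetical placeholders that are not part of the original recipe; a real image would load your model and most likely sit behind a production web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(payload):
    """Hypothetical placeholder for real inference: echoes the parsed input."""
    return {"received": payload}


def handle(method: str, path: str, body: bytes) -> Tuple[int, dict]:
    """Route a request the way Sagemaker hosting calls the container."""
    if method == "GET" and path == "/ping":
        # Health check: answer with HTTP 200 when the model is ready.
        return 200, {"status": "ok"}
    if method == "POST" and path == "/invocations":
        try:
            payload = json.loads(body or b"{}")
        except ValueError:  # covers malformed JSON and bad encodings
            return 400, {"error": "invalid JSON"}
        return 200, predict(payload)
    return 404, {"error": "not found"}


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        self._reply(*handle("GET", self.path, b""))

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        self._reply(*handle("POST", self.path, self.rfile.read(length)))


def make_server(port: int = 8080) -> HTTPServer:
    """Build the HTTP server; the container entrypoint would call serve_forever()."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

In the image's entrypoint script you would then call `make_server().serve_forever()`. Keeping the routing in a plain function makes the contract easy to check locally before building the image.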

Define Sagemaker model

Once you have your model in a container form it is time to create a Sagemaker model. There are 3 elements to a Sagemaker model:

  • Container definition
  • VPC configuration for a model
  • Model definition

Adding container definition to your app is simple (the hard part of creating a docker image is already done). The container definition will be used by the Sagemaker model.

asset = DockerImageAsset(

primary_container_definition = sagemaker.CfnModel.ContainerDefinitionProperty(
)

Creating a VPC is pretty straightforward; just remember to create both public and private subnets.

vpc = ec2.Vpc(
            name="public-model-subnet", subnet_type=ec2.SubnetType.PUBLIC
            name="private-model-subnet", subnet_type=ec2.SubnetType.PRIVATE
model_vpc_config = 
        subnets=[s.subnet_id for s in vpc.private_subnets],
        )

Creating the model is a matter of putting all the pieces created so far together.

model = sagemaker.CfnModel(
)

At this point, cdk deploy would create a Sagemaker model wrapping the ML model of your choice.
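Because Cfn constructs are thin wrappers over CloudFormation resources, it can help to see roughly what the model looks like once synthesized. The sketch below builds an AWS::SageMaker::Model resource as a plain Python dictionary; the keyword arguments of `sagemaker.CfnModel` (`execution_role_arn`, `primary_container`, `vpc_config`) are the snake_case forms of these properties. The helper name and all example values (account ID, image URI, role, subnet and security group IDs) are hypothetical. The execution role in particular does not appear in the snippets above and is an assumption here.

```python
import json
from typing import Dict, List


def sagemaker_model_resource(
    image_uri: str,
    execution_role_arn: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
) -> Dict:
    """Return the CloudFormation shape of a Sagemaker model (illustrative helper)."""
    return {
        "Type": "AWS::SageMaker::Model",
        "Properties": {
            # Role Sagemaker assumes to pull the image (assumed, not in the source).
            "ExecutionRoleArn": execution_role_arn,
            # Container definition: the Docker image that serves the model.
            "PrimaryContainer": {"Image": image_uri},
            # VPC configuration: the private subnets the model runs in.
            "VpcConfig": {
                "Subnets": list(subnet_ids),
                "SecurityGroupIds": list(security_group_ids),
            },
        },
    }


# Hypothetical values, for illustration only.
example_model = sagemaker_model_resource(
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-model:latest",
    execution_role_arn="arn:aws:iam::123456789012:role/my-sagemaker-role",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    security_group_ids=["sg-cccc3333"],
)
print(json.dumps(example_model, indent=2))
```

Running cdk synth on a real stack and comparing its output with this shape is a quick way to confirm that the container, VPC, and model definitions are wired together as intended.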

Define endpoint configuration

We are not done yet, as the model still has to be exposed. A Sagemaker Endpoint is perfect for this, and in the next step we create an endpoint configuration.

The endpoint configuration describes the resources that will serve your model.

model_endpoint_config = sagemaker.CfnEndpointConfig(
)
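Since the arguments are not shown above, here is a hedged sketch of what an endpoint configuration typically contains, written as the AWS::SageMaker::EndpointConfig resource that `sagemaker.CfnEndpointConfig` synthesizes to. Each production variant names a model, an instance type, and an instance count. The helper name, the logical ID "Model", the variant name, and the instance type are assumptions chosen for illustration, not values from the original post.

```python
import json
from typing import Dict


def endpoint_config_resource(
    model_logical_id: str,
    instance_type: str = "ml.t2.medium",
    instance_count: int = 1,
) -> Dict:
    """Return the CloudFormation shape of an endpoint configuration (illustrative)."""
    return {
        "Type": "AWS::SageMaker::EndpointConfig",
        "Properties": {
            "ProductionVariants": [
                {
                    "VariantName": "AllTraffic",
                    # Resolve the model name from the model resource in the same template.
                    "ModelName": {"Fn::GetAtt": [model_logical_id, "ModelName"]},
                    "InstanceType": instance_type,
                    "InitialInstanceCount": instance_count,
                    # A single variant receives all of the traffic.
                    "InitialVariantWeight": 1.0,
                }
            ]
        },
    }


# "Model" is a hypothetical logical ID for the model resource.
example_config = endpoint_config_resource("Model")
print(json.dumps(example_config, indent=2))
```

In CDK terms, the variant corresponds to a `sagemaker.CfnEndpointConfig.ProductionVariantProperty`, and the Fn::GetAtt reference to `model.attr_model_name`.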

Create Sagemaker Endpoint

The last step is extremely simple. We take the configuration created earlier and create an endpoint from it.

model_endpoint = sagemaker.CfnEndpoint(
    "model-endpoint", endpoint_config_name=model_endpoint_config.attr_endpoint_config_name,
)
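The `attr_endpoint_config_name` reference used above becomes an Fn::GetAtt in the synthesized template, which is also what makes CloudFormation create the configuration before the endpoint. A minimal sketch of the resulting AWS::SageMaker::Endpoint resource, with a hypothetical helper name and a hypothetical logical ID for the configuration, could look like this:

```python
import json
from typing import Dict


def endpoint_resource(endpoint_config_logical_id: str) -> Dict:
    """Return the CloudFormation shape of a Sagemaker endpoint (illustrative)."""
    return {
        "Type": "AWS::SageMaker::Endpoint",
        "Properties": {
            # Equivalent of model_endpoint_config.attr_endpoint_config_name in CDK.
            "EndpointConfigName": {
                "Fn::GetAtt": [endpoint_config_logical_id, "EndpointConfigName"]
            }
        },
    }


# "ModelEndpointConfig" is a hypothetical logical ID.
example_endpoint = endpoint_resource("ModelEndpointConfig")
print(json.dumps(example_endpoint, indent=2))
```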


Now you can run cdk deploy, and the model will be up and running on AWS Sagemaker 🙂