5 Things you should know before using Lambda

Sergej Jakovljev · Published in Axilis · 7 min read · Oct 17, 2016

Like any relatively new tool, there are a lot of small things about AWS Lambda that you don’t figure out before really diving into it. Below I outline 5 areas where you can expect challenges, and ways to address them.

1. Code structure

Unlike in traditional, monolithic applications, where setup is typically straightforward, seemingly simple things on Lambda may require workarounds.

Environment variables

Lambda doesn’t support typical environment variables, but there are some alternatives:

  • Stage variables — Although stage variables only work for functions that are called through API Gateway, they are an easy way to set specific variables for each development stage.
  • Build phase injection — You can store environment variables in a build service. When the build succeeds, it will insert the appropriate variables and deploy the code.
  • Configuration service — A centralized service can store all settings. Then using one of the previous methods to configure access to the service, a function can retrieve needed information. This approach brings the additional benefit of having all of your configuration in one place.

In practice, you’ll probably end up using some combination of the options above. A significant advantage of microservices is that you don’t need to make such decisions for everything upfront — you can implement each microservice however makes the most sense.
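
As an illustration, here is a minimal sketch of the configuration-service approach in Node.js. The endpoint URL and the settings it returns are hypothetical; the point is that settings are fetched once per container and then cached for subsequent invocations:

    const https = require('https');

    // Hypothetical configuration endpoint; substitute your own service.
    const CONFIG_URL = 'https://config.example.com/my-service';

    let cachedConfig = null; // a module-level cache survives warm invocations

    function loadConfig() {
      if (cachedConfig) return Promise.resolve(cachedConfig);
      return new Promise((resolve, reject) => {
        https.get(CONFIG_URL, (res) => {
          let body = '';
          res.on('data', (chunk) => { body += chunk; });
          res.on('end', () => {
            cachedConfig = JSON.parse(body);
            resolve(cachedConfig);
          });
        }).on('error', reject);
      });
    }

    exports.handler = (event, context, callback) => {
      loadConfig()
        .then((config) => {
          // use config.dbHost, config.apiKey, etc.
          callback(null, { ok: true });
        })
        .catch(callback);
    };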

Functions end up simple, integration becomes hard

Since services end up being simple functions, it’s easy to unit test them. However, that also means that conventional, manual local testing of the whole stack is significantly harder as it requires configuration of all of the interdependent services.
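
Since a handler is just an exported function, a unit test can invoke it directly with a fake event. Here is a minimal sketch using Node's built-in assert module; the handler, file name, and event shape are all hypothetical:

    // test.js: invoke the handler directly, no test framework required
    const assert = require('assert');
    const lambda = require('./index'); // the module that exports the handler

    lambda.handler({ name: 'world' }, {}, (err, result) => {
      assert.ifError(err);
      assert.equal(result.message, 'hello world');
      console.log('test passed');
    });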

In Testing Microservice Architecture, Martin Fowler provides valuable insights into the different approaches. I recommend reading it.

Initial configuration overhead

While each service has a smaller set of dependencies compared to a monolithic architecture, much more effort is required in aggregate to configure all of the services.

Because each microservice is an isolated unit, each will require its own separate initial setup, including repository permissions, code integration, a linter, a test runner, monitoring tools, etc. Furthermore, current deployment tools are still missing some settings (such as VPCs when using Claudiajs), so you'll need to configure certain things manually. In the beginning, it may therefore seem as though there is not much progress.

After you finish all of the configuration, however, the benefits will soon become evident: build times will be much shorter since each service has its own set of dependencies. Service separation also makes it easier for multiple people to work on different parts of the project at the same time.

2. Using Amazon RDS with Lambda

RDS needs to reside inside the same VPC (Virtual Private Cloud) as the Lambda functions to be used with it.

If the Lambda functions only need to access RDS, the default VPC setup will suffice, since input to and output from Lambda will be piped through API Gateway.

However, if the Lambda functions need to access both Amazon RDS and other web services such as remote APIs, further VPC configuration will be required: you will need to add a NAT gateway and set up routes to enable entities inside the VPC to access the Internet.
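
For illustration, querying RDS from a function inside the VPC looks the same as from any other Node.js process. Here is a minimal sketch using the mysql npm package (bundled into the deployment archive); the endpoint and credentials are placeholders and would come from whichever configuration approach you chose earlier:

    const mysql = require('mysql');

    exports.handler = (event, context, callback) => {
      // Placeholder connection settings; supply your own RDS endpoint.
      const connection = mysql.createConnection({
        host: 'mydb.xxxxxxxx.us-east-1.rds.amazonaws.com',
        user: 'app',
        password: 'secret',
        database: 'app'
      });

      connection.query('SELECT 1 + 1 AS result', (err, rows) => {
        connection.end(); // let the event loop drain so Lambda can finish
        if (err) return callback(err);
        callback(null, rows[0].result);
      });
    };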

While the effort required to run a “Hello World” function is small, integrating VPC and Lambda may take some time, and has considerable implications:

  • Automatic scaling of the number of instances becomes limited by the address space of the VPC subnet in which the Lambda function resides. The default VPC contains subnets with about 250 addresses remaining. While that may seem like a lot, the space is shared among all instances of all services connected to a given subnet.
  • Additional setup is required to achieve redundancy by configuring functions and VPC in multiple availability zones.
  • Lambda functions that run within a VPC require additional start-up time in order to connect the ENI (Elastic Network Interface) to the instance.

3. Lambda limits and pricing

Some of Lambda’s limits may not be immediately apparent.

When compressed into an archive, the code for each service cannot exceed 50MB (250MB uncompressed). While this may seem like a lot, if you're not careful in your use of Node packages, it is quite possible to exceed it. Additionally, there is a 75GB limit per region, which includes all versions of all of the Lambda functions. Since deploying a new version doesn't automatically delete the old ones, they will remain and take up space unless you delete them manually.

There is also a limit of 100 concurrent executions per account per region that can be increased by making a request to AWS in the support center.

Pricing

Although Lambda may be inexpensive — and, in many cases, is — there are a number of additional costs when using it.

For example, using a VPC to enable both database and internet access for a given service requires at least one NAT Gateway (roughly $0.045 per hour in us-east-1 at the time of writing, or about $32 per month, before data processing charges) plus the associated data transfer costs. All services that initiate external data transfers (e.g. calls to an external API) are charged at the EC2 data transfer rate. There will likely be some database and S3 hosting expenses as well.

So when estimating the total cost of using Lambda, make sure you include all of the needed AWS services in the calculation.

Lambda functions become frozen

A common problem with service-oriented architectures is that queries that could be done on a single database using joins end up having to access multiple services.

One of our application's more complex queries, which could have been done entirely in the database, ended up taking an order of magnitude longer when split across multiple services. Joins also have to be done manually, since each service should be able to change its data store at any time without other services even noticing. So there certainly are some tradeoffs to data separation.
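
To make the manual join concrete, here is a rough sketch, with hypothetical service URLs and record shapes, of combining data from two services in code rather than with a SQL JOIN:

    const https = require('https');

    // Minimal JSON GET helper.
    function fetchJson(url) {
      return new Promise((resolve, reject) => {
        https.get(url, (res) => {
          let body = '';
          res.on('data', (chunk) => { body += chunk; });
          res.on('end', () => resolve(JSON.parse(body)));
        }).on('error', reject);
      });
    }

    // Fetch from both services, then stitch the results together in memory.
    function usersWithOrders() {
      return Promise.all([
        fetchJson('https://users.example.com/users'),
        fetchJson('https://orders.example.com/orders')
      ]).then((results) => {
        const users = results[0];
        const orders = results[1];
        return users.map((user) => Object.assign({}, user, {
          orders: orders.filter((order) => order.userId === user.id)
        }));
      });
    }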

Amazon automatically scales the number of instances to ensure optimal performance and cost. That also means that without a steady flow of requests, a function will end up with zero instances running, causing the next request to incur some spin-up time. Even though one second may not sound like much, the delay cascades across services that depend on one another.

Additionally, our tests showed that when using VPC, ENI attachment time is in the range of a few seconds. All of that combined caused the aforementioned user-facing request to take 14 seconds when cold-started, whereas it normally takes just a few hundred milliseconds to execute while all services are warm.

Although there is no option to enforce that functions don't freeze, it should be possible to avoid freezing by calling the function every few minutes, thereby ensuring there is an active instance. CloudWatch Scheduled Events are a simple way to achieve this, as sketched below.
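
A minimal keep-warm sketch: the scheduled event carries a marker field (keepWarm here is an arbitrary name, not an AWS field) that the handler recognizes and short-circuits on:

    exports.handler = (event, context, callback) => {
      if (event.keepWarm) {
        // Scheduled ping: return immediately so the real work is skipped,
        // but the container stays active.
        return callback(null, 'warm');
      }

      // ... normal request handling goes here ...
      callback(null, { message: 'handled a real request' });
    };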

Here is a great article with more details about container lifetime.

4. Deployment tools

The most popular tools include Serverless and Apex, both of which mainly aim to simplify deployment.

There are also language-specific tools such as Claudiajs, which focuses only on Node but offers some framework-like features. One worth mentioning is claudia-api-builder, which automatically generates API Gateway routes for each service and lets you write applications much as you would with Express in a regular Node.js stack. A similar Python-based tool from AWS Labs is chalice.
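
For illustration, a minimal claudia-api-builder service looks roughly like this (the route and response are placeholders):

    const ApiBuilder = require('claudia-api-builder');
    const api = new ApiBuilder();

    // Each route declared here becomes an API Gateway route on deployment.
    api.get('/hello', (request) => {
      return { message: 'hello from Lambda' };
    });

    module.exports = api;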

Although all of those tools can help kickstart implementation of web services, there is still much work left to be done manually, especially in more complex scenarios.

Instance configuration requires redeployment

While configuring the VPC, we had an issue where, despite everything seeming to be fine, instances still couldn't access the database. It turned out that Lambda configuration changes require a redeployment in order to take effect.

5. Challenges of Continuous Integration

With microservices, the only viable way to continuously deliver new versions of the application is by using automated tools to deploy services. However, there are some gotchas when using Lambda with CI tools.

Building on EC2

Most automated deployment tools bundle Node projects by installing dependencies locally, archiving them, and uploading them to Lambda. That works for most packages, but some (like image-resizing libraries or certain parsers) have native implementations that are compiled while the package is being installed. Such packages, when built on a development machine, produce binaries that are incompatible with Lambda's Amazon Linux.

To overcome the problem, you should always build packages on an EC2 instance running the same version of Amazon Linux as Lambda.

"If you compile your own binaries, ensure that they're either statically linked or built for the matching version of Amazon Linux. The current version of Amazon Linux in use within AWS Lambda can always be found on the Supported Versions page of the Lambda docs." (AWS Compute Blog: https://aws.amazon.com/blogs/compute/running-executables-in-aws-lambda/)

Automatic migrations

Since the database is only available within the VPC, the easiest way to run automatic migrations upon deployment is to have the CI server located inside the VPC as well.

How we ended up doing it

Although CircleCI is Axilis’ preferred CI tool, we weren’t able to use it with AWS Lambda due to the aforementioned issues.

One solution to both problems is to use an EC2 instance running a self-hosted version of a CI tool. We finally settled on Jenkins.

So… should I use Lambda?

Yes! Just make sure it fits the problem you are trying to solve.

There are challenges you will need to face, but every platform has its own. Overall, Lambda, and serverless infrastructure in general, offers a lot of advantages that will pay off if you need to build a large, scalable system.

We do our best to share our knowledge with you. If you found this post helpful, please hit that little heart.

Axilis is a premium software design and development company with offices in New York and Croatia.
