in AWS Lambda

Save API Results to PostgreSQL for Free with AWS Lambda

Overview

In this tutorial I will show you how to use Amazon Web Services (AWS) Lambda service to save the results of an API response to a PostgreSQL database on a recurring schedule.

The full repo is posted on GitHub here.

AWS Lambda is a serverless architecture, meaning you do not need to spin up and maintain resources before calling a program or function. You are only charged for the time needed to execute the function. It is also incredibly scalable since each event is an independent, one-off execution.

This use case was for a client of mine that needed to save the API data along side his other data in PostgreSQL. I know there are other ways to cache API results and that is not addressed here.

Resources

The application will be written in Python 3.6.

For an example API response I will use JSON Placeholder which produces fake API responses for posts, comments, etc. For PostgreSQL I created a free database on the ‘Tiny Turtle’ plan at ElephantSQL. For deployment I used the awesome AWS Lambda packaging tool Zappa.

API Response

Let’s assume we want to save the results from the API endpoint https://jsonplaceholder.typicode.com/posts which produces an array of items that look like this:

Let’s create an app.py file and use requests to retrieve our API response with a python function:

$ pip install requests

Database Model

I prefer SqlAlchemy when working with databases in python. So install that with:

$ pip install sqlalchemy

Let’s build our database connection and model in a separate models.py file. Our model will have four fields: id, userId, title, and body.

Saving to Database

Turning back to our app.py file, we create a main() function that imports a database session object, deletes any records that already exist, then saves a fresh batch after running our get_items() function.

Of note, if you have a lot of API fields and are naming your database fields exactly the same as the API fields, you can refactor the save_items() function using item.keys() and setattr:

Deploying to AWS Lambda

We now have an app that works on our local machine. Let’s deploy it to AWS using Zappa. Your first step is to install Zappa:

$ pip install zappa

Ensure you have an AWS user configured with programmatic access (API ID and secret key). Install the AWS CLI, then run:

$ aws configure

Follow the set up prompts to enter your user credentials. Unfortunately, Zappa requires borderline admin access to deploy an application. If you want to deploy with more minimal permissions, you can try the policy that is proposed as a pull request to Zappa here.

Zappa will create a JSON configuration file for your app by running:

$ zappa init

Go ahead and set the defaults, and use app.main for the modular path to your main function. After our zappa_settings.json file is created, we need to make some edits. Leave aws_region and s3_bucket alone as those are unique to your setup. Edit the rest to:

What we’ve done is add an events array, which will in turn create a CloudWatch event that triggers our app function every 10 minutes. I set keep_warm to false because this is not a web app. I do not need a fast response to a http request so the keep_warm setting is unnecessary. API Gateway resources are enabled by default, so I set that to false because we do not need it for this application.

Go ahead and deploy your app with:

$ zappa deploy dev

This command will:

  • Package your application and all dependencies into a zip file
  • Upload the zip file to a temporary S3 bucket
  • Create the necessary lambda function, permissions, and CloudWatch events
  • Install the package
  • Clean up and delete the local zip file and remote zip from S3

The last step is to go into our AWS Lambda function and set our database environment variable, with the Key as DATABASE_URL and the Value as a postgres URL format postgresql://user:[email protected]_path:port:/database_name.

You can see that our CloudWatch event is enabled by clicking the Triggers tab.

Monitoring Your Function

You can see if there are are any errors with your function by looking at your lambda function in AWS, then clicking Monitoring > View Logs in CloudWatch.  

One thing you can see in the logs is how much memory your function is using compared to what you allocated. The zappa default is 512. My log file initially looked like this:

REPORT RequestId:  Duration: 638.74 ms Billed Duration: 700 ms Memory Size: 512 MB Max Memory Used: 54 MB

So I am using only 54 MB of 512 MB. How do we correct this? You can manually change the setting in AWS Lambda, but the better way for long term maintenance is to edit your zappa_settings.json file and add this line, reducing it to the minimum of 128:

To update your function run:

$ zappa update dev

Remember to re-add your DATABASE_URL environment variable when this is finished as it will be removed after updating with Zappa.

You will notice I set the initial recurrence to 10 minutes. I do that to monitor the function for a bit, but would change that to something more reasonable later on.

Again, unless Amazon changes something with their lambda billing, this setup will run for free forever:

The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month. The memory size you choose for your Lambda functions determines how long they can run in the free tier. The Lambda free tier does not automatically expire at the end of your 12 month AWS Free Tier term, but is available to both existing and new AWS customers indefinitely.

Finishing Up

There is one final thing I want to address. Some of you are likely screaming at the screen, saying ‘you can save JSON directly into PostgreSQL!’ Yes, that’s right you can! How awesome is that? My client had a need to keep the data relational with foreign key relationships, etc. But if you want to save the JSON response directly into PostgreSQL you can change the model to this:

Then we can save the JSON in our save_items() function within app.py with this:

You can now query the JSON stored in the database directly.

I hope this tutorial was useful! Please provide feedback on anything I can improve in the comments below.

Write a Comment

Comment

Webmentions

  • Import Python: Import Python Weekly – debugging, machine learning, data science, testing ,docker ,locust and more | Adrian Tudor Web Designer and Programmer

    […] Save API Results to PostgreSQL for Free with AWS Lambda In this tutorial I will show you how to use Amazon Web Services (AWS) Lambda service to save the results of an API response to a PostgreSQL database on a recurring schedule. […]