• EC2: Generic Linux virtual machine. Easy to get started with, but can quickly become expensive. Local storage is $0.12 per GB-month.
  • S3: $0.023 per GB-month (about 5 times cheaper than EC2 local storage). Can be mounted on EC2 via s3fs-fuse (https://github.com/s3fs-fuse/s3fs-fuse).
  • Lambda: Cheap compute node. The first 1M requests per month are free, and the next 1M requests cost $0.20. Supports Python, and complex dependencies can be handled via SAM (Serverless Application Model). Can execute binary programs with some effort (discussed on this page). The code (or a zip package containing code and binaries) is hosted in S3, and Lambda creates a container with about 500 MB of temporary storage upon invocation. The first invocation is slow because the container has to be created, but subsequent invocations are quick. Lambda cannot mount S3, so a data file in S3 must be downloaded first (https://dluo.me/s3databoto3).
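
Because Lambda cannot mount S3, a handler has to copy any data file into the container's temporary storage before using it. The following is a minimal sketch of that step, assuming boto3 is available; the bucket and key names in the usage note are hypothetical. The client is passed in as a parameter so the function can be tried without AWS access.

```python
import os


def fetch_to_tmp(s3_client, bucket, key, tmp_dir="/tmp"):
    """Download s3://bucket/key into tmp_dir and return the local path.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    local_path = os.path.join(tmp_dir, os.path.basename(key))
    # download_file(Bucket, Key, Filename) is the boto3 S3 client call
    s3_client.download_file(bucket, key, local_path)
    return local_path
```

Inside a handler this could be called as fetch_to_tmp(boto3.client("s3"), "my-bucket", "data/input.bin"), where both names are placeholders.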


How to get started.

This excellent tutorial mostly works, but it doesn't go beyond running "hello world". 
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-quick-start.html


I had some trouble with the tutorial, which turned out to be related to my setup. You need config and credentials files under ~/.aws.

(py3env) [ec2-user@ip-172-31-14-155 sam-app]$ cat ~/.aws/config
[default]
region = ap-southeast-2

I created the credentials by following this tutorial, although it was not directly useful for what I was trying to achieve (Serverless is an alternative to SAM): https://serverless.com/framework/docs/providers/aws/guide/credentials/
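
A quick way to confirm that both files are in place is to read them from Python. This is only a sketch: it handles the [default] profile shown above and assumes the usual INI layout of the two files. The helper names are made up for illustration.

```python
import configparser
import os


def read_aws_region(aws_dir=None, profile="default"):
    """Return the region set for `profile` in <aws_dir>/config, or None."""
    aws_dir = aws_dir or os.path.expanduser("~/.aws")
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips a file that does not exist
    parser.read(os.path.join(aws_dir, "config"))
    if parser.has_section(profile):
        return parser.get(profile, "region", fallback=None)
    return None


def has_credentials(aws_dir=None, profile="default"):
    """Return True if <aws_dir>/credentials defines an access key for `profile`."""
    aws_dir = aws_dir or os.path.expanduser("~/.aws")
    parser = configparser.ConfigParser()
    parser.read(os.path.join(aws_dir, "credentials"))
    return parser.has_option(profile, "aws_access_key_id")
```

With the config file shown above, read_aws_region() would return "ap-southeast-2".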


Make sure the IAM user whose credentials you keep under ~/.aws has the AttachRolePolicy and DetachRolePolicy permissions.
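
For reference, a policy document granting these two permissions could look roughly like the sketch below. The "Resource": "*" entry is an assumption made for brevity and is not taken from my actual setup; a real policy would normally be scoped to specific role ARNs.

```python
import json

# Hypothetical inline policy granting the two IAM actions mentioned above.
# "Resource": "*" is an assumption for brevity, not a recommendation.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["iam:AttachRolePolicy", "iam:DetachRolePolicy"],
            "Resource": "*",
        }
    ],
}

# JSON text that can be pasted into the IAM policy editor
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```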


In this tutorial, I want to try a few things.

  1. Executing another Python module, "hello.py"
  2. Executing an external binary program "ls"



Executing external Python code

First, let's have lambda_handler call a function from hello.py. hello.py contains:

def main():
   return "hello from lambda"

This file should be in the same directory as app.py.

Now, let's modify app.py


(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ cat app.py

import json

import hello


def lambda_handler(event, context):

    out = hello.main()
    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": out,
        }),
    }

Place hello.py in the same directory as app.py.
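
Before packaging, the handler can be exercised locally by calling it like an ordinary function. The sketch below inlines a stand-in for hello.main() so that it is self-contained; with the real files you would import app and call app.lambda_handler instead.

```python
import json


def main():
    # stand-in for hello.main() so the sketch runs on its own
    return "hello from lambda"


def lambda_handler(event, context):
    out = main()
    return {"statusCode": 200, "body": json.dumps({"message": out})}


# Lambda passes an event dict and a context object; this handler uses neither,
# so an empty dict and None are enough for a local check.
response = lambda_handler({}, None)
print(response["statusCode"])        # 200
print(json.loads(response["body"]))  # {'message': 'hello from lambda'}
```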


(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ ls
app.py  hello.py  __init__.py  __pycache__  requirements.txt


Make a zip package.

(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ zip -r ../package.zip .
  adding: __pycache__/ (stored 0%)
  adding: __pycache__/__init__.cpython-37.pyc (deflated 25%)
  adding: __pycache__/app.cpython-37.pyc (deflated 48%)
  adding: __pycache__/hello.cpython-37.pyc (deflated 31%)
  adding: __init__.py (stored 0%)
  adding: requirements.txt (stored 0%)
  adding: hello.py (stored 0%)
  adding: app.py (deflated 52%)
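
The same package can be built from Python with the standard zipfile module, which is handy if the packaging step is to be scripted. This is a sketch only; unlike the zip command above it skips __pycache__, which is my choice here, and make_package is a made-up helper name.

```python
import os
import zipfile


def make_package(src_dir, zip_path, skip_dirs=("__pycache__",)):
    """Zip the contents of src_dir, storing paths relative to src_dir.

    zip_path should live outside src_dir, otherwise the archive would
    try to include itself.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(src_dir):
            # prune skipped directories in place so os.walk does not descend
            dirs[:] = [d for d in dirs if d not in skip_dirs]
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, src_dir)
                zf.write(full, arcname)
                names.append(arcname)
    return names
```

Called from the sam-app directory, make_package("hello_world", "package.zip") would produce an archive with app.py and hello.py at its top level, which is the layout Lambda expects.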


Now, move the package to the directory where the S3 bucket is mounted.

(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ mv ../package.zip ~/s3_quakecore/


Note

I have the following entry in /etc/rc.local

/usr/bin/s3fs quakecore -o use_cache=/tmp -o allow_other -o uid=1000 \
-o mp_umask=002 -o multireq_max=5 /home/ec2-user/s3_quakecore


Notice that the file is now stored in the S3 bucket.


We will find its object URL.
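
The object URL shown in the S3 console normally follows the virtual-hosted-style pattern, so it can also be built by hand. The sketch below assumes the quakecore bucket lives in ap-southeast-2 (the region from my config file) and that the bucket name contains no dots; s3_object_url is a made-up helper name.

```python
from urllib.parse import quote


def s3_object_url(bucket, key, region):
    """Build a virtual-hosted-style S3 object URL."""
    # quote() keeps "/" unescaped by default, so nested keys stay readable
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(s3_object_url("quakecore", "package.zip", "ap-southeast-2"))
```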


Go to the Lambda function page. Select "Upload a file from Amazon S3", enter the object URL, and click "Save" in the top corner.
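
The console step can probably be scripted as well. The sketch below uses the boto3 Lambda client's update_function_code call; I have not tested it in this setup, and the function name in the usage note is a placeholder, since SAM generates the real one.

```python
def update_code_from_s3(lambda_client, function_name, bucket, key):
    """Point an existing Lambda function at a zip package stored in S3.

    lambda_client is expected to be boto3.client("lambda").
    """
    return lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=bucket,
        S3Key=key,
    )
```

For example: update_code_from_s3(boto3.client("lambda"), "my-function-name", "quakecore", "package.zip").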


You should now see the files extracted from the zip package.




Using the sam package command may make all of these steps unnecessary. The following official documentation has some instructions, which I have not yet tested.

https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-deploying.html



Let's find the API endpoint.


You can either enter this URL in a web browser or call it with curl.

https://es6pawpb9f.execute-api.ap-southeast-2.amazonaws.com/Prod/hello


(py3env) [ec2-user@ip-172-31-14-155 qcore]$ curl https://es6pawpb9f.execute-api.ap-southeast-2.amazonaws.com/Prod/hello


{"message": "hello from lambda"}
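
The endpoint can also be called from Python using only the standard library. This is a minimal sketch that assumes the response body is the JSON object produced by the handler; parse_message and call_endpoint are made-up helper names.

```python
import json
import urllib.request


def parse_message(body):
    """Extract the "message" field from the JSON body returned by the handler."""
    return json.loads(body)["message"]


def call_endpoint(url, timeout=10):
    """GET the API Gateway URL and return the decoded message."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_message(resp.read().decode("utf-8"))
```

Passing the endpoint URL above to call_endpoint() should return the same message that curl prints.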


You may have to deploy first.


(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ cd ..
(py3env) [ec2-user@ip-172-31-14-155 sam-app]$ sam deploy --template-file packaged.yaml \
 --stack-name sam-app --capabilities CAPABILITY_IAM
Waiting for changeset to be created..
No changes to deploy. Stack sam-app is up to date


Executing a binary from Lambda

If you wish to execute a binary from Lambda, the binary should be statically linked or built on Amazon Linux. I'm using "ls" as a simple example.

Copy "ls" to the same directory as app.py

(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ ls
app.py  hello.py  __init__.py  ls  __pycache__  requirements.txt


Let's edit app.py. When the container is created, all the code is extracted into /var/task. Interestingly, you cannot run the binary directly from this directory, as it is a read-only file system. You have to copy the program to /tmp and give it +x permission.


(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ cat app.py
import json
import os
import shutil
from subprocess import Popen, PIPE

def lambda_handler(event, context):

    # Directory holding the unpacked package; falls back to this file's directory when run locally.
    LAMBDA_TASK_ROOT = os.environ.get('LAMBDA_TASK_ROOT', os.path.dirname(os.path.abspath(__file__)))
    # Copy to /tmp, as LAMBDA_TASK_ROOT is a read-only file system.
    EXE_LS = '/tmp/ls'
    shutil.copyfile(os.path.join(LAMBDA_TASK_ROOT, 'ls'), EXE_LS)
    os.chmod(EXE_LS, 0o755)
    # Run the copied binary and capture its output.
    p = Popen(EXE_LS, shell=True, stdout=PIPE, stderr=PIPE)
    out, err = p.communicate(None)
    out = out.decode("utf-8")  # convert bytes to str

    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": out,
        }),
    }
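
The handler above discards stderr and the exit status, which makes a failing binary hard to diagnose. The following is a variation on the same copy-then-run idea that returns all three; it is a sketch rather than what I deployed, and run_copied_binary is a made-up helper name.

```python
import os
import shutil
import stat
import subprocess
import tempfile


def run_copied_binary(src, args=(), work_dir=None):
    """Copy src into a writable directory, mark it executable, and run it.

    Returns (returncode, stdout, stderr) with both streams decoded as text.
    """
    work_dir = work_dir or tempfile.gettempdir()  # normally /tmp
    dst = os.path.join(work_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    # add the owner-execute bit to whatever mode the copy ended up with
    os.chmod(dst, os.stat(dst).st_mode | stat.S_IXUSR)
    # passing a list runs the binary directly, without going through a shell
    proc = subprocess.run([dst, *args], stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, proc.stdout, proc.stderr
```

In the handler, the call would be run_copied_binary(os.path.join(LAMBDA_TASK_ROOT, "ls")), and a non-zero return code or non-empty stderr could be included in the response body.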


Make a zip package, move to S3 bucket, and upload the zip file as before.

(py3env) [ec2-user@ip-172-31-14-155 hello_world]$ zip -r ../package.zip .
  adding: __pycache__/ (stored 0%)
  adding: __pycache__/__init__.cpython-37.pyc (deflated 25%)
  adding: __pycache__/app.cpython-37.pyc (deflated 48%)
  adding: __pycache__/hello.cpython-37.pyc (deflated 31%)
  adding: __init__.py (stored 0%)
  adding: requirements.txt (stored 0%)
  adding: hello.py (stored 0%)
  adding: ls (deflated 51%)
  adding: app.py (deflated 52%)


Lambda only gives about 500 MB of storage in /tmp. If your program needs more space, Lambda is not ideal.
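
A handler can check how much of /tmp is still free before copying or downloading anything large. The sketch below uses shutil.disk_usage from the standard library; the 10 MB safety margin is an arbitrary assumption, and the helper names are made up.

```python
import shutil


def free_mb(path="/tmp"):
    """Return the free space under `path` in megabytes."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def fits_in_tmp(required_mb, path="/tmp", margin_mb=10):
    """True if `required_mb` plus a safety margin is currently available."""
    return free_mb(path) >= required_mb + margin_mb
```

Since /tmp can persist between invocations of a warm container, files left over from earlier runs count against the limit, so checking the free space rather than assuming the full amount seems safer.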


Now execute this.

(py3env) [ec2-user@ip-172-31-14-155 qcore]$ curl https://es6pawpb9f.execute-api.ap-southeast-2.amazonaws.com/Prod/hello


{"message": "app.py\nhello.py\n__init__.py\nls\n__pycache__\nrequirements.txt\n"}