Backend

For the backend we are going to use FastAPI + Celery + PyTorch, hosted on AWS Lambda. One sure issue here is that AWS Lambda limits the uncompressed deployment package to 250 MB, which is nowhere near enough for our dependencies. So we are going to use EFS (Amazon Elastic File System), which gives 5 GB in the free tier, to host the dependencies instead. In theory we'll then be able to install the latest PyTorch (1.7.0) and all the other packages we need, and the file system can be shared among any Lambdas that require it.
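A minimal sketch of how a Lambda would pick up the EFS-hosted packages: prepend the mount directory to `sys.path` before importing anything heavy. The mount path `/mnt/efs/python` is an assumption and must match whatever mount point the Lambda is actually configured with.

```python
import os
import sys

def add_efs_packages(mount_path="/mnt/efs/python"):
    """Prepend the EFS-mounted packages dir to sys.path so imports like
    `import torch` resolve from EFS instead of the deployment package.

    NOTE: the default mount path is an assumption; it must match the
    Lambda's configured EFS local mount point. Returns True if added.
    """
    if os.path.isdir(mount_path) and mount_path not in sys.path:
        sys.path.insert(0, mount_path)
        return True
    return False
```

Called once at module import time (i.e. during Lambda cold start), this keeps the deployment package itself tiny.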

Another benefit is that, with the dependencies living on EFS, Lambda's ~512 MB of ephemeral /tmp storage stays free to temporarily store the models, which will be sent to the user over the socket connection; if that doesn't work out we might need to fall back to S3 buckets.

[28-12-20]

[10-01-21]

  • What I had planned worked!
  • I was able to mount the Python dependencies from EFS and successfully use PyTorch 1.7.0 with both of my Lambdas
  • Key thing to note is that I wasn't able to get Celery working, and instead created my own task processor using Lambda
  • So basically we push the dataset to S3, an event is fired that starts a Lambda, which trains the model, saves it to EFS, and updates the task status in DynamoDB
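The S3-triggered flow above can be sketched roughly like this. The EFS model directory, the table schema (`task_id` key, `status` attribute), and `train_fn` are all hypothetical names of mine; `s3` and `table` stand in for the boto3 S3 client and DynamoDB table resource, and are injected so the flow can be exercised without AWS:

```python
def parse_s3_event(event):
    """Pull (bucket, key) out of an S3 ObjectCreated event record."""
    rec = event["Records"][0]["s3"]
    return rec["bucket"]["name"], rec["object"]["key"]

def handle_training_task(event, s3, table, train_fn,
                         model_dir="/mnt/efs/models"):  # hypothetical EFS dir
    """Sketch of the custom task processor: download the dataset that
    triggered the event, mark the task RUNNING, train, save the model to
    EFS, then mark the task DONE in DynamoDB."""
    bucket, key = parse_s3_event(event)
    task_id = key.rsplit("/", 1)[-1]   # assumes the key ends with the task id
    dataset_path = f"/tmp/{task_id}"
    s3.download_file(bucket, key, dataset_path)

    def set_status(status):
        table.update_item(
            Key={"task_id": task_id},
            UpdateExpression="SET #s = :s",
            ExpressionAttributeNames={"#s": "status"},
            ExpressionAttributeValues={":s": status},
        )

    set_status("RUNNING")
    model_path = f"{model_dir}/{task_id}.pt"
    train_fn(dataset_path, model_path)  # trains and writes the model to EFS
    set_status("DONE")
    return model_path
```

A thin `lambda_handler(event, context)` would just build the real boto3 client and table and call `handle_training_task`.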

I wrote a blog on this: https://satyajitghana.medium.com/working-with-large-dependencies-500mb-with-aws-lambda-and-efs-amazon-elastic-file-system-137509e03c1a