
I plan to fetch a list of records from a web service which limits the number of requests that can be made within a certain time frame.

My idea was to set up a simple pipeline like this:

List of URLs -> Lambda Function to fetch JSON -> S3

The part I'm not sure about is how to feed in the list of URLs in rate/time-limited blocks, e.g. take 5 URLs and spawn 5 Lambda functions every second.

Ideally I'd like to start this by uploading/sending/queueing the list once and then just let it do its thing on its own until it has processed the queue completely.
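To make the pacing concrete, here's a minimal local sketch of the behaviour I'm after; the batch size and the `invoke` callback are placeholders for whatever actually kicks off the Lambda calls:

```python
import itertools
import time

def batched(urls, size):
    """Yield successive batches of at most `size` URLs."""
    it = iter(urls)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

def dispatch_all(urls, per_second=5, invoke=print):
    """Fire `invoke` on `per_second` URLs, wait a second, repeat
    until the list is exhausted (local stand-in for the fan-out)."""
    for batch in batched(urls, per_second):
        for url in batch:
            invoke(url)
        time.sleep(1)
```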

1 Answer


Split the problem into two parts:

  1. Trigger: Lambda supports a wide variety of event sources. Look for "Using AWS Lambda to process AWS events" in the Lambda FAQs.

I personally would go with DynamoDB, but S3 comes in a close second.

There might be other options using other streams like Kinesis, but these seem simpler by far.
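Seeding the queue is then a one-shot write; something like the following sketch, where the table name and the `url`-as-partition-key layout are my assumptions:

```python
def to_items(urls):
    """One DynamoDB item per URL; `url` doubles as the partition key."""
    return [{"url": u} for u in urls]

def seed_table(urls, table_name="url-queue"):  # table name is hypothetical
    """Dump all URLs into the table in one go; with a stream enabled,
    each insert becomes an event that triggers one Lambda invocation."""
    import boto3  # deferred import so the helpers above stay importable
    table = boto3.resource("dynamodb").Table(table_name)
    with table.batch_writer() as batch:  # handles 25-item write batching
        for item in to_items(urls):
            batch.put_item(Item=item)
```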

  2. Throttling: You can cap the number of concurrent Lambda instances via the function's concurrency settings (reserved concurrency).
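Setting that cap is a one-line configuration call; function name and limit here are placeholders:

```python
def cap_concurrency(function_name="fetch-url", limit=5):
    """Reserve at most `limit` concurrent executions for the fetcher."""
    import boto3  # deferred so the snippet stays importable without boto3
    client = boto3.client("lambda")
    client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
```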

So e.g. if you go with DDB:

  • You'll dump all your URLs into a table, one row per URL.
  • With a stream enabled on the table, each insert creates an event, one per row.
  • Each event triggers one Lambda call.
  • The number of parallel Lambda executions/instances is limited by configuration.
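The Lambda side of the steps above could look something like this sketch; the bucket name and key scheme are placeholders, and the stream record shape is DynamoDB's typed JSON:

```python
import urllib.request

def url_from_record(record):
    """Pull the URL out of a DynamoDB stream record (INSERT events
    carry the new row under NewImage in DynamoDB's typed JSON)."""
    return record["dynamodb"]["NewImage"]["url"]["S"]

def handler(event, context):
    """Lambda entry point: fetch each newly inserted URL and store
    the response body in S3 (bucket and key scheme are hypothetical)."""
    import boto3  # available by default in the Lambda runtime
    s3 = boto3.client("s3")
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        url = url_from_record(record)
        with urllib.request.urlopen(url) as resp:
            body = resp.read()
        s3.put_object(Bucket="fetched-json", Key=url.replace("/", "_"), Body=body)
```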
