I would appreciate help from anyone familiar with how DynamoDB works. I need to perform a scan on a large DynamoDB table. I know that the DynamoDBClient scan operation is limited to 1 MB of returned data per call. Does the same restriction apply to the Table.scan operation? Table.scan returns output of type "ItemCollection<ScanOutcome>", while the DynamoDBClient scan returns a ScanResult, and it is not clear to me whether the two operations work in a similar way.

I have checked this example: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ScanJavaDocumentAPI.html, but it doesn't contain any hints about using the last returned key.

My questions are: If I use Table.scan, do I still need to make scan calls in a loop until the last returned key is null? If yes, how do I get the last key? If not, how can I enforce pagination? Any links to code examples would be appreciated. I have spent some time googling for examples, but most of them use either DynamoDBClient or DynamoDBMapper, while I need to use Table and Index objects instead.

Thanks!

  • You said you have a very large table, but you are looking for something in particular (or a set of items), so you can start by filtering your results (which is obvious, I guess). If that is not enough: yes, you have to keep searching in the next batch(es).
    – user4695271
    Commented Sep 6, 2016 at 10:55
  • I am not sure I understood your comment. I do have a filter expression that filters my scan results, but that doesn't guarantee that my results will never exceed 1 MB.
    Commented Sep 6, 2016 at 10:59
  • So, you need to scan the next batch; you can do it in parallel by "playing" with Segment and/or TotalSegments; in that case the value of LastEvaluatedKey returned from the request must be used as the ExclusiveStartKey with the same segment ID in a subsequent scan operation. It's pretty much like SQL, but faster!
    – user4695271
    Commented Sep 6, 2016 at 11:06
  • There is no "LastEvaluatedKey" parameter in the Table.scan output type.
    Commented Sep 6, 2016 at 11:08
  • Why wouldn't pages() work for you? docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/…
    – kuhajeyan
    Commented Sep 6, 2016 at 12:52

1 Answer

If you iterate over the output of Table.scan(), the SDK will do pagination for you.
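
To make "the SDK will do pagination for you" more concrete, here is a minimal, self-contained sketch of the idea. It does not use the AWS SDK at all: `FakeTable`, `Page`, `scanPage` and `scan` are hypothetical names invented for illustration, and the integer "key" merely stands in for LastEvaluatedKey / ExclusiveStartKey. The point is only to show how an iterable can hide the "request the next page until the last key is null" loop from the caller, which is roughly what iterating over the result of Table.scan() appears to do.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy model of a paginated scan. None of these types come from the AWS SDK. */
class PagedScanSketch {

    /** One page of results plus the key to resume from (null when the scan is finished). */
    static final class Page {
        final List<String> items;
        final Integer lastEvaluatedKey;

        Page(List<String> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Hypothetical stand-in for the remote table: returns at most pageSize rows per request. */
    static final class FakeTable {
        private final List<String> rows;
        private final int pageSize;
        int requestCount = 0; // how many "network calls" were made

        FakeTable(List<String> rows, int pageSize) {
            this.rows = rows;
            this.pageSize = pageSize;
        }

        /** Plays the role of one low-level scan call with an exclusive start key. */
        Page scanPage(Integer exclusiveStartKey) {
            requestCount++;
            int from = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
            int to = Math.min(from + pageSize, rows.size());
            // A non-null key means "there may be more"; null means the scan is complete.
            Integer last = (to < rows.size()) ? Integer.valueOf(to - 1) : null;
            return new Page(new ArrayList<>(rows.subList(from, to)), last);
        }
    }

    /** Returns an Iterable that fetches the next page lazily, only when the current one runs out. */
    static Iterable<String> scan(FakeTable table) {
        return () -> new Iterator<String>() {
            private Iterator<String> current = null; // items of the page being consumed
            private Integer nextKey = null;          // start key for the next request

            @Override
            public boolean hasNext() {
                while (current == null || !current.hasNext()) {
                    if (current != null && nextKey == null) {
                        return false; // last page consumed and no key to continue from
                    }
                    Page page = table.scanPage(nextKey);
                    current = page.items.iterator();
                    nextKey = page.lastEvaluatedKey;
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            rows.add("item-" + i);
        }
        FakeTable table = new FakeTable(rows, 3);

        int seen = 0;
        for (String item : scan(table)) {
            seen++; // the caller never touches a start key
        }
        System.out.println(seen + " items in " + table.requestCount + " requests");
        // prints "7 items in 3 requests"
    }
}
```

The caller's loop sees all seven items even though each request returns at most three, which mirrors the question: the per-request size limit still exists, but the loop over the last key lives inside the iterator. If explicit page boundaries are needed, the pages() method mentioned in the comments is the place to look.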
