Problem
I have the following problem:
I want to use PyTorch's DataLoader (in a similar way to the example here), but my setup differs a bit:
In my data folder I have images (let's call them `image_total`) of different street situations, and I want to use cropped images (called `image_crop_[idx]`) around persons who are close enough to the camera. So some images give me one or more cropped images, while others give me none because they do not show any person or the persons are too far away.
As I have a lot of images, I want to make the implementation as efficient as possible.
My hope is that it is possible to use something like this:
I want to load the `image_total` and check whether it contains useful crops. If so, I extract the cropped images and get a list like `[image_crop_0, image_crop_1, image_crop_2, ...]`.
Now my question: Is it possible to make this compatible with PyTorch's DataLoader? The problem I see is that the `__getitem__` method of my class would return anywhere from zero to an arbitrary number of instances, whereas I want to use a constant batch size for training.
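To make the idea concrete, here is a minimal sketch of what I have in mind, assuming an iterable-style dataset (`torch.utils.data.IterableDataset`) is acceptable instead of a map-style one with `__getitem__`. The names `extract_crops` and `CropStream` are hypothetical, and the "images" are just integers standing in for real files; the real person detection and cropping would go where the placeholder is.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset


def extract_crops(image_total):
    """Hypothetical stand-in for the real person detection and cropping.

    Returns a list with zero or more crops of identical shape. Here the
    "image" is an integer n and we pretend it contains n % 3 usable persons.
    """
    num_persons = image_total % 3
    return [torch.full((3, 8, 8), float(image_total)) for _ in range(num_persons)]


class CropStream(IterableDataset):
    """Yields one crop at a time; images without crops contribute nothing."""

    def __init__(self, images):
        self.images = images

    def __iter__(self):
        for image_total in self.images:
            # Flatten the per-image list of crops into a single stream.
            yield from extract_crops(image_total)


images = list(range(10))  # placeholder for the real image files
# The DataLoader groups the streamed crops into batches of 4;
# drop_last=True discards a final, smaller batch.
loader = DataLoader(CropStream(images), batch_size=4, drop_last=True)
batch_shapes = [tuple(batch.shape) for batch in loader]
```

In this sketch the crops must all have the same shape so that the default collate function can stack them. As far as I understand, with `num_workers > 0` every worker would iterate over the full dataset unless the images are split per worker (for example via `torch.utils.data.get_worker_info()`), so that part is left out here.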
Considerations
- maybe DataLoader supports this (and I did not find it)
- I have to work with a buffer or something similar
- the fallback would be to preprocess the data, but this would not be the most efficient solution
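The buffer idea from the list above could look roughly like the following plain-Python sketch. It does not use any PyTorch API; `fixed_size_batches` is a hypothetical helper name, and the strings stand in for cropped images. Leftover crops that do not fill a final batch are simply dropped here, which is an assumption and may not be what is wanted for training.

```python
from collections import deque


def fixed_size_batches(crop_lists, batch_size):
    """Turn a stream of variable-length crop lists into fixed-size batches.

    crop_lists: iterable of lists, one list (possibly empty) per image_total.
    Yields lists of exactly batch_size crops; leftovers at the end are dropped.
    """
    buffer = deque()
    for crops in crop_lists:
        buffer.extend(crops)  # an empty list adds nothing to the buffer
        while len(buffer) >= batch_size:
            yield [buffer.popleft() for _ in range(batch_size)]


# Placeholder crops: each inner list is the result for one image_total.
crop_lists = [["a0", "a1"], [], ["c0"], ["d0", "d1", "d2"], ["e0"]]
batches = list(fixed_size_batches(crop_lists, batch_size=3))
```

With this input, the crops from the first and third images are combined into one batch, the fourth image fills a second batch on its own, and the single crop from the last image is left over.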