
I'm trying to do the TensorFlow equivalent of torchvision.transforms.Resize(TRAIN_IMAGE_SIZE), which resizes the smallest image dimension to TRAIN_IMAGE_SIZE. Something like this:

def transforms(filename):
  parts = tf.strings.split(filename, '/')
  label = parts[-2]

  image = tf.io.read_file(filename)
  image = tf.image.decode_jpeg(image)
  image = tf.image.convert_image_dtype(image, tf.float32)

  # fails inside Dataset.map(): there the static image.shape is (None, None, 3)
  image = largest_sq_crop(image)

  image = tf.image.resize(image, (256,256))
  return image, label

list_ds = tf.data.Dataset.list_files('{}/*/*'.format(DATASET_PATH))
images_ds = list_ds.map(transforms).batch(4)

The simple answer is here: Tensorflow: Crop largest central square region of image

But when I use that method inside tf.data.Dataset.map(transforms), I get shape=(None, None, 3) from inside largest_sq_crop(image). The method works fine when I call it directly.
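For reference, here is a minimal sketch of what such a `largest_sq_crop` might look like (a hypothetical reconstruction, since the question doesn't include the original definition): it unpacks the static `image.shape`, which is exactly what breaks under `Dataset.map()`, where those dimensions are `None`.

```python
import tensorflow as tf

# Hypothetical sketch (not the asker's actual code): a center crop that
# relies on the STATIC image.shape. This works eagerly, but inside
# Dataset.map() the static shape is (None, None, 3), so min(h, w) and the
# offset arithmetic fail.
def largest_sq_crop(image):
    h, w, c = image.shape              # static shape: ints eagerly, None in graph mode
    side = min(h, w)                   # breaks when h or w is None
    offset_h = (h - side) // 2
    offset_w = (w - side) // 2
    return tf.image.crop_to_bounding_box(image, offset_h, offset_w, side, side)
```

Called eagerly on a concrete tensor this behaves as expected; it is only under `Dataset.map()` that the static dimensions become `None`.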

Comments:
  • I believe the problem has to do with the fact that EagerTensors are not available within Dataset.map(), so the shape is unknown. Is there a workaround? Commented Feb 19, 2020 at 3:18
  • Can you include the definition of largest_sq_crop? Commented Feb 25, 2020 at 4:55

1 Answer


I found the answer. My resize method worked fine under eager execution (tf.executing_eagerly() == True) but failed when used within dataset.map(), which traces the function in graph mode, where tf.executing_eagerly() == False.

My error was in how I unpacked the image shape to get dimensions for scaling. Under graph execution, the static tensor.shape tuple reports None for any dimension not known until runtime.

  # wrong: static shape, unknown inside Dataset.map()
  b, h, w, c = img.shape
  print("ERR> ", h, w, c)
  # ERR>  None None 3

  # also wrong: indexing the static shape gives the same Nones
  b = img.shape[0]
  h = img.shape[1]
  w = img.shape[2]
  c = img.shape[3]
  print("ERR> ", h, w, c)
  # ERR>  None None 3

  # but this works: tf.shape() returns the dynamic shape as a tensor
  shape = tf.shape(img)
  b = shape[0]
  h = shape[1]
  w = shape[2]
  c = shape[3]
  img = tf.reshape(img, (-1, h, w, c))
  print("OK> ", h, w, c)
  # OK>  Tensor("strided_slice_2:0", shape=(), dtype=int32) Tensor("strided_slice_3:0", shape=(), dtype=int32) Tensor("strided_slice_4:0", shape=(), dtype=int32)

I was using those shape dimensions downstream in my dataset.map() function, and it threw the following exception because it was getting None instead of a value:

TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (-1, None, None, 3). Consider casting elements to a supported type.

When I switched to unpacking the dynamic shape from tf.shape() instead, everything worked.
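Putting the pieces together, a graph-safe `largest_sq_crop` (a sketch based on the fix above, not necessarily the asker's exact implementation) uses `tf.shape()` so the crop arithmetic operates on concrete scalar tensors even inside `Dataset.map()`:

```python
import tensorflow as tf

def largest_sq_crop(image):
    """Center-crop the largest square from an unbatched (h, w, c) image.

    Uses tf.shape() for the DYNAMIC shape, so h and w are concrete int32
    scalar tensors even when the static shape is (None, None, 3) inside
    Dataset.map().
    """
    shape = tf.shape(image)            # dynamic shape: int32 tensor [h, w, c]
    h, w = shape[0], shape[1]
    side = tf.minimum(h, w)            # tensor-aware min, unlike Python's min()
    offset_h = (h - side) // 2
    offset_w = (w - side) // 2
    return tf.image.crop_to_bounding_box(image, offset_h, offset_w, side, side)
```

With this version, the `transforms()` function from the question can be mapped over the dataset without the `None` dimensions causing an error.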


1 Comment

Think this only works with a batch size of 1? ValueError: slice index 3 of dimension 0 out of bounds. for '{{node strided_slice_3}} = StridedSlice[Index=DT_INT32, T=DT_INT32, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1](Shape, strided_slice_3/stack, strided_slice_3/stack_1, strided_slice_3/stack_2)' with input shapes: [3], [1], [1], [1] and with computed input tensors: input[1] = <3>, input[2] = <4>, input[3] = <1>
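The error in that comment is expected for unbatched images: `tf.shape()` of an (h, w, c) tensor has only three entries, so indexing `shape[3]` is out of bounds. One small workaround (my own sketch, not from the answer) is to index from the end, so the same code handles both batched and unbatched tensors:

```python
import tensorflow as tf

def image_hw(image):
    # Negative indices count from the end of the shape vector, so this
    # works for both (h, w, c) and (b, h, w, c) tensors.
    shape = tf.shape(image)
    return shape[-3], shape[-2]
```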
