I am experiencing a very strange issue with Keras ImageDataGenerator.

I have a landscape image with shape (1056, 2000, 3).

  • Height (Axis 0): 1056

  • Width (Axis 1): 2000

When I use plt.imshow(), the axes are displayed correctly: the y-axis goes up to 1056 and the x-axis goes up to 2000. My Keras image_data_format is set to "channels_last".

The Problem: When I set width_shift_range=0.4, the image shifts vertically (up and down) instead of horizontally. Conversely, when I set height_shift_range=0.4, the image shifts horizontally (left and right).

It seems like Keras is incorrectly mapping width to Axis 0 and height to Axis 1, even though the input array follows the standard (H, W, C) format and imshow renders it correctly.

Here is my code snippet:

import cv2
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Image shape is (1056, 2000, 3)
img = cv2.imread('my_image.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

def show_aug_image(img, generator, n_img=4):
    img_batch = np.expand_dims(img, axis=0)
    
    generator.fit(img_batch)
    gen_iter = generator.flow(img_batch)
    
    fig, axes = plt.subplots(1, n_img, figsize=(24, 8))
    
    for i in range(n_img):
        aug_batch = next(gen_iter)
        print(aug_batch.shape)
        aug = np.squeeze(aug_batch)
        print(aug.shape)
        aug = aug.astype('int')
        axes[i].imshow(aug)
        
    print(img_batch.shape)

# This shifts the image UP/DOWN (Vertically)
gen = ImageDataGenerator(width_shift_range=0.4, fill_mode='constant', cval=0)
show_aug_image(img, gen)

# This shifts the image LEFT/RIGHT (Horizontally)
# gen = ImageDataGenerator(height_shift_range=0.4, fill_mode='constant', cval=0)
# show_aug_image(img, gen)

Output:

(1, 1056, 2000, 3)
(1056, 2000, 3)
(1, 1056, 2000, 3)
(1056, 2000, 3)
(1, 1056, 2000, 3)
(1056, 2000, 3)
(1, 1056, 2000, 3)
(1056, 2000, 3)
(1, 1056, 2000, 3)

Environment:

tensorflow version : 2.20.0
keras version : 3.13.2
python version : 3.12.9
OS : Windows 11

I have already checked tf.keras.backend.image_data_format() and it is confirmed as 'channels_last'. Why is ImageDataGenerator swapping the behavior of width and height shifts? Is this a known bug for wide aspect ratio images?

  • 0.4 is treated as a fraction of the dimension, so on a 2000 px width that is a random shift of up to ~800 px. Can you try a fixed pixel shift instead, such as width_shift_range=[100]? That pins the magnitude at 100 px (only the sign stays random) and makes the direction easier to see. Commented Feb 14 at 12:23
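
One way to make that check objective rather than visual is to shift a synthetic marker image and measure where the marker ends up. The sketch below is only an illustration: make_marker_image, marker_centre and measure_shift are hypothetical helper names, and np.roll stands in for the generator so the snippet runs without Keras.

```python
import numpy as np

def make_marker_image(height=1056, width=2000, size=20):
    """Black (H, W, 3) image with one white square at the centre (hypothetical test input)."""
    img = np.zeros((height, width, 3), dtype=np.float32)
    cy, cx = height // 2, width // 2
    img[cy - size:cy + size, cx - size:cx + size] = 255.0
    return img

def marker_centre(img, threshold=127.0):
    """Return the (row, col) centroid of the pixels brighter than `threshold`."""
    rows, cols = np.nonzero(img[..., 0] > threshold)
    if rows.size == 0:
        raise ValueError("marker was shifted out of the frame")
    return rows.mean(), cols.mean()

def measure_shift(before, after):
    """Marker displacement as (d_rows, d_cols); a non-zero d_cols means a horizontal move."""
    r0, c0 = marker_centre(before)
    r1, c1 = marker_centre(after)
    return float(r1 - r0), float(c1 - c0)

# Self-check with a known horizontal shift of 100 px (np.roll replaces the generator here).
before = make_marker_image()
after = np.roll(before, 100, axis=1)
print(measure_shift(before, after))  # (0.0, 100.0)
```

To test the generator itself, replace the np.roll line with one augmented sample, for example next(gen.flow(before[np.newaxis], batch_size=1))[0], using a generator built with width_shift_range=[100], fill_mode='constant', cval=0. If the first component of the result is the non-zero one, the shift was applied along the rows (vertically).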

1 Answer

This is a known bug in the deprecated ImageDataGenerator, which swaps the x and y axes in its internal transformation matrix. You can avoid it by dropping the legacy generator and using the RandomTranslation preprocessing layer instead:

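from tensorflow.keras.layers import RandomTranslation

# img_batch is the (1, 1056, 2000, 3) batch built in the question.
# width_factor=0.4 allows horizontal shifts of up to 40% of the width;
# height_factor=0.0 disables vertical shifts.
translation_layer = RandomTranslation(height_factor=0.0, width_factor=0.4, fill_mode='constant')
aug_batch = translation_layer(img_batch, training=True)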
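
For reference, the behavior expected from a width shift on an (H, W, C) array can be written directly in NumPy. This is a minimal sketch, not the Keras implementation: translate and random_width_shift are hypothetical helpers, they move whole pixels only (no interpolation), and they fill the uncovered area with a constant. Comparing its output with either Keras API shows which axis is actually being moved.

```python
import numpy as np

def translate(img, d_rows, d_cols, cval=0):
    """Shift an (H, W, C) array by whole pixels, filling uncovered areas with `cval`."""
    h, w = img.shape[:2]
    out = np.full_like(img, cval)
    # A shift at least as large as the image leaves nothing visible.
    if abs(d_rows) >= h or abs(d_cols) >= w:
        return out
    # Source and destination windows that stay inside the frame after the shift.
    src_r = slice(max(0, -d_rows), min(h, h - d_rows))
    dst_r = slice(max(0, d_rows), min(h, h + d_rows))
    src_c = slice(max(0, -d_cols), min(w, w - d_cols))
    dst_c = slice(max(0, d_cols), min(w, w + d_cols))
    out[dst_r, dst_c] = img[src_r, src_c]
    return out

def random_width_shift(img, fraction, rng):
    """Shift along axis 1 (horizontally) by up to `fraction` of the image width."""
    max_px = int(fraction * img.shape[1])  # e.g. 0.4 * 2000 -> 800 px
    d_cols = int(rng.integers(-max_px, max_px + 1))
    return translate(img, 0, d_cols), d_cols

# Example on a dummy array with the question's shape (stand-in for the real image).
rng = np.random.default_rng(0)
dummy = np.ones((1056, 2000, 3), dtype=np.uint8)
shifted, d_cols = random_width_shift(dummy, 0.4, rng)
print(shifted.shape, d_cols)
```

A positive d_cols moves the content to the right and leaves a zero-filled band on the left; rows are untouched, which is what width_shift_range is expected to do for channels_last input.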