
I am new to machine learning and am working in a Kaggle notebook. I am building a multi-class image classification model. I used EfficientNet for the architecture, but the same issue occurs with every other model I've tried. The images to be classified are split into train and val folders in the dataset, and inside each of those they are stored in a subfolder named after their class.

The code runs fine until fit_generator, which raises "ValueError: could not broadcast input array from shape (224,224,3) into shape (224,224,3,3)".

I have attached the full code, the dataset and an image of the error message.

I have no idea whether the problem is in the code or in the data. Please help. Thank you for reading this question, and I apologize if any context is missing.

#!pip install -U efficientnet
import pandas as pd
import numpy as np
import efficientnet.tfkeras as efn  # Convolutional Neural Network architecture
import IPython.display as ipd
import librosa.display
import matplotlib.pyplot as plt
from efficientnet.keras import preprocess_input
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from sklearn.utils import class_weight
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import os

IM_SIZE = (224, 224, 3)
train=pd.read_csv("../input/birdclef-2022/train_metadata.csv")
BIRDS = os.listdir("../input/mel-split-mark17/mel_spectrogram/train")
BATCH_SIZE = 16
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.1,
    fill_mode="nearest",
)
train_batches = train_datagen.flow_from_directory(
    "../input/mel-split-mark17/mel_spectrogram/train",
    classes=BIRDS,
    target_size=IM_SIZE,
    class_mode="categorical",
    shuffle=True,
    batch_size=BATCH_SIZE,
)

valid_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
valid_batches = valid_datagen.flow_from_directory(
    "../input/mel-split-mark17/mel_spectrogram/val",
    classes=BIRDS,
    target_size=IM_SIZE,
    class_mode="categorical",
    shuffle=False,
    batch_size=BATCH_SIZE,
)
# Define CNN's architecture
net = efn.EfficientNetB3(
    include_top=False, weights="imagenet", input_tensor=None, input_shape=IM_SIZE
)
x = net.output
x = Flatten()(x)
x = Dropout(0.5)(x)
output_layer = Dense(len(BIRDS), activation="softmax", name="softmax")(x)
net_final = Model(inputs=net.input, outputs=output_layer)
net_final.compile(
    optimizer=Adam(), loss="categorical_crossentropy", metrics=["accuracy"]
)

print(net_final.summary())

# Estimate class weights for unbalanced dataset
class_weights = class_weight.compute_class_weight(
    class_weight = "balanced",
    classes= np.unique(train_batches.classes),
    y=train_batches.classes
)

# Define callbacks
ModelCheck = ModelCheckpoint(
    "models/efficientnet_checkpoint.h5",
    monitor="val_loss",
    verbose=0,
    save_best_only=True,
    save_weights_only=True,
    mode="auto",
    period=1,
)

ReduceLR = ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=5, min_lr=3e-4)
# Train the model
net_final.fit_generator(
    train_batches,
    validation_data=valid_batches,
    epochs=30,
    steps_per_epoch=1596,
    class_weight=class_weights,
    callbacks=[ModelCheck, ReduceLR],
)

I get this error when I run the code

https://www.kaggle.com/datasets/bluetriad/mel-split-mark17

  • This might work: in flow_from_directory, you shouldn't pass the channel dimension to target_size, meaning target_size = (224, 224). Commented May 14, 2022 at 4:06
  • @MojtabaAbdiKh. I tried that, but it now gives a different error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Commented May 14, 2022 at 17:37
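
A likely cause of that second error (an assumption, it is not confirmed anywhere in this thread) is the class_weight argument: compute_class_weight returns a NumPy array, while Keras expects a dict mapping class index to weight, and testing an array for truthiness raises exactly this message. A minimal sketch of the conversion, using a hypothetical list of weights in place of the real array:

```python
# Hypothetical stand-in for the array returned by compute_class_weight;
# the real values depend on the dataset.
class_weights = [0.5, 2.0, 1.0]

# Keras expects {class_index: weight}, so pair each weight with its position.
class_weight_dict = dict(enumerate(class_weights))

print(class_weight_dict)  # {0: 0.5, 1: 2.0, 2: 1.0}
```

The dict would then be passed as class_weight=class_weight_dict. This assumes every class index from 0 to len(BIRDS) - 1 appears in train_batches.classes, so that positions in the array line up with class indices.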

1 Answer


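I think the problem is in the flow_from_directory call. The target_size passed to that method should not include the image channels, only the height and width, i.e. (224, 224). Because the generator appends the channel dimension itself, passing (224, 224, 3) makes it allocate batches of shape (224, 224, 3, 3) per image, which matches the shape in the error message. You can specify that you are working with 3 channels by setting the additional parameter color_mode to "rgb".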
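
As a rough sketch of that change, the full input shape can be kept for the network while only the spatial part goes to the generator. The helper name split_image_shape is made up for illustration; the commented call reuses the names from the question and has not been run against the dataset.

```python
IM_SIZE = (224, 224, 3)  # full input shape, still used for EfficientNetB3


def split_image_shape(shape):
    """Split (height, width, channels) into a target_size and a color_mode."""
    height, width, channels = shape
    # color_mode values accepted by flow_from_directory
    color_modes = {1: "grayscale", 3: "rgb", 4: "rgba"}
    return (height, width), color_modes[channels]


target_size, color_mode = split_image_shape(IM_SIZE)
print(target_size, color_mode)  # (224, 224) rgb

# Intended use with the generator from the question (not executed here):
# train_batches = train_datagen.flow_from_directory(
#     "../input/mel-split-mark17/mel_spectrogram/train",
#     classes=BIRDS,
#     target_size=target_size,   # (224, 224), no channel dimension
#     color_mode=color_mode,     # "rgb" adds the 3 channels
#     class_mode="categorical",
#     shuffle=True,
#     batch_size=BATCH_SIZE,
# )
```

The same change would apply to valid_batches, while input_shape=IM_SIZE in the EfficientNetB3 call stays as it is.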
