
I have a model whose input (a batch of images with shape (height, width, time)) has a dynamically sized dimension (time) that is only determined at runtime. However, the Dense layer requires fully defined input dimensions. Code snippet example:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input

# Define an input with an undefined dimension (None)
input_tensor = Input(shape=(None, 256, 256, None, 13))

# Apply a Dense layer (which expects a fully defined shape)
x = Flatten()(input_tensor)
x = Dense(10)(x)

# Build the model
model = tf.keras.models.Model(inputs=input_tensor, outputs=x)

model.summary()

This raises the error:

ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None.

How can I make it work using Flatten instead of alternatives like GlobalAveragePooling3D? Essentially, I’m looking for a way to create a 1D array with the original pixel values, but compatible with the Dense layer.

  • How do you expect that to work? Assuming the Flatten layer would tolerate dynamically sized input, its output would have a dynamic size too. What's going to "consume" that? Commented Dec 14, 2024 at 12:15
  • This has nothing to do with Flatten() but with the Dense layer after it, which needs a fixed input size. So you have to get rid of the dynamic size, plain and simple. Also, you are clearly getting rid of pixel-level information anyway by flattening and applying a Dense layer, so that doesn't seem so important after all...
    – xdurch0
    Commented Dec 14, 2024 at 18:11
  • @xdurch0 Thank you for clarifying! I've revised the question to make it clearer. Essentially, I'm looking for an alternative to Flatten() to create a 1D array with the original pixel values. Using GlobalAveragePooling3D() works, but does not preserve the original pixel values. However, since I'm working on a classification problem, it might be a good alternative.
    – lbrandao
    Commented Dec 17, 2024 at 10:57
  • I found this answer helpful- stats.stackexchange.com/questions/388859/… - even though it’s not exactly what I was looking for.
    – lbrandao
    Commented Dec 17, 2024 at 11:04
  • I don't really understand what you're trying to do, but if you want to use Flatten and your input dimensions are in this order, you can create the placeholder as batch x height x width x max_time x channel and sample your training data accordingly. For the input data: 1. create an array of zeros with this shape; 2. create copies by filling the array of zeros with the input data, shifting the time dimension by one. Commented Dec 17, 2024 at 11:04
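The zero-padding approach from the last comment can be sketched as follows. This is a minimal illustration, not code from the question: the helper name `pad_time_axis`, the `max_time` value, and the small sample array are all hypothetical, chosen to match the question's (height, width, time, channels) layout with 13 channels.

```python
import numpy as np

def pad_time_axis(sample, max_time):
    """Zero-pad a (height, width, time, channels) array to a fixed time length."""
    height, width, time, channels = sample.shape
    padded = np.zeros((height, width, max_time, channels), dtype=sample.dtype)
    padded[:, :, :time, :] = sample  # copy the real frames; the rest stay zero
    return padded

# A hypothetical sample with 5 time steps, padded out to a fixed 8
sample = np.ones((256, 256, 5, 13), dtype=np.float32)
padded = pad_time_axis(sample, max_time=8)
print(padded.shape)  # (256, 256, 8, 13)
```

With every sample padded to the same `max_time`, the time dimension becomes fixed and Flatten followed by Dense builds without error, at the cost of the problems the answer below describes.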

1 Answer


This is just not possible, because a Dense layer has a fixed number of weights. When you call a Dense layer after flattening, it is effectively computing

w_0 * x_0 + w_1 * x_1 + w_2 * x_2 + ... + w_{n-1} * x_{n-1} + bias, where the w's are the weights and the x's are the flattened input feature values.

So if, due to your unknown dimension, n can't be known ahead of time, then the network simply cannot be configured with the appropriate number of weights.
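You can see the dependence of the weight count on the flattened size with a small fully defined example (the shapes here are arbitrary, chosen just to make the arithmetic easy to check):

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input

# With every dimension fixed, Flatten produces 4 * 4 * 3 = 48 features,
# so Dense(10) allocates 48 * 10 weights + 10 biases = 490 parameters.
inp = Input(shape=(4, 4, 3))
x = Flatten()(inp)
out = Dense(10)(x)
model = tf.keras.models.Model(inputs=inp, outputs=out)
print(model.count_params())  # 490
```

If any of those input dimensions were None, the 48 would be unknown and Keras could not allocate the weight matrix, which is exactly the ValueError in the question.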

Even if you knew the "max time" and wanted to preallocate enough weights in the network to support it, that approach would likely suffer from two problems:

  1. the network gets too large
  2. the network overfits, because it treats each pixel in each time step as a completely separate feature, which bloats the dimensionality with no real benefit

So the alternatives for capturing the time axis would be to either use a time-series architecture such as an LSTM or another recurrent neural network, or a 3D convolutional network that pools across time.
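The pooling route can be sketched as a minimal working version of the question's model, assuming the extra leading None in the original Input shape was unintended and the layout is (height, width, time, channels). Pooling collapses the dynamic time axis into a fixed-size vector, so Dense can follow:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, GlobalAveragePooling3D, Input

# Time dimension left as None; pooling over (height, width, time) yields
# a fully defined (batch, 13) tensor, so Dense(10) can allocate its weights.
inp = Input(shape=(256, 256, None, 13))
x = GlobalAveragePooling3D()(inp)
out = Dense(10)(x)
model = tf.keras.models.Model(inputs=inp, outputs=out)

model.summary()
```

This builds without the ValueError and accepts batches with any number of time steps, at the cost of averaging away per-pixel values, as discussed in the comments.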
