
A Keras model learns when fed NumPy arrays, but makes no progress when it reads the same data from a tf.data pipeline. What could be the reason?

Specifically, the model consumes batched multidimensional time series (every sample is an N x M tensor) and solves a classification problem. If the data are prepared in advance by aggregating the time series into one large NumPy array, the model learns successfully, as shown by a significant increase in accuracy. However, when exactly the same input data are prepared with a tf.data pipeline, the accuracy stays at the baseline level.

I compared the two sets of data by writing them to disk, and they are identical. The types match as well.

I tried disabling threading (IIUC) by setting

options.threading.private_threadpool_size = 1

and experimented with several of the options.experimental_optimization settings.

Could it be that the data are read in parallel from the tf.data dataset, whereas they are read sequentially from the NumPy array?

For completeness, here's the pipeline, where np_array contains "raw" data:

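# One dataset element per row of np_array.T (one time step).
ds = tf.data.Dataset.from_tensor_slices(np_array.T)
# Labels: last entry of each row, taken from the final step of every window,
# shifted down by 1 and one-hot encoded into 3 classes.
y_ds = (
    ds
    .skip(T - 1)
    .map(lambda s: s[-1] - 1)
    .map(lambda y: to_categorical(y, 3))
)
# Features: sliding windows of T consecutive steps (stride 1),
# each with a trailing axis of size 1 added.
X_ds = (
    ds
    .map(lambda s: s[:n_features])
    .window(T, shift=1, drop_remainder=True)
    .flat_map(lambda x: x.batch(T, drop_remainder=True))
    .map(lambda x: tf.expand_dims(x, -1))
)
# Pair windows with labels, then batch, repeat and prefetch (no shuffling).
Xy_ds = (
    tf.data.Dataset.zip(X_ds, y_ds)
    .batch(size_batch)
    .repeat(n_epochs * size_batch)
    .prefetch(tf.data.AUTOTUNE)
)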

and here is how fit() is called (the steps_per_epoch value is correct):

model.fit(
    Xy_train,
    epochs=n_epochs,
    steps_per_epoch=199,
    verbose=2
)

1 Answer


The problem was that fit() implicitly shuffles the samples when given NumPy arrays (shuffle=True by default), whereas the tf.data pipeline performs no shuffling, and fit() ignores its shuffle argument for dataset inputs. Shuffling is appropriate here even though this is time series data: the samples were already aggregated into windows, so reordering the windows leaves the temporal structure inside each one intact.
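A minimal sketch of one way to add the missing shuffling to a tf.data pipeline, assuming windows and labels are built in the same way as in the question. The helper name make_windowed_dataset, its parameters, and the shuffle_buffer size are illustrative and not taken from the original code. The point it shows is that shuffle() is applied after the (window, label) pairs are formed and before batch(), so whole windows are reordered while the T steps inside each window keep their order:

```python
import numpy as np
import tensorflow as tf


def make_windowed_dataset(X, y, T, batch_size, shuffle_buffer, seed=None):
    """Pair sliding windows with labels, shuffle the pairs, then batch.

    X: array of shape (n_steps, n_features); y: array of shape (n_steps,).
    All names here are hypothetical, chosen for illustration only.
    """
    # Sliding windows of T consecutive steps, each of shape (T, n_features).
    X_ds = (
        tf.data.Dataset.from_tensor_slices(X)
        .window(T, shift=1, drop_remainder=True)
        .flat_map(lambda w: w.batch(T, drop_remainder=True))
    )
    # The label of a window is the label of its last time step.
    y_ds = tf.data.Dataset.from_tensor_slices(y).skip(T - 1)
    return (
        tf.data.Dataset.zip((X_ds, y_ds))
        # Shuffle complete (window, label) pairs, reshuffled every epoch.
        .shuffle(shuffle_buffer, seed=seed, reshuffle_each_iteration=True)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )
```

For a shuffle comparable to what fit() does with NumPy arrays, shuffle_buffer should be at least the number of windows; a smaller buffer only mixes nearby windows.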
