Skip to main content
added 17 characters in body
Source Link
Morten Grum
  • 1.1k
  • 1
  • 12
  • 26

If you don't have accessaccess to the code that created the model or if you just don't want your prediction/validation code to dependdepend on your model creation and training code there is another waythere is another way:

You could create a new model from a modified version of the loaded model's config like this:

loaded_model = tf.keras.models.load_model('model_file.h5')
config = loaded_model.get_config()
old_batch_input_shape = config['layers'][0]['config']['batch_input_shape']
config['layers'][0]['config']['batch_input_shape'] = (new_batch_size, old_batch_input_shape[1])
new_model = loaded_model.__class__.from_config(config)
new_model.set_weights(loaded_model.get_weights())

This works well for me in a situation where I have several different models with state-full RNN layers working together in a graph network but being trained separately with different networks leading to different batch sizes. It allows me to experiment with the model structures and training batches without needing to change anything in my validation script.

If you don't have access to the code that created the model or you just don't want your prediction/validation code to depend on your model creation and training code there is another way:

You could create a new model from a modified version of the loaded model's config like this:

loaded_model = tf.keras.models.load_model('model_file.h5')
config = loaded_model.get_config()
old_batch_input_shape = config['layers'][0]['config']['batch_input_shape']
config['layers'][0]['config']['batch_input_shape'] = (new_batch_size, old_batch_input_shape[1])
new_model = loaded_model.__class__.from_config(config)
new_model.set_weights(loaded_model.get_weights())

This works well for me in a situation where I have several different models with state-full RNN layers working together in a graph network but being trained separately with different networks leading to different batch sizes. It allows me to experiment with the model structures and training batches without needing to change anything in my validation script.

If you don't have access to the code that created the model or if you just don't want your prediction/validation code to depend on your model creation and training code there is another way:

You could create a new model from a modified version of the loaded model's config like this:

loaded_model = tf.keras.models.load_model('model_file.h5')
config = loaded_model.get_config()
old_batch_input_shape = config['layers'][0]['config']['batch_input_shape']
config['layers'][0]['config']['batch_input_shape'] = (new_batch_size, old_batch_input_shape[1])
new_model = loaded_model.__class__.from_config(config)
new_model.set_weights(loaded_model.get_weights())

This works well for me in a situation where I have several different models with state-full RNN layers working together in a graph network but being trained separately with different networks leading to different batch sizes. It allows me to experiment with the model structures and training batches without needing to change anything in my validation script.

Source Link
Morten Grum
  • 1.1k
  • 1
  • 12
  • 26

If you don't have access to the code that created the model or you just don't want your prediction/validation code to depend on your model creation and training code there is another way:

You could create a new model from a modified version of the loaded model's config like this:

loaded_model = tf.keras.models.load_model('model_file.h5')
config = loaded_model.get_config()
old_batch_input_shape = config['layers'][0]['config']['batch_input_shape']
config['layers'][0]['config']['batch_input_shape'] = (new_batch_size, old_batch_input_shape[1])
new_model = loaded_model.__class__.from_config(config)
new_model.set_weights(loaded_model.get_weights())

This works well for me in a situation where I have several different models with state-full RNN layers working together in a graph network but being trained separately with different networks leading to different batch sizes. It allows me to experiment with the model structures and training batches without needing to change anything in my validation script.