
What is the purpose of the add_loss function in Keras?


I recently stumbled across variational autoencoders and tried to make them work on MNIST using Keras, following a tutorial I found on GitHub.

My question concerns the following lines of code:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.add_loss(vae_loss)
vae.compile(optimizer='rmsprop')
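
(For context: the tensors above come from the standard Keras VAE example that the tutorial follows. The sketch below is only my rough reconstruction of that setup, with placeholder layer sizes, not the exact tutorial code.)

from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K
from keras import metrics

original_dim = 784      # flattened 28x28 MNIST images
intermediate_dim = 256  # placeholder size
latent_dim = 2          # placeholder size

# Encoder: map the input to the parameters of q(z|x)
x = Input(shape=(original_dim,))
h = Dense(intermediate_dim, activation='relu')(x)
z_mean = Dense(latent_dim)(h)
z_log_var = Dense(latent_dim)(h)

# Reparameterization trick: z = mean + sigma * epsilon
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(z_log_var / 2) * epsilon

z = Lambda(sampling)([z_mean, z_log_var])

# Decoder: map z back to a Bernoulli mean over pixels
h_decoded = Dense(intermediate_dim, activation='relu')(z)
x_decoded_mean = Dense(original_dim, activation='sigmoid')(h_decoded)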

Why is add_loss used instead of specifying it as a compile option? Something like vae.compile(optimizer='rmsprop', loss=vae_loss) does not seem to work and throws the following error:

ValueError: The model cannot be compiled because it has no loss to optimize.

What is the difference between this function and a custom loss function that I can pass as an argument to Model.compile()?
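
(For clarity, by a custom loss function I mean the usual callable with a (y_true, y_pred) signature that gets passed to compile; something roughly like the sketch below, where my_loss is just a made-up name.)

def my_loss(y_true, y_pred):
    # Standard Keras custom-loss signature: return a loss tensor computed
    # from the true targets and the model's predictions.
    return K.mean(K.square(y_true - y_pred), axis=-1)

vae.compile(optimizer='rmsprop', loss=my_loss)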

Thanks in advance!

P.S.: I know there are several issues about this on GitHub, but most of them are open and uncommented. If this has already been resolved, please share the link!


Edit 1

I removed the line which adds the loss to the model and used the loss argument of the compile function. It looks like this now:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

This throws a TypeError:

TypeError: Using a 'tf.Tensor' as a Python 'bool' is not allowed. Use 'if t is not None:' instead of 'if t:' to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

Edit 2

Thanks to @MarioZ's efforts, I was able to figure out a workaround for this.

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss in a separate function
def vae_loss(x, x_decoded_mean):
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

...

vae.fit(x_train,
        x_train,        # <-- did not need this previously
        shuffle=True,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(x_test, x_test))     # <-- worked with (x_test, None) before

For some strange reason, I now had to explicitly pass the targets (x_train and x_test) while fitting the model; originally, I didn't need to. The produced samples look reasonable to me.
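
For comparison, the original add_loss version was fit without any targets, roughly like this (reconstructed from the tutorial, so treat the exact call as a sketch):

# With add_loss the loss is already attached to the model,
# so fit() needs no targets:
vae.fit(x_train,
        shuffle=True,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(x_test, None))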

Although I could resolve this, I still don't know what the differences and disadvantages of these two methods are (other than needing a different syntax). Can someone give me more insight?

