- Entropy
The level of uncertainty. For a binary outcome it ranges between 0 and 1, with 0 meaning the outcome is certain and 1 meaning it is maximally uncertain.
The greater the value of entropy, the greater the uncertainty about the outcome; the smaller the value, the less the uncertainty.
reference : towardsdatascience.com/cross-entropy-loss-function-f38c4ec8643e
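As a quick illustrative sketch (my own example, not from the reference), the entropy of a binary outcome with probability p can be computed like this:
import numpy as np
# Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p), defined for 0 < p < 1.
def binary_entropy(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)
print(binary_entropy(0.5))  # 1.0    -> a fair coin, maximum uncertainty
print(binary_entropy(0.9))  # ~0.469 -> the outcome is fairly predictable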
- Cross-Entropy
A measure of how far the model's predicted distribution is from the actual expected values; it is used as the loss when adjusting model weights during training. The smaller the loss, the better the model.
The cross-entropy between two discrete probability distributions is a metric that captures how similar the two distributions are. For example, for a fair coin there are two outcomes, each with probability 0.5; a prediction that matches this distribution gives the smallest cross-entropy.
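As a minimal sketch of the definition (my own illustration), the cross-entropy between a true distribution p and a predicted distribution q is H(p, q) = -Σ p(x) log q(x), and it is smallest when q matches p:
import numpy as np
# Cross-entropy H(p, q) = -sum(p * log(q)).
def cross_entropy(p, q):
    p, q = np.asarray(p), np.asarray(q)
    return -np.sum(p * np.log(q))
p = [0.5, 0.5]                       # fair coin: the true distribution
print(cross_entropy(p, [0.5, 0.5]))  # ~0.693, q equals p -> smallest value
print(cross_entropy(p, [0.9, 0.1]))  # ~1.204, q is far from p -> larger value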
- BinaryCrossentropy
Computes the cross-entropy loss between true labels and predicted labels.
Use this cross-entropy loss for binary (0 or 1) classification applications. The loss function requires the following inputs:
- y_true (true label): This is either 0 or 1.
- y_pred (predicted value): This is the model's prediction, i.e., a single floating-point value which either represents a logit (i.e., a value in [-inf, inf] when from_logits=True) or a probability (i.e., a value in [0., 1.] when from_logits=False).
- In other words, from_logits=True tells the loss function that the output values generated by the model are not normalized, i.e., a sigmoid (or softmax) activation has not been applied to turn them into probabilities. Therefore, the output layer in this case does not have a sigmoid/softmax activation function:
keras.losses.BinaryCrossentropy(
    from_logits=False,
    label_smoothing=0.0,
    axis=-1,
    reduction="sum_over_batch_size",
    name="binary_crossentropy",
)
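A small usage sketch (the labels and predictions below are made up for illustration):
import tensorflow as tf
y_true = [[0.], [1.], [1.], [0.]]      # true labels
y_pred = [[0.1], [0.8], [0.6], [0.3]]  # predicted probabilities (from_logits=False)
bce = tf.keras.losses.BinaryCrossentropy()
print(bce(y_true, y_pred).numpy())     # mean loss over the batch, ~0.30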
- Binary cross entropy loss (or log loss)
It needs to store only one value: for example, for a coin toss, the probabilities are 0.5 and 0.5 (heads and tails), so storing 0.5 is enough and the other 0.5 is implied. Likewise, if the first probability were 0.7, the other would be assumed to be 0.3. It is used in scenarios where there are only two possible outcomes.
Let's walk through what happens for a particular data point. Say the true label is y = 1 and the model predicts a probability p for that class.
In this case, the loss for that point is -log(p): it approaches 0 as p approaches 1 and grows large as p approaches 0. Combining both cases (y = 1 and y = 0) gives the binary cross-entropy loss L = -[y log(p) + (1 - y) log(1 - p)].
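A quick numeric check of that formula against Keras (a sketch with arbitrarily chosen values):
import numpy as np
import tensorflow as tf
y, p = 1.0, 0.7  # true label 1, predicted probability 0.7
manual = -(y * np.log(p) + (1 - y) * np.log(1 - p))
keras_loss = tf.keras.losses.BinaryCrossentropy()([[y]], [[p]]).numpy()
print(manual)      # ~0.357
print(keras_loss)  # essentially the same value (up to clipping/float precision)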
- SparseCategoricalCrossentropy
An extension of the categorical cross-entropy loss function that is used when the training data labels are represented as integers.
- Categorical cross-entropy is used when the labels are one-hot encoded, for example [1,0,0], [0,1,0] and [0,0,1] for a 3-class classification problem.
- In sparse categorical cross-entropy, labels are integer encoded, for example [0], [1] and [2] for a 3-class problem.
- Categorical cross-entropy
The labels are one-hot encoded: [0, 0, 1], [0, 1, 0]
categorical_loss = tf.keras.losses.CategoricalCrossentropy()
y_target_categorical = tf.convert_to_tensor([[0, 0, 1], [0, 1, 0]])
y_prediction = tf.convert_to_tensor([[0, 0.1, 0.9], [0.1, 0.8, 0.2]])
categorical_loss(y_target_categorical, y_prediction).numpy()
If we use integer-encoded labels ([2, 1]) instead, it gives an error.
y_target_sparse = tf.convert_to_tensor([2, 1])
categorical_loss(y_target_sparse, y_prediction).numpy()
>>>
InvalidArgumentError: Incompatible shapes: [2] vs. [2,3] [Op:Mul] name: categorical_crossentropy/mul/
- SparseCategoricalCrossentropy
tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=False,
    ignore_class=None,
    reduction=losses_utils.ReductionV2.AUTO,
    name='sparse_categorical_crossentropy'
)
In the case above, the labels are integer encoded: [2, 1]
sparse_categorical_loss = tf.keras.losses.SparseCategoricalCrossentropy()
sparse_categorical_loss(y_target_sparse, y_prediction).numpy()
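This returns the same loss value as the categorical call above, because the integer labels [2, 1] are just the index encoding of the one-hot targets [[0, 0, 1], [0, 1, 0]].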
reference : https://jins-sw.tistory.com/16
- from_logits=True
- If we didn't use a softmax layer as the final layer, we should set from_logits=True when defining the loss function.
- The softmax is not applied as a separate layer; instead it is included in the calculation of the loss function.
- This means that whatever inputs you are providing to the loss function are not scaled (the inputs are just numbers from -inf to +inf, not probabilities).
- from_logits=False (default)
- The loss function expects probabilities, i.e., a softmax (or sigmoid) activation has already been applied to the output values in the model's output layer.
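A small sketch of the difference (my own example): the two calls below should produce essentially the same loss, because from_logits=True makes the loss apply the softmax itself.
import tensorflow as tf
y_true = [2, 1]
logits = tf.constant([[1.0, 2.0, 5.0], [0.5, 3.0, 1.0]])  # raw, unnormalized model outputs
probs = tf.nn.softmax(logits)                             # softmax applied inside the model
loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
print(loss_from_logits(y_true, logits).numpy())  # softmax handled by the loss
print(loss_from_probs(y_true, probs).numpy())    # ~same value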