做标签的网站,深圳建企业网站公司,竞馨门户网站开发,wordpress 前端用户一、tf2中常用的损失函数介绍
在 TensorFlow 2 中#xff0c;编译模型时可以选择不同的损失函数来定义模型的目标函数。不同的损失函数适用于不同的问题类型和模型架构。下面是几种常见的损失函数以及它们的作用和适用场景#xff1a;
1.均方误差#xff08;Mean Squared …一、tf2中常用的损失函数介绍
在 TensorFlow 2 中编译模型时可以选择不同的损失函数来定义模型的目标函数。不同的损失函数适用于不同的问题类型和模型架构。下面是几种常见的损失函数以及它们的作用和适用场景
1.均方误差Mean Squared Error, MSEMSE 是回归问题中常用的损失函数用于衡量预测值与真实值之间的平均平方差。较大的误差会得到更大的惩罚适用于回归任务。
model.compile(lossmse, ...)
2.二进制交叉熵Binary Cross Entropy二进制交叉熵是二分类问题中常用的损失函数用于衡量两个类别之间的差异性。适用于二分类问题输出为一个概率值的 sigmoid 激活的模型。
model.compile(lossbinary_crossentropy, ...)
3.多类交叉熵Categorical Cross Entropy多类交叉熵是多分类问题中常用的损失函数用于衡量多个类别之间的差异性。适用于多分类问题输出为每个类别的概率分布的 softmax 激活的模型。
model.compile(losscategorical_crossentropy, ...)
4.稀疏分类交叉熵Sparse Categorical Cross Entropy类似于多类交叉熵但适用于标签以整数形式表示的多分类问题而不是 one-hot 编码。
model.compile(losssparse_categorical_crossentropy, ...)
5.KL 散度损失Kullback-Leibler DivergenceKL 散度用于衡量两个概率分布的差异性。在生成模型中常与自动编码器等模型结合使用促使模型输出接近于预定义的概率分布。
model.compile(losskullback_leibler_divergence, ...)
除了上述常见的损失函数之外还有其他一些定制化的损失函数可以根据具体任务和需求来自定义。通过 tf.keras.losses 模块您可以查看更多可用的损失函数并选择适合自己模型的损失函数。在选择损失函数时需要根据任务类型、数据分布以及模型设计进行合理选择以获得最佳的训练效果。
二、两种损失函数的比较分析
多类交叉熵Categorical Cross Entropy和稀疏分类交叉熵Sparse Categorical Cross Entropy
相同点都可用于数据多分类任务。
不同点对数据的输入要求不一样多类交叉熵Categorical Cross Entropy要求数据为one-hot 编码这个主要是针对数据的标签数据比如我们的数据标签数据读取的时候其类别是0-9这个数据可以是一列数据这个时候我们可以使用稀疏分类交叉熵Sparse Categorical Cross Entropy函数直接进行编译。
one_hot编码独热编码说明
一种将每个元素表示为二进制向量的编码方式其中只有一个元素为1其余元素都为0。例如如果我们有一个长度为N的列表那么它的one-hot编码将是一个NxN的矩阵其中第i行表示第i个元素的编码。例如如果我们有一个包含3种颜色的列表[红,蓝,绿]那么它们的one-hot编码将是
红[1,0,0] 蓝[0,1,0] 绿[0,0,1]
这种编码方式常用于机器学习中可以将每个类别标签转换为one-hot向量以便进行训练。
如果是使用多类交叉熵Categorical Cross Entropy作为损失函数那么我们对数据进行one-hot编码代码有的地方使用
y_traintf.keras.utils.to_categorical(y_train) #报错y_testtf.keras.utils.to_categorical(y_test)
在tensorflow2.5环境下报错
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [12,16] and labels shape [204][[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at /PycharmProjects/pythonProject/ML_New/MLP_Classifier_tf/MLP_Classifier_tf_imgVali.py:286) ]] [Op:__inference_train_function_762]Function call stack:
train_function
这里我们可以使用以下代码替代
y_train_one_hot tf.one_hot(y_train, depthnum_classes)
y_test_one_hot tf.one_hot(y_test, depthnum_classes)
三、示例代码分析
Sparse Categorical Cross Entropy和Categorical Cross Entropy对应的损失函数围为
losssparse_categorical_crossentropy losscategorical_crossentropy
使用minist数据做一个简单的MLP模型分类这里先使用Sparse Categorical Cross Entropy损失函数。代码如下
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense# 准备数据集
(x_train, y_train), (x_test, y_test) tf.keras.datasets.mnist.load_data()
x_train x_train.reshape(-1, 784) / 255.0
x_test x_test.reshape(-1, 784) / 255.0# 构建模型
model Sequential()
model.add(Dense(64, activationrelu, input_dim784))
model.add(Dense(64, activationrelu))
model.add(Dense(10, activationsoftmax))# 编译模型
model.compile(optimizeradam,losssparse_categorical_crossentropy,metrics[accuracy])# 训练模型
model.fit(x_train, y_train, epochs10, batch_size32, validation_data(x_test, y_test))# 评估模型
loss, accuracy model.evaluate(x_test, y_test)
print(Test Loss:, loss)
print(Test Accuracy:, accuracy)# 使用模型进行预测
predictions model.predict(x_test[:5])
print(Predictions:, tf.argmax(predictions, axis1))
print(Labels:, y_test[:5])运行结果如下
D:\PycharmProjects\pythonProject\venv\Scripts\python.exe D:/PycharmProjects/pythonProject/ML_New/MLP_Classifier_tf/MLP_TEST_MINIST.py
2023-10-14 22:28:27.465600: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2023-10-14 22:28:30.610122: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2023-10-14 22:28:30.637119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.62GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2023-10-14 22:28:30.637445: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2023-10-14 22:28:30.648571: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2023-10-14 22:28:30.648748: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2023-10-14 22:28:30.652682: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2023-10-14 22:28:30.654729: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll
2023-10-14 22:28:30.657643: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2023-10-14 22:28:30.661178: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2023-10-14 22:28:30.662311: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2023-10-14 22:28:30.662510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-10-14 22:28:30.662864: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-14 22:28:30.663583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.62GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2023-10-14 22:28:30.663941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-10-14 22:28:31.130464: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-10-14 22:28:31.130645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2023-10-14 22:28:31.130748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2023-10-14 22:28:31.130967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6001 MB memory) - physical GPU (device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
2023-10-14 22:28:31.709522: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/10
2023-10-14 22:28:31.920032: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2023-10-14 22:28:32.369951: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
1875/1875 [] - 5s 2ms/step - loss: 0.2845 - accuracy: 0.9174 - val_loss: 0.1443 - val_accuracy: 0.9547
Epoch 2/10
1875/1875 [] - 5s 3ms/step - loss: 0.1261 - accuracy: 0.9633 - val_loss: 0.1085 - val_accuracy: 0.9646
Epoch 3/10
1875/1875 [] - 5s 3ms/step - loss: 0.0937 - accuracy: 0.9716 - val_loss: 0.1034 - val_accuracy: 0.9690
Epoch 4/10
1875/1875 [] - 5s 3ms/step - loss: 0.0731 - accuracy: 0.9772 - val_loss: 0.0987 - val_accuracy: 0.9714
Epoch 5/10
1875/1875 [] - 5s 3ms/step - loss: 0.0612 - accuracy: 0.9810 - val_loss: 0.0828 - val_accuracy: 0.9749
Epoch 6/10
1875/1875 [] - 5s 3ms/step - loss: 0.0507 - accuracy: 0.9835 - val_loss: 0.0955 - val_accuracy: 0.9702
Epoch 7/10
1875/1875 [] - 5s 3ms/step - loss: 0.0430 - accuracy: 0.9859 - val_loss: 0.0863 - val_accuracy: 0.9746
Epoch 8/10
1875/1875 [] - 5s 3ms/step - loss: 0.0374 - accuracy: 0.9874 - val_loss: 0.0935 - val_accuracy: 0.9737
Epoch 9/10
1875/1875 [] - 6s 3ms/step - loss: 0.0328 - accuracy: 0.9894 - val_loss: 0.0902 - val_accuracy: 0.9754
Epoch 10/10
1875/1875 [] - 6s 3ms/step - loss: 0.0287 - accuracy: 0.9900 - val_loss: 0.0902 - val_accuracy: 0.9771
313/313 [] - 1s 2ms/step - loss: 0.0902 - accuracy: 0.9771
Test Loss: 0.09022707492113113
Test Accuracy: 0.9771000146865845
Predictions: tf.Tensor([7 2 1 0 4], shape(5,), dtypeint64)
Labels: [7 2 1 0 4]Process finished with exit code 0我们将损失函数修改为Categorical Cross Entropy运行代码就会报错 ValueError: Shapes (32, 1) and (32, 10) are incompatible
这是因为我们没有将标签数据转化为独热编码我们转换一下,,在model.fit()函数前加上
y_train tf.one_hot(y_train, depth10)
y_test tf.one_hot(y_test, depth10)
运行结果如下
D:\PycharmProjects\pythonProject\venv\Scripts\python.exe D:/PycharmProjects/pythonProject/ML_New/MLP_Classifier_tf/MLP_TEST_MINIST.py
2023-10-14 23:20:04.708405: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2023-10-14 23:20:07.803493: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2023-10-14 23:20:07.833164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.62GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2023-10-14 23:20:07.833480: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2023-10-14 23:20:07.840527: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2023-10-14 23:20:07.840689: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2023-10-14 23:20:07.844132: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2023-10-14 23:20:07.845657: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll
2023-10-14 23:20:07.848488: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2023-10-14 23:20:07.852061: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2023-10-14 23:20:07.853130: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2023-10-14 23:20:07.853317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-10-14 23:20:07.853652: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-14 23:20:07.854467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 computeCapability: 7.5
coreClock: 1.62GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2023-10-14 23:20:07.854879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2023-10-14 23:20:08.326771: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-10-14 23:20:08.326942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2023-10-14 23:20:08.327041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2023-10-14 23:20:08.327252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6001 MB memory) - physical GPU (device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
2023-10-14 23:20:08.914697: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/10
2023-10-14 23:20:09.138669: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll1/1875 [..............................] - ETA: 21:57 - loss: 2.4066 - accuracy: 0.12502023-10-14 23:20:09.626287: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
1875/1875 [] - 6s 3ms/step - loss: 0.2784 - accuracy: 0.9182 - val_loss: 0.1517 - val_accuracy: 0.9510
Epoch 2/10
1875/1875 [] - 5s 3ms/step - loss: 0.1217 - accuracy: 0.9633 - val_loss: 0.1258 - val_accuracy: 0.9611
Epoch 3/10
1875/1875 [] - 5s 3ms/step - loss: 0.0874 - accuracy: 0.9731 - val_loss: 0.1045 - val_accuracy: 0.9666
Epoch 4/10
1875/1875 [] - 5s 3ms/step - loss: 0.0701 - accuracy: 0.9778 - val_loss: 0.0929 - val_accuracy: 0.9718
Epoch 5/10
1875/1875 [] - 5s 3ms/step - loss: 0.0569 - accuracy: 0.9821 - val_loss: 0.0853 - val_accuracy: 0.9751
Epoch 6/10
1875/1875 [] - 5s 3ms/step - loss: 0.0488 - accuracy: 0.9844 - val_loss: 0.0911 - val_accuracy: 0.9706
Epoch 7/10
1875/1875 [] - 5s 3ms/step - loss: 0.0402 - accuracy: 0.9868 - val_loss: 0.0847 - val_accuracy: 0.9748
Epoch 8/10
1875/1875 [] - 5s 3ms/step - loss: 0.0355 - accuracy: 0.9882 - val_loss: 0.0975 - val_accuracy: 0.9723
Epoch 9/10
1875/1875 [] - 5s 3ms/step - loss: 0.0307 - accuracy: 0.9895 - val_loss: 0.1027 - val_accuracy: 0.9743
Epoch 10/10
1875/1875 [] - 5s 3ms/step - loss: 0.0281 - accuracy: 0.9907 - val_loss: 0.1004 - val_accuracy: 0.9734
313/313 [] - 1s 2ms/step - loss: 0.1004 - accuracy: 0.9734
Test Loss: 0.10037881135940552
Test Accuracy: 0.9733999967575073
Predictions: tf.Tensor([7 2 1 0 4], shape(5,), dtypeint64)
Labels: tf.Tensor(
[[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.][0. 0. 1. 0. 0. 0. 0. 0. 0. 0.][0. 1. 0. 0. 0. 0. 0. 0. 0. 0.][1. 0. 0. 0. 0. 0. 0. 0. 0. 0.][0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]], shape(5, 10), dtypefloat32)Process finished with exit code 0注意事项
使用稀疏交叉熵损失函数编译的模型的预测结果标签从0开始。如果自己的数据是从1开始的那么后面做验证分析的时候需要注意两者应该保持一致。
在使用稀疏交叉熵损失函数进行多分类问题训练时标签通常使用整数表示并且标签值的范围是从0到类别数量减1。模型的输出也应该是每个类别的概率分布。
例如如果有3个类别标签将被编码为0、1和2并且模型的输出将是一个长度为3的概率分布向量表示对每个类别的预测概率。
在预测时模型会返回对每个类别的预测概率通过取最大概率对应的索引就可以得到预测的类别。这个索引范围是从0到类别数量减1。与稀疏交叉熵不同使用普通的非稀疏交叉熵损失函数时标签通常使用 one-hot 编码其中每个类别都由一个向量表示只有真实标签对应的位置为1其余都为0。在这种情况下预测结果的标签也是从0开始的。