Loss is nan when building a model for the bike-sharing dataset

Stack Overflow user
Asked 2019-03-03 15:38:47
Answers: 1 · Views: 137 · Followers: 0 · Votes: 1

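I am new to machine learning, so please bear with me if the question is poorly asked or too simple.

The problem is that the model I built returns nan as the loss. If I am doing something wrong, please let me know. Details are below.

Program logic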

Code language: python
import tensorflow as tf
import pandas as pd
# Reading the csv file from local drive as a dataframe
bike_df = pd.read_csv('C:\\Users\\HOME\\MLPythonPractice\\Data sets\\Bike-Sharing-Dataset\\day.csv')
bike_result_df = pd.read_csv('C:\\Users\\HOME\\MLPythonPractice\\Data sets\\Bike-Sharing-Dataset\\day.csv')

# Remove unwanted columns from the data frame
bike_df = bike_df.drop(columns=['instant','dteday','cnt'])
# shape of the dataframe
print(bike_df.shape)
# Column names of the dataframe
print(bike_df.columns)
# Type of the object
print(type(bike_df))
# To see the information of the dataframe
print(bike_df.info())
# Converting from dataframe to ndarray
bike_s = bike_df.values
print(type(bike_s))
print(bike_s.shape)
# Keep only the target column 'cnt', converted to float
# (astype('float64') avoids using np before it is imported)
bike_result_df = bike_result_df['cnt'].astype('float64')
bike_result_s = bike_result_df.values   # Converting the Series to an ndarray
print(type(bike_result_s))
print(bike_result_s)
print(type(bike_df))
print(bike_df.shape)
print(bike_result_df.shape)
# With the data ready, build the graph using Keras (lines marked ## are part of building the graph)

## Initialise the sequential model
model = tf.keras.models.Sequential()
## Normalise the input data by creating a normalisation layer
model.add(tf.keras.layers.BatchNormalization(input_shape=(13,)))
## Add a dense layer for prediction -- Keras declares the weights and bias; Dense(1) produces the single expected value
model.add(tf.keras.layers.Dense(1))
# Compile the model - add the loss and the gradient descent optimiser
model.compile(optimizer='sgd', loss='mse')
print(type(bike_s))
print(type(bike_result_s))
print(bike_s.shape)
print(bike_result_s.shape)
print(bike_result_s)
# Execute the graph
model.fit(bike_s,bike_result_s,epochs=10)
model.save('models/bike_sharing_lr.h5')

This is the output I get:

Code language: text
Epoch 1/10
731/731 [==============================] - 1s 895us/step - loss: nan     
Epoch 2/10
731/731 [==============================] - 0s 44us/step - loss: nan
Epoch 3/10
731/731 [==============================] - 0s 46us/step - loss: nan
Epoch 4/10
731/731 [==============================] - 0s 44us/step - loss: nan
Epoch 5/10
731/731 [==============================] - 0s 39us/step - loss: nan
Epoch 6/10
731/731 [==============================] - 0s 39us/step - loss: nan
Epoch 7/10
731/731 [==============================] - 0s 47us/step - loss: nan
Epoch 8/10
731/731 [==============================] - 0s 40us/step - loss: nan
Epoch 9/10
731/731 [==============================] - 0s 43us/step - loss: nan
Epoch 10/10
731/731 [==============================] - 0s 42us/step - loss: nan

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-03-03 17:42:45

To keep your gradients from exploding, you can clip them like this:

Code language: python
model.compile(optimizer=tf.keras.optimizers.SGD(clipnorm=1), loss='mse')

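According to https://keras.io/optimizers/, setting clipnorm=1 lets the gradient descent optimizer control gradient clipping. All parameter gradients will be clipped to a maximum norm of 1. This prevents the loss function from diverging.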
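
To make "clipped to a maximum norm of 1" concrete, here is a minimal pure-Python sketch of norm clipping applied to a single gradient vector. The helper name `clip_by_norm` and the sample gradients are made up for illustration. This is not Keras's own code, and how Keras groups gradients before clipping may differ between versions.

```python
import math

def clip_by_norm(grad, max_norm=1.0):
    """Rescale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)          # small gradients pass through unchanged
    scale = max_norm / norm        # shrink factor that brings the norm to max_norm
    return [g * scale for g in grad]

# Hypothetical gradients, chosen only to show both branches
big = clip_by_norm([3000.0, -4000.0])   # L2 norm is 5000 before clipping
small = clip_by_norm([0.3, 0.4])        # L2 norm is 0.5, already within the limit
print(big, small)
```

The direction of the gradient is preserved; only its length is capped, which is why a single huge update can no longer throw the weights far away.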

For other ways to control exploding gradients, see https://www.dlology.com/blog/how-to-deal-with-vanishingexploding-gradients-in-keras/
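
One option often mentioned alongside clipping is a smaller learning rate (whether the linked post covers it is not assumed here). The toy sketch below runs plain gradient descent on a one-dimensional quadratic, not on the bike-sharing model. The function `run_gd` and all of the numbers are illustrative assumptions, picked to show how a step size that is too large for the curvature overshoots until the value overflows and turns into nan.

```python
import math

def run_gd(curvature, lr, steps, w0=1.0):
    """Minimise 0.5 * curvature * w**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = curvature * w   # derivative of the quadratic at w
        w -= lr * grad         # each step multiplies w by (1 - lr * curvature)
    return w

# lr * curvature = 10, so every step multiplies w by -9: the iterates grow,
# overflow to infinity, and inf - inf then yields nan
blown_up = run_gd(curvature=1000.0, lr=0.01, steps=400)

# lr * curvature = 0.5, so every step halves w and it shrinks towards 0
settled = run_gd(curvature=1000.0, lr=0.0005, steps=50)

print(math.isnan(blown_up), settled)
```

Large, unscaled feature values act like a large curvature here, so the same step size can be stable on scaled data and unstable on raw data.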

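With the adjustment above, the loss no longer diverges, but it also does not decrease over time. I noticed that the way you set up the model is odd. Batch normalization should usually follow an activation layer. I am not sure why you need to normalize your inputs, but you should not use BatchNormalization for that. If you change the model to: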

Code language: python
model = tf.keras.models.Sequential()
# input_shape belongs on the first layer of a Sequential model
model.add(tf.keras.layers.Dense(1, activation='relu', input_shape=(13,)))
model.add(tf.keras.layers.BatchNormalization())
model.compile(optimizer='sgd', loss='mse')

With this model you get a more meaningful result: the loss value now falls from about 200 million to 120,000.
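
If the inputs still need normalizing, one common alternative to a BatchNormalization input layer is to standardize each feature column once, before training. The sketch below is a pure-Python illustration under that assumption. The helper `standardize_columns` and the sample rows are made up and are not taken from day.csv.

```python
import statistics

def standardize_columns(rows):
    """Return rows with every column shifted to mean 0 and scaled to std 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Population standard deviation; constant columns fall back to 1.0
    # so that the division below never hits zero
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical rows with one small-valued and one large-valued feature
sample = [[0.2, 1000.0], [0.4, 3000.0], [0.6, 5000.0]]
scaled = standardize_columns(sample)
print(scaled)
```

For a NumPy array such as bike_s, the equivalent is `(bike_s - bike_s.mean(axis=0)) / bike_s.std(axis=0)`, provided no column is constant. The statistics should come from the training data only and be reused unchanged at prediction time.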

Votes: 0
Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/54966616
