Q: My training data and labels have different numpy array shapes, and it breaks my training

Stack Overflow user

Asked 2020-02-24 19:40:21

Answers: 1 · Views: 372 · Followers: 0 · Votes: 1

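I am working with an image dataset and trying to convert it into a numpy array, which I will then use as input to a cGAN. I have tried several code snippets, but they all give me mismatched array shapes. I don't know what to do.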

import os
import pickle
import random

import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

training_data = []
IMG_SIZE = 32
datadir = 'drive/My Drive/dummyDS'
CATEGORIES = ['HTC-1-M7', 'IPhone-4s', 'iPhone-6', 'LG-Nexus-5x',
              'Motorola-Droid-Max', 'Motorola-Nexus-6', 'Motorola-X',
              'Samsung-Galaxy-Note3', 'Samsung-Galaxy-S4', 'Sony-Nex-7']

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(datadir, category)
        class_num = CATEGORIES.index(category)  # integer label for this folder
        for img in os.listdir(path):
            img_array = cv2.imread(os.path.join(path, img))
            if img_array is None:  # skip files OpenCV cannot read
                continue
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            training_data.append([new_array, class_num])
            plt.imshow(img_array, cmap="gray")
            plt.imshow(new_array, cmap="gray")
            plt.show()

create_training_data()

X = []
y = []
random.shuffle(training_data)

for features, label in training_data:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
with open("X.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)

y = np.array(y)
with open("y.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)

y = to_categorical(y)

# saving the y_labels_one_hot array as a .npy file
np.save('y_labels_one_hot.npy', y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=2./11)

This gives X_train.shape = (32, 32, 32, 3) and y_train.shape = (32, 4, 2).

Now, during training, I get:

real_labels = to_categorical(Y_train[i*batch_size:(i+1)*batch_size].reshape(-1, 1), num_classes=10)
d_loss_real = discriminator.train_on_batch(x=[X_batch, real_labels],
                                           y=real * (1 - smooth))

ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(32, 32, 32, 3), (256, 10)]

Tags: python, arrays, numpy, tensorflow

1 Answer

Stack Overflow user

Answered 2020-05-08 23:03:52

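ImageDataGenerator.flow_from_directory, from tensorflow.keras.preprocessing.image, should simplify your task.

It does almost everything your code does, in a simpler way, including splitting the data into training and validation sets.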
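One difference worth checking before switching: the Keras documentation states that flow_from_directory infers classes from the subfolder names in alphanumeric order, whereas the question assigns labels with CATEGORIES.index. The sketch below imitates that ordering with Python's sorted() on an empty temporary folder tree, to show which labels would change. The helper name sorted_class_indices is hypothetical, and the sorted() call is an assumption about the exact ordering; the authoritative mapping is the class_indices attribute of the generator itself.

```python
import os
import tempfile

CATEGORIES = ['HTC-1-M7', 'IPhone-4s', 'iPhone-6', 'LG-Nexus-5x',
              'Motorola-Droid-Max', 'Motorola-Nexus-6', 'Motorola-X',
              'Samsung-Galaxy-Note3', 'Samsung-Galaxy-S4', 'Sony-Nex-7']

def sorted_class_indices(directory):
    """Map each subfolder name to an index, in sorted order of the names.

    Hypothetical helper: an approximation of the documented alphanumeric
    ordering, not the library's own code.
    """
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

# Build an empty folder tree with the question's category names (no images needed).
with tempfile.TemporaryDirectory() as root:
    for category in CATEGORIES:
        os.mkdir(os.path.join(root, category))
    class_indices = sorted_class_indices(root)

# Labels as the question assigns them: position in the CATEGORIES list.
manual_indices = {name: CATEGORIES.index(name) for name in CATEGORIES}

differing = sorted(name for name in CATEGORIES
                   if class_indices[name] != manual_indices[name])
print(class_indices['iPhone-6'], manual_indices['iPhone-6'])
print(differing)
```

Under this ordering the lowercase 'iPhone-6' folder sorts last, so most labels shift relative to the question's list. If the original numbering matters, flow_from_directory accepts a classes argument, so passing classes=CATEGORIES should keep the list order.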

The code below shows how to use flow_from_directory, with a comment explaining each line:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# datadir, img_height, img_width, batch_size, nb_epochs and model are assumed to be defined already
train_datagen = ImageDataGenerator(
    rescale=1./255,        # Rescales every pixel value to the range [0, 1]
    validation_split=0.2)  # Reserves 20% of the total data for validation

train_generator = train_datagen.flow_from_directory(
    datadir,  # Traverses all the sub folders (one per category) inside this dir
    target_size=(img_height, img_width),  # Sets the image size
    batch_size=batch_size,  # Generates batches of `batch_size`
    class_mode='categorical',  # Returns the labels one-hot encoded
    shuffle=True,  # Shuffles the data
    subset='training')  # Uses the 80% training portion

# Since we don't have a separate directory for validation data and we want the total data to be partitioned, we reuse "train_datagen"
validation_generator = train_datagen.flow_from_directory(
    datadir,  # Must be the same dir as for training so that the split applies
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,  # Shuffles the data
    subset='validation')  # Uses the 20% validation portion

# Then you can train the model using the code below
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=nb_epochs)
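The question trains its discriminator with train_on_batch rather than model.fit. A minimal sketch of pulling batches for such a loop follows; it works with any iterator that yields (images, one_hot_labels) pairs, which is what a generator created with class_mode='categorical' returns from next(). The helper name take_aligned_batches and the synthetic fake_generator are illustrative assumptions, added only so the sketch runs without image files.

```python
import numpy as np

def take_aligned_batches(batch_iter, n_steps):
    """Yield n_steps (images, labels) pairs, checking that sample counts match."""
    for _ in range(n_steps):
        images, labels = next(batch_iter)
        if images.shape[0] != labels.shape[0]:
            raise ValueError(
                f"sample count mismatch: {images.shape} vs {labels.shape}")
        yield images, labels

def fake_generator(batch_size=32, img_size=32, num_classes=10):
    """Synthetic stand-in for train_generator (assumption, for illustration only)."""
    rng = np.random.default_rng(0)
    while True:
        images = rng.random((batch_size, img_size, img_size, 3), dtype=np.float32)
        class_ids = rng.integers(0, num_classes, size=batch_size)
        labels = np.eye(num_classes, dtype=np.float32)[class_ids]  # one-hot rows
        yield images, labels

shapes = [(x.shape, y.shape)
          for x, y in take_aligned_batches(fake_generator(), n_steps=3)]
print(shapes[0])  # ((32, 32, 32, 3), (32, 10))
```

With the real generator, the loop body would receive X_batch and real_labels directly from take_aligned_batches(train_generator, n_steps), with no further to_categorical call. Note that the last batch of an epoch may contain fewer than batch_size samples, so any target array such as real in the question would need to be sized from X_batch.shape[0].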

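Hopefully this resolves your shape mismatch, since it ensures that the features and labels in each batch have the same number of samples. If this approach produces an error, please share more details.

Happy learning!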
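As for where the shapes in the question may come from: they are consistent with the labels being one-hot encoded more than once. The sketch below reproduces the arithmetic with a NumPy stand-in written to mirror how to_categorical shapes its output. The one_hot helper and the four-class toy labels are assumptions for illustration, not the asker's data.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """NumPy stand-in mirroring the output shape of keras.utils.to_categorical."""
    labels = np.asarray(labels, dtype=int)
    shape = labels.shape
    if len(shape) > 1 and shape[-1] == 1:
        shape = shape[:-1]  # a trailing axis of size 1 is dropped
    flat = labels.ravel()
    if num_classes is None:
        num_classes = int(flat.max()) + 1
    out = np.zeros((flat.size, num_classes), dtype=np.float32)
    out[np.arange(flat.size), flat] = 1.0
    return out.reshape(shape + (num_classes,))

batch_size = 32
int_labels = np.arange(batch_size) % 4   # toy integer labels from 4 classes

once = one_hot(int_labels)               # (32, 4): one row per sample
twice = one_hot(once)                    # (32, 4, 2): every 0/1 entry encoded again
thrice = one_hot(twice.reshape(-1, 1), num_classes=10)      # (256, 10)
fixed = one_hot(int_labels.reshape(-1, 1), num_classes=10)  # (32, 10)

print(once.shape, twice.shape, thrice.shape, fixed.shape)
```

Here 32 * 4 * 2 = 256, which matches the (256, 10) array in the error message. Under this reading, keeping y as integer class indices and encoding it exactly once, either before the split or inside the training loop but not both, would give labels with 32 rows to match the 32 images.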

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/60375181
