My training data and labels have different NumPy array shapes, and it breaks my training

Stack Overflow user
Asked 2020-02-24 19:40:21
Answers: 1 · Views: 372 · Followers: 0 · Votes: 1

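I am working with an image dataset and trying to convert it into a NumPy array, which I will then use as input to a cGAN. I have tried several pieces of code, but they all leave me with mismatched shapes. I don't know what to do.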

import os

import cv2
import matplotlib.pyplot as plt

training_data = []
IMG_SIZE = 32
datadir = 'drive/My Drive/dummyDS'
CATEGORIES = ['HTC-1-M7', 'IPhone-4s', 'iPhone-6', 'LG-Nexus-5x',
              'Motorola-Droid-Max', 'Motorola-Nexus-6', 'Motorola-X',
              'Samsung-Galaxy-Note3', 'Samsung-Galaxy-S4', 'Sony-Nex-7']

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(datadir, category)
        class_num = CATEGORIES.index(category)  # integer label = folder index
        for img in os.listdir(path):
            img_array = cv2.imread(os.path.join(path, img))  # BGR, 3 channels
            if img_array is None:  # skip files OpenCV cannot decode
                continue
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            training_data.append([new_array, class_num])
            # Preview the resized image (cmap is ignored for 3-channel input)
            plt.imshow(new_array)
            plt.show()

create_training_data()
import pickle
import random

import numpy as np

# Shuffle the samples, then separate features from labels
X = []
y = []
random.shuffle(training_data)

for features, label in training_data:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
with open("X.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)

y = np.array(y)
with open("y.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)
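from tensorflow.keras.utils import to_categorical  # assuming tf.keras

# One-hot encode the integer labels. Run this only once: calling
# to_categorical on an already one-hot array adds another axis.
y = to_categorical(y)

# Save the one-hot labels as a .npy file
np.save('y_labels_one_hot.npy', y)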
from sklearn.model_selection import train_test_split

# Hold out 2/11 of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=2./11)

X_train.shape = (32, 32, 32, 3) and y_train.shape = (32, 4, 2).

Now, during training, I have:

real_labels = to_categorical(Y_train[i*batch_size:(i+1)*batch_size].reshape(-1, 1),
                             num_classes=10)
d_loss_real = discriminator.train_on_batch(x=[X_batch, real_labels],
                                           y=real * (1 - smooth))
The call fails with:

ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(32, 32, 32, 3), (256, 10)]
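
The reported shapes are consistent with the labels having been one-hot encoded more than once, although the post does not confirm this. The sketch below reproduces the shape arithmetic with NumPy only. `one_hot` is a hypothetical stand-in written to mirror how Keras' `to_categorical` handles shapes, and the label values are invented (32 samples covering only 4 classes); only the resulting shapes are meant to match the question.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """NumPy stand-in for Keras' to_categorical (illustration only)."""
    labels = np.asarray(labels, dtype=int)
    shape = labels.shape
    # Like to_categorical, drop a trailing axis of length 1: (n, 1) -> (n,)
    if len(shape) > 1 and shape[-1] == 1:
        shape = shape[:-1]
    flat = labels.ravel()
    if num_classes is None:
        num_classes = int(flat.max()) + 1
    out = np.zeros((flat.size, num_classes), dtype="float32")
    out[np.arange(flat.size), flat] = 1.0
    return out.reshape(shape + (num_classes,))


# Hypothetical integer labels: 32 samples drawn from only 4 classes.
y = np.arange(32) % 4

once = one_hot(y)      # encoded once: one row of length 4 per sample
twice = one_hot(once)  # encoded again: every 0/1 entry becomes a pair
# The training loop flattens the labels and encodes them a third time.
thrice = one_hot(twice.reshape(-1, 1), num_classes=10)

print(once.shape, twice.shape, thrice.shape)
# (32, 4) (32, 4, 2) (256, 10)
```

Under this reading, 32 × 4 × 2 = 256 label rows are paired with 32 images, which is exactly the mismatch in the error message. Keeping the labels as a 1-D integer array and encoding them exactly once would avoid it.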

1 Answer

Stack Overflow user

Answered 2020-05-08 23:03:52

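tensorflow.keras.preprocessing.image.ImageDataGenerator.flow_from_directory should simplify your task.

It does almost everything your code does, including splitting the data, in a simpler way.

The code below shows how to use it, with a comment explaining each line.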

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datadir = 'drive/My Drive/dummyDS'  # Same directory as in the question
img_height, img_width = 32, 32      # Matches IMG_SIZE in the question
batch_size = 32                     # Placeholder value; adjust as needed
nb_epochs = 10                      # Placeholder value; adjust as needed

train_datagen = ImageDataGenerator(
    rescale=1./255,        # Scales every pixel value to [0, 1]
    validation_split=0.2)  # Reserves 20% of the data for validation

train_generator = train_datagen.flow_from_directory(
    datadir,  # Traverses all the subfolders (one per category) in this dir
    target_size=(img_height, img_width),  # Resizes every image
    batch_size=batch_size,     # Yields batches of `batch_size` samples
    class_mode='categorical',  # Yields one-hot encoded labels
    shuffle=True,              # Shuffles the data
    subset='training')         # Uses the 80% training portion

# There is no separate validation directory and the whole dataset is to be
# partitioned, so the same "train_datagen" is reused here.
validation_generator = train_datagen.flow_from_directory(
    datadir,  # Must be the same dir as for training so the split applies
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True,         # Shuffles the data
    subset='validation')  # Uses the 20% validation portion

# Then train the model; `model` is assumed to be an already compiled Keras model
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=nb_epochs)

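Hopefully this resolves your mismatched-shapes problem, since it ensures that the features and labels in every batch contain the same number of samples. If this approach leads to an error, please share more details.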
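
If the manual training loop from the question is kept instead, the same guarantee can be enforced by hand. The following is a minimal sketch, not the asker's actual loop: `iter_batches` is a hypothetical helper, the arrays are zero-filled placeholders, and it assumes the labels are stored as a 1-D integer array that has not been one-hot encoded yet.

```python
import numpy as np

NUM_CLASSES = 10
batch_size = 8  # hypothetical value for illustration

# Placeholder data standing in for the real images and integer labels.
X_train = np.zeros((32, 32, 32, 3), dtype="float32")
y_train = np.arange(32) % NUM_CLASSES  # shape (32,), integer class ids


def iter_batches(X, y, batch_size, num_classes):
    """Yield (images, one-hot labels) pairs with matching sample counts."""
    if len(X) != len(y):
        raise ValueError(f"{len(X)} images but {len(y)} labels")
    eye = np.eye(num_classes, dtype="float32")
    for start in range(0, len(X), batch_size):
        stop = start + batch_size
        # Indexing the identity matrix one-hot encodes each label exactly once.
        yield X[start:stop], eye[y[start:stop]]


batches = list(iter_batches(X_train, y_train, batch_size, NUM_CLASSES))
for X_batch, real_labels in batches:
    assert X_batch.shape[0] == real_labels.shape[0]

print(len(batches), batches[0][0].shape, batches[0][1].shape)
# 4 (8, 32, 32, 3) (8, 10)
```

Each `(X_batch, real_labels)` pair could then be passed to a call such as `discriminator.train_on_batch`, since both arrays share the same first dimension.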

Happy learning!

Votes: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/60375181
