It is best to convert the data to NumPy arrays first.
Method 1: use np.random.shuffle
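import numpy as np
state = np.random.get_state()  # save the current RNG state
np.random.shuffle(train)       # shuffle the data in place
np.random.set_state(state)     # restore the state so the next shuffle repeats the same permutation
np.random.shuffle(label)       # shuffle the labels in the same order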
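The save-and-restore trick above can be checked with a small self-contained sketch. The toy arrays `train` and `label` below are made up for illustration and are not from the original data. It relies on both arrays having the same length along the first axis, so that the two shuffles draw the same random numbers.

```python
import numpy as np

# Toy data (assumed for illustration): row i of `train` is [i, i], and label[i] == i.
train = np.arange(10).repeat(2).reshape(10, 2)
label = np.arange(10)

state = np.random.get_state()   # remember the RNG state
np.random.shuffle(train)        # shuffle rows in place
np.random.set_state(state)      # rewind the RNG to the saved state
np.random.shuffle(label)        # same state and same length, so the same permutation

# Every row should still line up with its label.
aligned = bool((train[:, 0] == label).all())
print(aligned)
```

If the two arrays had different lengths, the permutations would no longer match, so this approach only makes sense for paired data.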
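Alternatively, shuffle a list of row indices and index both arrays with it:
Note that the [rows, :] indexing used below requires 2-D arrays. If an array is 1-D, e.g. ['a','b','c','d'] with shape (4,),
first convert it to [['a'],['b'],['c'],['d']] with shape (4,1), for example with reshape(-1, 1).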
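import random
train_row = list(range(len(train_label)))  # row indices 0..n-1
random.shuffle(train_row)                  # shuffle the indices in place
train_image = train_image[train_row,:]     # reorder the images
train_label = train_label[train_row,:]     # reorder the labels the same way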
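As a minimal sketch of the shape issue mentioned above, the example below uses made-up toy arrays and shows two ways to handle 1-D labels: index the 1-D array directly, or reshape it to a column so that the `[rows, :]` form works. The names `label_1d` and `label_2d` are introduced here for illustration only.

```python
import random
import numpy as np

# Toy data (assumed): 4 samples with 2 features each, plus 1-D string labels.
train_image = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
train_label = np.array(['a', 'b', 'c', 'd'])          # shape (4,)

train_row = list(range(len(train_label)))
random.shuffle(train_row)                              # shuffle the row indices in place

train_image = train_image[train_row, :]                # 2-D: pick rows, keep all columns
# Option 1: index the 1-D array directly, no reshape needed.
label_1d = train_label[train_row]                      # shape (4,)
# Option 2: reshape to a column first, then the [rows, :] form works.
label_2d = train_label.reshape(-1, 1)[train_row, :]    # shape (4, 1)

print(label_1d.shape, label_2d.shape)
```

Applying `[train_row, :]` to the original 1-D `train_label` would raise an IndexError, which is why the reshape is needed in that form.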
Method 2: use np.random.permutation()
shuffle_ix = np.random.permutation(np.arange(len(train_data)))
train_data = train_data[shuffle_ix,:]
train_label = train_label[shuffle_ix,:]
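The same idea can also be written with NumPy's newer Generator API, which makes the shuffle reproducible through an explicit seed. This is a sketch on made-up toy arrays. The seed value 0 is arbitrary, and `rng` is a name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility

train_data = np.arange(12).reshape(6, 2)   # row i is [2*i, 2*i + 1]
train_label = np.arange(6)                 # label i belongs to row i

shuffle_ix = rng.permutation(len(train_data))  # a random ordering of 0..5
train_data = train_data[shuffle_ix]            # reorder the rows
train_label = train_label[shuffle_ix]          # reorder the labels identically

print(shuffle_ix)
```

Indexing with `train_data[shuffle_ix]` works for arrays of any dimensionality, so 1-D labels need no reshape here.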
Method 3: use PyTorch's Dataset and DataLoader, which also let you set the batch size
dataset = torch.utils.data.TensorDataset(data, target) # build the dataset
train_iter = torch.utils.data.DataLoader(dataset, batch_size, shuffle=True) # define how batches are drawn
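A minimal runnable sketch of the DataLoader approach, assuming PyTorch is installed. The tensors and the batch size of 4 are made up for illustration. With `shuffle=True` the loader draws a new random order each epoch while keeping every sample paired with its target.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy tensors (assumed): 8 samples with 3 features; target[i] == i identifies sample i.
data = torch.arange(24, dtype=torch.float32).reshape(8, 3)
target = torch.arange(8)

dataset = TensorDataset(data, target)
train_iter = DataLoader(dataset, batch_size=4, shuffle=True)

seen = []
for x, y in train_iter:
    # The first feature of sample i is 3*i, so x[:, 0] must equal 3*y if pairs stay aligned.
    assert torch.equal(x[:, 0], 3.0 * y.float())
    seen.extend(y.tolist())

print(sorted(seen))
```

Unlike the NumPy methods, this does not reorder the underlying tensors. The shuffling happens only in the order in which batches are served.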
An example:
import numpy as np
tes = np.array([['a'],['b'],['c'],['d']])
shuffle_ix = np.random.permutation(len(tes))
shuffle_ix = list(shuffle_ix)
print(shuffle_ix)
tes = tes[shuffle_ix,:]
print(repr(tes))
Output (one possible result, since the permutation is random):
[1, 3, 0, 2]
array([['b'],
['d'],
['a'],
['c']], dtype='<U1')
Reference:
https://blog.csdn.net/sinat_38682860/article/details/108813209