If you want to avoid copying arrays, then I would suggest that instead of generating a permutation list, you go through every element in the array and randomly swap it with another position in the array:

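<pre><code>import numpy

for old_index in range(len(a)):
    new_index = numpy.random.randint(old_index + 1)
    # Fancy indexing copies the right-hand side, so the swap is also
    # safe when a[i] is a view (multi-dimensional arrays).
    a[[old_index, new_index]] = a[[new_index, old_index]]
    b[[old_index, new_index]] = b[[new_index, old_index]]
</code></pre>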

This implements the Knuth-Fisher-Yates shuffle algorithm.
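
A minimal, self-contained sketch of this in-place approach might look as follows. The function name <code>shuffle_in_place_unison</code> and the optional <code>rng</code> parameter are illustrative assumptions, not part of the answer above. The swap uses fancy indexing so that it also behaves correctly when the rows are themselves arrays.

```python
import numpy as np


def shuffle_in_place_unison(a, b, rng=None):
    """Shuffle a and b in place along axis 0, applying the same swaps to both."""
    assert len(a) == len(b)
    rng = np.random.default_rng() if rng is None else rng
    for old_index in range(len(a)):
        # Pick a position in [0, old_index], both ends inclusive.
        new_index = int(rng.integers(old_index + 1))
        # The right-hand side is a copy, so the two rows are exchanged
        # correctly even for multi-dimensional arrays.
        a[[old_index, new_index]] = a[[new_index, old_index]]
        b[[old_index, new_index]] = b[[new_index, old_index]]


a = np.arange(10).reshape(5, 2)   # row i is [2*i, 2*i + 1]
b = np.arange(5)                  # b[i] == i
shuffle_in_place_unison(a, b, np.random.default_rng(0))  # seed 0 is arbitrary
```

After the call, each row of <code>a</code> still lines up with its original partner in <code>b</code>. The Python-level loop is likely to be much slower than a vectorized permutation for large arrays, so this trades speed for lower memory use.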

You can use NumPy's <a href="https://docs.scipy.org/doc/numpy-1.10.1/user/basics.indexing.html">array indexing</a>:

<pre><code>import numpy

def unison_shuffled_copies(a, b):
    assert len(a) == len(b)
    p = numpy.random.permutation(len(a))
    return a[p], b[p]
</code></pre>

This creates separate, unison-shuffled copies of the arrays.
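
As a quick check of that behaviour, the following sketch applies the function to two small arrays of different shapes and confirms that rows still correspond and that the inputs are left untouched. The sample data and the name <code>a_orig</code> are assumptions chosen for illustration.

```python
import numpy


def unison_shuffled_copies(a, b):
    assert len(a) == len(b)
    p = numpy.random.permutation(len(a))
    return a[p], b[p]


a = numpy.array([[1, 1], [2, 2], [3, 3]])
b = numpy.array([1, 2, 3])
a_orig = a.copy()

shuffled_a, shuffled_b = unison_shuffled_copies(a, b)

# Each shuffled row of a still matches its label in b.
print((shuffled_a[:, 0] == shuffled_b).all())   # True
# The inputs were not modified; the results are new arrays.
print(numpy.array_equal(a, a_orig))             # True
```

Because indexing with an integer array always returns a copy, this approach needs extra memory on the order of the size of both inputs.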

Your "scary" solution does not appear scary to me. Calling <code>shuffle()</code> for two sequences of the same length results in the same number of calls to the random number generator, and these are the only "random" elements in the shuffle algorithm. By resetting the state, you ensure that the calls to the random number generator will give the same results in the second call to <code>shuffle()</code>, so the whole algorithm will generate the same permutation.
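
The argument above can be illustrated with a small sketch. It uses a private <code>numpy.random.RandomState</code> instance instead of the global state, which is an assumption made here to keep the example isolated; the seed and the sample arrays are also arbitrary. It relies on the same behaviour the question describes, namely that shuffling two sequences of equal length consumes the random stream in the same way.

```python
import numpy as np

rng = np.random.RandomState(42)   # the seed 42 is arbitrary

a = np.arange(12).reshape(6, 2)   # row i starts with 2*i
b = np.arange(6)                  # b[i] == i

state = rng.get_state()   # remember the generator state
rng.shuffle(a)            # shuffle a in place along its first axis
rng.set_state(state)      # rewind to the remembered state
rng.shuffle(b)            # replay the same random draws for b
```

After both calls, <code>a[:, 0]</code> equals <code>2 * b</code>, so the rows still correspond.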

If you don't like this, a different solution would be to store your data in one array instead of two right from the beginning, and create two views into this single array simulating the two arrays you have now. You can use the single array for shuffling and the views for all other purposes.

Example: Let's assume the arrays <code>a</code> and <code>b</code> look like this:

<pre><code>a = numpy.array([[[  0.,   1.,   2.],
                  [  3.,   4.,   5.]],

                 [[  6.,   7.,   8.],
                  [  9.,  10.,  11.]],

                 [[ 12.,  13.,  14.],
                  [ 15.,  16.,  17.]]])

b = numpy.array([[ 0.,  1.],
                 [ 2.,  3.],
                 [ 4.,  5.]])
</code></pre>

We can now construct a single array containing all the data:

<pre><code>c = numpy.c_[a.reshape(len(a), -1), b.reshape(len(b), -1)]
# array([[  0.,   1.,   2.,   3.,   4.,   5.,   0.,   1.],
#        [  6.,   7.,   8.,   9.,  10.,  11.,   2.,   3.],
#        [ 12.,  13.,  14.,  15.,  16.,  17.,   4.,   5.]])
</code></pre>

Now we create views simulating the original <code>a</code> and <code>b</code>:

<pre><code>a2 = c[:, :a.size//len(a)].reshape(a.shape)
b2 = c[:, a.size//len(a):].reshape(b.shape)
</code></pre>

The data of <code>a2</code> and <code>b2</code> is shared with <code>c</code>. To shuffle both arrays simultaneously, use <code>numpy.random.shuffle(c)</code>.

In production code, you would of course try to avoid creating the original <code>a</code> and <code>b</code> at all and right away create <code>c</code>, <code>a2</code> and <code>b2</code>.

This solution could be adapted to the case that <code>a</code> and <code>b</code> have different dtypes.
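
Putting the steps of this answer together, a runnable sketch could look like this. The variable <code>split</code> and the use of <code>numpy.shares_memory</code> to confirm that the views alias <code>c</code> are additions made here for illustration; the sample data is generated with <code>arange</code> rather than typed out.

```python
import numpy as np

a = np.arange(18, dtype=float).reshape(3, 2, 3)   # a[i, 0, 0] == 6*i
b = np.arange(6, dtype=float).reshape(3, 2)       # b[i, 0] == 2*i

split = a.size // len(a)   # number of columns of c that belong to a
c = np.c_[a.reshape(len(a), -1), b.reshape(len(b), -1)]

# Views into c that look like the original a and b.
a2 = c[:, :split].reshape(a.shape)
b2 = c[:, split:].reshape(b.shape)
print(np.shares_memory(a2, c), np.shares_memory(b2, c))

# Shuffling the rows of c shuffles a2 and b2 in unison.
np.random.shuffle(c)
```

Since <code>a2</code> and <code>b2</code> are views, no data is copied by the shuffle itself; the only copy is the one made when <code>c</code> is built from existing arrays.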

<pre><code>import numpy as np
from sklearn.utils import shuffle

X = np.array([[1., 0.], [2., 1.], [0., 0.]])
y = np.array([0, 1, 2])
X, y = shuffle(X, y, random_state=0)
</code></pre>

To learn more, see <a href="http://scikit-learn.org/stable/modules/generated/sklearn.utils.shuffle.html" rel="noreferrer">http://scikit-learn.org/stable/modules/generated/sklearn.utils.shuffle.html</a>

As an example, this is what I'm doing:

<pre><code>from random import shuffle  # assuming shuffle here is random.shuffle

import numpy as np

combo = []
for i in range(60000):
    combo.append((images[i], labels[i]))

shuffle(combo)

im = []
lab = []
for c in combo:
    im.append(c[0])
    lab.append(c[1])
images = np.asarray(im)
labels = np.asarray(lab)
</code></pre>

Very simple solution:

<pre><code>randomize = np.arange(len(x))
np.random.shuffle(randomize)
x = x[randomize]
y = y[randomize]
</code></pre>

The two arrays <code>x</code> and <code>y</code> are now both randomly shuffled in the same way.
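
The same idea can be written with NumPy's newer <code>Generator</code> interface, which makes it easy to fix a seed for reproducible runs. This is a sketch under the assumption that a recent NumPy with <code>numpy.random.default_rng</code> is available; the seed and the sample arrays are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(12345)   # the seed is arbitrary

x = np.arange(10).reshape(5, 2)      # row i starts with 2*i
y = np.arange(5)                     # y[i] == i

randomize = rng.permutation(len(x))  # a shuffled copy of 0 .. len(x) - 1
x = x[randomize]                     # indexing returns new, shuffled arrays
y = y[randomize]
```

Note that <code>x[randomize]</code> builds a new array rather than shuffling in place, so the memory concern raised in the question still applies.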

I extended Python's <code>random.shuffle()</code> to take a second argument:

<pre><code>import random

def shuffle_together(x, y):
    assert len(x) == len(y)

    for i in reversed(range(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = int(random.random() * (i+1))
        x[i], x[j] = x[j], x[i]
        y[i], y[j] = y[j], y[i]
</code></pre>

That way I can be sure that the shuffling happens in-place, and the function is not all too long or complicated.
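
A usage sketch on plain Python lists is shown below. The optional <code>rng</code> parameter and the use of <code>randrange</code> are variations introduced here for illustration, as are the sample lists; they are not part of the function above.

```python
import random


def shuffle_together(x, y, rng=random):
    """Shuffle the sequences x and y in place using the same swaps."""
    assert len(x) == len(y)
    for i in reversed(range(1, len(x))):
        j = rng.randrange(i + 1)   # uniform integer in [0, i]
        x[i], x[j] = x[j], x[i]
        y[i], y[j] = y[j], y[i]


names = ["a", "b", "c", "d"]
codes = [ord(n) for n in names]
shuffle_together(names, codes, random.Random(0))   # seed 0 is arbitrary
```

The tuple swap is fine for lists and one-dimensional arrays. For multi-dimensional NumPy arrays, <code>x[i]</code> is a view, so a swap written this way would overwrite one row before it is read; a fancy-indexed swap avoids that.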

One way in which in-place shuffling can be done for connected lists is using a seed (it could be random) and using numpy.random.shuffle to do the shuffling.

<pre><code># Set seed to a random number if you want the shuffling to be non-deterministic.
def shuffle(a, b, seed):
    np.random.seed(seed)
    np.random.shuffle(a)
    np.random.seed(seed)
    np.random.shuffle(b)
</code></pre>

That's it. This will shuffle both a and b in the exact same way. This is also done in-place which is always a plus.

<h3>EDIT: don't use np.random.seed(); use np.random.RandomState instead</h3>

<pre><code>def shuffle(a, b, seed):
    rand_state = np.random.RandomState(seed)
    rand_state.shuffle(a)
    rand_state.seed(seed)
    rand_state.shuffle(b)
</code></pre>

When calling it, just pass in any seed to initialize the random state:

<pre><code>a = [1,2,3,4]
b = [11, 22, 33, 44]
shuffle(a, b, 12345)
</code></pre>

Output:

<pre><code>&gt;&gt;&gt; a
[1, 4, 2, 3]
&gt;&gt;&gt; b
[11, 44, 22, 33]
</code></pre>

Edit: Fixed code to re-seed the random state

You can create an index array like this:

<pre><code>s = np.arange(0, len(a), 1)
</code></pre>

Then shuffle it:

<pre><code>np.random.shuffle(s)
</code></pre>

Now use <code>s</code> as the index for both arrays. Indexing with the same shuffled indices returns both arrays shuffled in the same order.

<pre><code>x_data = x_data[s]
x_label = x_label[s]
</code></pre>

In 2015, James posted a helpful sklearn <a href="https://stackoverflow.com/a/30633632/3441514">solution</a>. However, he passed a random state argument, which is not needed. In the code below, NumPy's random state is used automatically.

<pre><code>import numpy as np
from sklearn.utils import shuffle

X = np.array([[1., 0.], [2., 1.], [0., 0.]])
y = np.array([0, 1, 2])
X, y = shuffle(X, y)
</code></pre>

Say we have two arrays: a and b. 

<pre><code>a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([[9,1,1],[6,6,6],[4,2,0]]) 
</code></pre>

We can first obtain row indices by permuting the first dimension:

<pre><code>indices = np.random.permutation(a.shape[0])
# e.g. [1 2 0]
</code></pre>

Then use advanced indexing.
Here we are using the same indices to shuffle both arrays in unison. 

<pre><code>a_shuffled = a[indices[:,np.newaxis], np.arange(a.shape[1])]
b_shuffled = b[indices[:,np.newaxis], np.arange(b.shape[1])]
</code></pre>

This is equivalent to:

<pre><code>np.take(a, indices, axis=0)
[[4 5 6]
 [7 8 9]
 [1 2 3]]

np.take(b, indices, axis=0)
[[6 6 6]
 [4 2 0]
 [9 1 1]]
</code></pre>
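
The claimed equivalence can be checked directly. The sketch below fixes <code>indices</code> to the example value shown above so the result is reproducible, and also compares against plain first-axis indexing <code>a[indices]</code>; the variable names are chosen here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
indices = np.array([1, 2, 0])   # fixed here so the result is reproducible

via_advanced = a[indices[:, np.newaxis], np.arange(a.shape[1])]
via_take = np.take(a, indices, axis=0)
via_rows = a[indices]           # indexing only the first axis

print(np.array_equal(via_advanced, via_take))   # True
print(np.array_equal(via_take, via_rows))       # True
```

All three forms return a new array with the rows reordered, so for shuffling along the first axis the shorter <code>a[indices]</code> is usually sufficient.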

<pre><code>from numpy.random import permutation
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data #numpy array
y = iris.target #numpy array

# Data is currently unshuffled; we should shuffle 
# each X[i] with its corresponding y[i]
perm = permutation(len(X))
X = X[perm]
y = y[perm]
</code></pre>

Just use <code>numpy</code>...

First merge the two input arrays: the 1D array holds the labels (y) and the 2D array holds the data (x). Shuffle the merged array with NumPy's <code>shuffle</code> method, then split it and return the two parts.

<pre class="lang-py prettyprint-override"><code>import numpy as np

def shuffle_2d(a, b):
    rows = a.shape[0]
    if b.shape != (rows, 1):
        b = b.reshape((rows, 1))
    # hstack converts both arrays to a common dtype
    S = np.hstack((b, a))
    np.random.shuffle(S)
    b, a = S[:, 0], S[:, 1:]
    return a, b

features, samples = 2, 5
x, y = np.random.random((samples, features)), np.arange(samples)
x, y = shuffle_2d(x, y)
</code></pre>
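
One caveat with merging is that <code>np.hstack</code> needs a single dtype for the combined array, so integer labels are converted to floats. The sketch below shows this and casts the labels back after the shuffle; the names <code>x_shuffled</code> and <code>y_shuffled</code> are introduced here for illustration.

```python
import numpy as np

x = np.random.random((5, 2))   # float64 data
y = np.arange(5)               # integer labels

# Stack the labels as the first column; the result has one common dtype.
S = np.hstack((y.reshape(-1, 1), x))
print(S.dtype)                 # float64

np.random.shuffle(S)           # shuffle the rows in place
y_shuffled = S[:, 0].astype(y.dtype)   # cast the labels back to integers
x_shuffled = S[:, 1:]
```

This works well for small integer labels, but the merge copies both arrays and is not suitable when the two dtypes cannot be converted to each other without loss.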

This seems like a very simple solution:

<pre><code>import numpy as np

def shuffle_in_unison(a, b):
    assert len(a) == len(b)
    c = np.arange(len(a))
    np.random.shuffle(c)

    return a[c], b[c]

a = np.asarray([[1, 1], [2, 2], [3, 3]])
b = np.asarray([11, 22, 33])

shuffle_in_unison(a, b)
Out[94]: 
(array([[3, 3],
        [2, 2],
        [1, 1]]),
 array([33, 22, 11]))
</code></pre>

I have two numpy arrays of different shapes, but with the same length (leading dimension). I want to shuffle each of them, such that corresponding elements continue to correspond -- i.e. shuffle them in unison with respect to their leading indices.

This code works, and illustrates my goals:

<pre><code>def shuffle_in_unison(a, b):
    assert len(a) == len(b)
    shuffled_a = numpy.empty(a.shape, dtype=a.dtype)
    shuffled_b = numpy.empty(b.shape, dtype=b.dtype)
    permutation = numpy.random.permutation(len(a))
    for old_index, new_index in enumerate(permutation):
        shuffled_a[new_index] = a[old_index]
        shuffled_b[new_index] = b[old_index]
    return shuffled_a, shuffled_b
</code></pre>

For example:

<pre><code>&gt;&gt;&gt; a = numpy.asarray([[1, 1], [2, 2], [3, 3]])
&gt;&gt;&gt; b = numpy.asarray([1, 2, 3])
&gt;&gt;&gt; shuffle_in_unison(a, b)
(array([[2, 2],
       [1, 1],
       [3, 3]]), array([2, 1, 3]))
</code></pre>

However, this feels clunky, inefficient, and slow, and it requires making a copy of the arrays -- I'd rather shuffle them in-place, since they'll be quite large.

Is there a better way to go about this? Faster execution and lower memory usage are my primary goals, but elegant code would be nice, too.

One other thought I had was this:

<pre><code>def shuffle_in_unison_scary(a, b):
    rng_state = numpy.random.get_state()
    numpy.random.shuffle(a)
    numpy.random.set_state(rng_state)
    numpy.random.shuffle(b)
</code></pre>

This works... but it's a little scary, as I see little guarantee it'll continue to work -- it doesn't look like the sort of thing that's guaranteed to survive across numpy versions, for example.

Better way to shuffle two numpy arrays in unison
