问给定一个大小为n+1的整数数组，该数组由元素[1，n]组成。除k次重复的元素外，所有元素都是唯一的。
EN

Stack Overflow用户

提问于 2022-06-03 12:20:23

回答 4查看 915关注 0票数 1

我一直试图解决以下问题：

您将得到一个n+1整数数组，其中所有元素都位于[1,n]中。您还会得到其中一个元素被重复了一定次数，而其他元素则是不同的。开发一种算法，以同时找到重复的数字和重复的次数。

下面是我的解决方案，让k=重复次数：

struct LatticePoint{ // to hold duplicate and k
   int a;
   int b;
   LatticePoint(int a_, int b_) : a(a_), b(b_) {}
   }
   
LatticePoint findDuplicateAndK(const std::vector<int>& A){

int n = A.size() - 1;
std::vector<int> Numbers (n);

for(int i = 0; i < n + 1; ++i){
   ++Numbers[A[i] - 1]; // A[i] in range [1,n] so no out-of-access
   }
   
int i = 0;
while(i < n){
   if(Numbers[i] > 1) {
      int duplicate = i + 1;
      int k = Numbers[i] - 1;
      LatticePoint result{duplicate, k};
      return LatticePoint;
}

因此，基本思想是:我们沿着数组，每次看到数字A[i]，就会增加Numbers[A[i]]的值。由于只有重复出现过一次以上，所以数值大于1的数字输入的索引必须是具有重复次数的重复数- 1. O(n)在时间复杂度上和空间上的O(n)算法。

我想知道是否有人有更好的时间和/或空间的解决方案？(或者实际上如果我的解决方案中有任何错误.)

duplicates

arrays

algorithm

回答 4

Stack Overflow用户

发布于 2022-06-03 17:13:07

如果您有或愿意编写一个运行时指定大小的n (请参阅boost::dynamic_bitset)，则可以将划痕空间缩减为n int，而不是nint。

您不需要收集重复计数，直到您知道哪个元素是重复的，然后您只需要保持该计数。因此，您需要跟踪的是您以前是否见过该值(因此，n位)。找到复制的值后，将count设置为2，并在向量的其余部分中运行，每次单击该值的实例时递增count。(您将count初始化为2，因为当您到达那里时，您将看到其中的两个。)

这仍然是O(n)空间，但常数因子要小得多。

票数 2

Stack Overflow用户

发布于 2022-06-03 13:03:40

你的代码的想法是可行的。

但是，多亏了n+1元素，我们可以实现时间和空间的其他权衡。

如果我们有一定数量的桶，我们将数字除以，将n+1数放入其中，这意味着一些桶必须得到比预期更多的数据。这是众所周知的针孔原理的一个变体.

所以我们使用两个桶，一个用于范围1..floor(n/2)，一个用于floor(n/2)+1..n。经过一遍数组后，我们知道答案在哪一半。然后我们把这一半分成两半，再过一遍，以此类推。这将导致二进制搜索，该搜索将获得O(1)数据的答案，with ceil(log_2(n))通过，每个搜索都需要时间O(n)。因此，我们在时间O(n log(n))中得到了答案。

现在我们不需要用两个水桶了。如果我们使用3，我们将采取ceil(log_3(n))通行证。因此，当我们增加固定数量的桶时，我们会占用更多的空间并节省时间。还有其他的取舍吗？

好吧，您展示了如何使用n桶在1次传递中做到这一点。你需要多少桶才能在两次传球中完成？结果，答案至少是sqrt(n)的口头禅。用立方体根可以进行3次传递。诸若此类。

所以你会得到一系列的权衡，你拥有的桶越多，你需要的空间就越多，但传球越少。而你的解决方案只是在极端的尽头，占用了最多的空间和最少的时间。

票数 1

Stack Overflow用户

发布于 2022-06-03 22:17:41

这是一个更厚颜无耻的算法，它只需要恒定的空间，但需要重新排列输入向量。(它只是重新排序；所有原始元素在末尾仍然存在。)

现在仍然是O(n)时间，尽管这可能不是很明显。其想法是尝试重新排列数组，使Ai是我，直到我们找到复制。当我们尝试将一个元素放在正确的索引上时，这个重复就会出现，结果是这个索引已经保存了那个元素。这样，我们就找到了复制；我们有一个要移到Aj的值，但是相同的值已经在Aj了。然后，我们扫描数组的其余部分，每次找到另一个实例时都会增加计数。

#include <utility>
#include <vector>
std::pair<int, int> count_dup(std::vector<int> A) {
  /* Try to put each element in its "home" position (that is,
   * where the value is the same as the index). Since the
   * values start at 1, A[0] isn't home to anyone, so we start
   * the loop at 1.
   */
  int n = A.size();
  for (int i = 1; i < n; ++i) {
    while (A[i] != i) {
      int j = A[i];
      if (A[j] == j) {
        /* j is the duplicate. Now we need to count them.
         * We have one at i. There's one at j, too, but we only
         * need to add it if we're not going to run into it in
         * the scan. And there might be one at position 0. After that,
         * we just scan through the rest of the array.
         */
        int count = 1;
        if (A[0] == j) ++count;
        if (j < i) ++count;
        for (++i; i < n; ++i) {
          if (A[i] == j) ++count;
        }
        return std::make_pair(j, count);
      }
      /* This swap can only happen once per element. */
      std::swap(A[i], A[j]);
    }
  }
  /* If we get here, every element from 1 to n is at home.
   * So the duplicate must be A[0], and the duplicate count
   * must be 2.
   */
  return std::make_pair(A[0], 2);
}