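I think you are going in the right direction. Concurrent hash tables are efficient for large numbers of elements (thousands). Still, you can try reserving enough capacity before running your algorithm and playing with the load factor of concurrent_unordered_set (setting it to 1). You can also try concurrent_hash_map: using insert(value) without an accessor is faster, but it also needs some capacity reserved up front.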

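tbb::combinable and tbb::enumerable_thread_specific use the same backend implementation. The difference is only in the interface. The documentation has an example for the latter, which I have reworked a bit: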

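#include <cstdio>
#include <utility>
#include <tbb/blocked_range.h>
#include <tbb/enumerable_thread_specific.h>
#include <tbb/parallel_for.h>

// Each thread gets its own (call count, iteration count) pair, starting at (0, 0).
typedef tbb::enumerable_thread_specific< std::pair<int,int> > CounterType;
CounterType MyCounters (std::make_pair(0,0));

int main() {
    tbb::parallel_for( tbb::blocked_range<int>(0, 100000000),
      [](const tbb::blocked_range<int> &r) {
        // local() returns the calling thread's copy, creating it on first use.
        CounterType::reference my_counter = MyCounters.local();
        ++my_counter.first;
        my_counter.second += static_cast<int>(r.size());
    });

    // Fold all per-thread pairs into a single total.
    std::pair<int,int> sum = MyCounters.combine(
        [](std::pair<int,int> x, std::pair<int,int> y) {
            return std::make_pair(x.first+y.first, x.second+y.second);
        });
    std::printf("Total calls to operator() = %d, "
         "total iterations = %d\n", sum.first, sum.second);
}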

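And finally, try a different approach that needs no additional means such as combinable: use tbb::parallel_reduce. There the reduction is done mostly in parallel (only on the order of log P sequential steps, whereas combining thread-specific values requires visiting all P elements sequentially).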

Original content from Stack Overflow: https://stackoverflow.com/questions/30275431
