Q: Parallel for loop in OpenMP

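Stack Overflow user

Asked 2012-08-02 15:46:17

Answers: 2, Views: 70.9K, Followers: 0, Votes: 30

I am trying to parallelize a very simple for loop, but this is my first attempt at using OpenMP in a long time, and the run times baffle me. Here is my code: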

#include <cmath>
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main () 
{
    int n=400000,  m=1000;  
    double x=0,y=0;
    double s=0;
    vector< double > shifts(n,0);


    #pragma omp parallel for 
    for (int j=0; j<n; j++) {

        double r=0.0;
        for (int i=0; i < m; i++){

            double rand_g1 = cos(i/double(m));
            double rand_g2 = sin(i/double(m));     

            x += rand_g1;
            y += rand_g2;
            r += sqrt(rand_g1*rand_g1 + rand_g2*rand_g2);
        }
        shifts[j] = r / m;
    }

    cout << *std::max_element( shifts.begin(), shifts.end() ) << endl;
}

I compile it with the following command:

g++ -O3 testMP.cc -o testMP  -I /opt/boost_1_48_0/include

That is, without "-fopenmp". The timings I get are:

real    0m18.417s
user    0m18.357s
sys     0m0.004s

When I use "-fopenmp":

g++ -O3 -fopenmp testMP.cc -o testMP  -I /opt/boost_1_48_0/include

I get these numbers for the times:

real    0m6.853s
user    0m52.007s
sys     0m0.008s

This doesn't make sense to me. Why does using eight cores give only a 3-fold increase in performance? Am I coding the loop correctly?

c++

multithreading

performance

parallel-processing

openmp

2 Answers

Stack Overflow user

Accepted answer

Answered 2012-08-02 16:30:26

You should use the OpenMP reduction clause for x and y:

#pragma omp parallel for reduction(+:x,y)
for (int j=0; j<n; j++) {

    double r=0.0;
    for (int i=0; i < m; i++){

        double rand_g1 = cos(i/double(m));
        double rand_g2 = sin(i/double(m));     

        x += rand_g1;
        y += rand_g2;
        r += sqrt(rand_g1*rand_g1 + rand_g2*rand_g2);
    }
    shifts[j] = r / m;
}

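With reduction, each thread accumulates its own partial sum in x and y, and at the end all partial values are added together to obtain the final values. Without it, x and y are shared, so every thread updates the same two variables concurrently; that is a data race, and the contended writes keep the threads from scaling.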
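To make the description above concrete, here is a minimal sketch of what the reduction clause does conceptually, written out by hand. The names `Sums`, `accumulate_manual`, `x_local`, and `y_local` are hypothetical and not from the original post, and an OpenMP implementation is free to combine the partial results in a different way; this only illustrates the idea of private partial sums merged once per thread.

```cpp
#include <cmath>

// Hypothetical result type (not from the original post).
struct Sums {
    double x;
    double y;
};

// Hypothetical helper: accumulates cos(i/m) into x and sin(i/m) into y over
// n*m iterations, using hand-written per-thread partial sums instead of
// the reduction(+:x,y) clause.
Sums accumulate_manual(int n, int m) {
    double x = 0.0, y = 0.0;

    #pragma omp parallel
    {
        // Private accumulators: each thread has its own copy.
        double x_local = 0.0, y_local = 0.0;

        // Iterations of the outer loop are divided among the threads.
        #pragma omp for
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                x_local += std::cos(i / double(m));
                y_local += std::sin(i / double(m));
            }
        }

        // Merge once per thread; the critical section makes the update of
        // the shared variables safe.
        #pragma omp critical
        {
            x += x_local;
            y += y_local;
        }
    }

    return {x, y};
}
```

Built with -fopenmp, `accumulate_manual(n, m)` should give the same sums as the reduction(+:x,y) loop up to floating-point rounding, since the additions happen in a different order. Without -fopenmp the pragmas are ignored and the function runs serially.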

Serial version:
25.05s user 0.01s system 99% cpu 25.059 total
OpenMP version w/ OMP_NUM_THREADS=16:
24.76s user 0.02s system 1590% cpu 1.559 total

See: superlinear speedup :)
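The arithmetic behind that remark can be checked by hand. Taking speedup as serial wall-clock time divided by parallel wall-clock time (the "total" column above, the "real" row in the question), and assuming the 16 threads each had their own hardware thread, the numbers work out roughly as follows. The symbols S and E (speedup and parallel efficiency) are notation introduced here, not part of the original answer.

```latex
\begin{align*}
S_{16} &= \frac{T_{\text{serial}}}{T_{\text{parallel}}}
        = \frac{25.059\,\text{s}}{1.559\,\text{s}} \approx 16.07,
&
E_{16} &= \frac{S_{16}}{16} \approx 1.005 \\
S_{\text{question}} &= \frac{18.417\,\text{s}}{6.853\,\text{s}} \approx 2.69
\end{align*}
```

An efficiency slightly above 1 is what "superlinear" refers to here; the margin is small enough that it may simply be measurement noise.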

Votes: 37

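Stack Overflow user

Answered 2012-08-02 15:51:40

The most you can achieve (!) is a linear speedup. I don't remember which is which among the times reported by Linux, but I'd suggest using time.h or (in C++11) "chrono" and measuring the run time directly from within the program. Ideally, wrap the entire code in a loop, run it 10 times, and average the results to get an approximate run time for the program.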
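A minimal sketch of that measurement approach, assuming C++11 and `std::chrono::steady_clock`. The functions `kernel` and `average_seconds` are hypothetical names, and `kernel` is only a stand-in for the loop from the question (without the x and y accumulation), so the sizes passed to it are up to the caller.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the loop under test. Requires n >= 1.
double kernel(int n, int m) {
    std::vector<double> shifts(n, 0.0);

    #pragma omp parallel for
    for (int j = 0; j < n; j++) {
        double r = 0.0;
        for (int i = 0; i < m; i++) {
            double c = std::cos(i / double(m));
            double s = std::sin(i / double(m));
            r += std::sqrt(c * c + s * s);
        }
        shifts[j] = r / m;
    }

    // Return a value that depends on the computed data.
    return shifts[n - 1];
}

// Runs kernel() `repeats` times and returns the mean wall-clock time
// per run, in seconds.
double average_seconds(int repeats, int n, int m) {
    using clock = std::chrono::steady_clock;
    volatile double sink = 0.0;  // keeps the result of each run observable
    double total = 0.0;

    for (int k = 0; k < repeats; k++) {
        auto start = clock::now();
        sink = kernel(n, m);
        auto stop = clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }

    (void)sink;
    return total / repeats;
}
```

Calling `average_seconds(10, n, m)` gives the 10-run average suggested above. The value is wall-clock time, which corresponds to the "real" row printed by `time`; the "user" row is CPU time summed over all threads, which is why it can exceed "real" in the OpenMP run.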

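Furthermore, in my opinion you have a problem with x and y: they do not follow the paradigm of data locality in parallel programming.

Votes: -3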

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/11773115
