[CSC 435] OpenMP
Andrew J. Pounds
pounds_aj at mercer.edu
Fri Mar 27 16:44:35 EDT 2020
Thanks to Will for asking this question, which required me to do a "deep
dive" into some of the updates and changes to OpenMP. In some of the old
code I gave you for matrix multiplication I used a very simplistic
parallelization scheme that OpenMP now has trouble parallelizing.
If you take the time to look into reductions and the "omp for" pragma,
you can get some really nice speedups. For example...
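#include <omp.h>

void mmm_( int *threads, int *len, double *a, double *b, double *c ){
    int i, j, k;
    int veclen = *len;
    // Initialized so the reduction does not combine with an undefined value
    double s = 0.0;

    // Set the number of threads to use here
    omp_set_num_threads(*threads);

    // reduction(+:s) gives every thread its own copy of s to accumulate into
    #pragma omp parallel shared(a,b,c,veclen) private(i,j,k) reduction(+:s)
    {
        // Divide the rows of c (the i loop) evenly among the threads
        #pragma omp for schedule(static)
        for (i=0; i<veclen; i++) {
            for (j=0; j<veclen; j++) {
                *(c+(i*veclen+j)) = 0.0;
                s = 0.0;
                // Dot product of row i of a with column j of b
                for (k=0;k<veclen;k++){
                    s += *(a+(i*veclen+k)) * *(b+(k*veclen+j));
                }
                *(c+(i*veclen+j)) = s;
            }
        }
    }
}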
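
If you want to see what a reduction does in isolation, here is a minimal
sketch, separate from the matrix code. The function name dot is a
hypothetical helper made up for illustration; it is not part of the code
I gave you. Each thread accumulates its own partial sum, and OpenMP adds
the partial sums into sum when the loop finishes.

```c
#include <assert.h>

/* Hypothetical helper for illustration only: dot product of two vectors
   of length n, with the loop iterations shared among the threads. */
double dot(const double *x, const double *y, int n)
{
    int i;
    double sum = 0.0;

    assert(n >= 0);   /* a negative length makes no sense */

    /* Each thread gets a private copy of sum starting at 0.0; the copies
       are added into the shared sum at the end of the loop. */
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With gcc this needs the -fopenmp flag to run in parallel; without it the
pragma is ignored and the loop simply runs serially, which gives the same
answer up to floating-point rounding.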
--
Andrew J. Pounds, Ph.D. (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Director of the Computational Science Program
Mercer University, Macon, GA 31207 (478) 301-5627