[CSC 435] Working on DLS Parallelization
Andrew J. Pounds
pounds_aj at mercer.edu
Sat Apr 2 12:49:05 EDT 2016
I know that some of you are starting to hammer away at your DLS and ILS
parallelization this weekend. Since I sent you some pointers on the
iterative linear solver, let me do the same for the direct linear solver.
1. DO NOT just put a #pragma omp parallel for at the top of the main
loop and expect it to have any chance of working.
2. DO use your skill at determining where the hotspots are in the code
(like double nested loops) and focus on making the outermost loop of
those hotspots parallel. This should thread the code in such a way that
the innermost loops can still make good use of the vectorization
instructions that are present on each core of the processor. There are
two distinct regions in the DLS code where this is beneficial.
3. Pay very close attention to what you hold in shared space and what
you hold in private space -- otherwise you will get ludicrous results or
lots and lots of NaNs.
4. Don't forget the call to omp_set_num_threads in your code to set how
many threads to use. Otherwise you can't create the speedup curves for
your Amdahl's law plots.
5. I recommend that you get the OpenMP parallelization done first and
then use that to help guide how you write your Pthreads code.
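
Points 1 through 4 can be sketched roughly as follows. This is only a
minimal illustration, assuming the DLS is a Gaussian elimination with
partial pivoting on a row-major matrix; your own code may be organized
differently. The name dls_solve and its nthreads argument are made up
for this example, and the sketch threads only the elimination step. It
does not try to identify both of the regions mentioned in point 2.

```c
#include <math.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the course code): solve A x = b by
 * Gaussian elimination with partial pivoting. A is n x n, row-major,
 * and is overwritten; b is overwritten with the solution x.
 * Returns 0 on success, -1 if a zero pivot is encountered. */
int dls_solve(int n, double *A, double *b, int nthreads)
{
#ifdef _OPENMP
    /* Point 4: choose the thread count before any parallel region. */
    if (nthreads > 0) omp_set_num_threads(nthreads);
#else
    (void)nthreads;   /* built without OpenMP: runs serially */
#endif

    /* Point 1: the main loop stays serial, since step k needs the
     * rows produced by step k-1. */
    for (int k = 0; k < n; k++) {
        /* Serial pivot search and row swap. */
        int p = k;
        for (int i = k + 1; i < n; i++)
            if (fabs(A[i * n + k]) > fabs(A[p * n + k])) p = i;
        if (A[p * n + k] == 0.0) return -1;
        if (p != k) {
            for (int j = 0; j < n; j++) {
                double t = A[k * n + j];
                A[k * n + j] = A[p * n + j];
                A[p * n + j] = t;
            }
            double t = b[k]; b[k] = b[p]; b[p] = t;
        }

        /* Point 2: the doubly nested hotspot. Rows below the pivot are
         * independent, so the outer loop over i is made parallel and
         * the inner loop over j is left for the vectorizer.
         * Point 3: default(none) forces every shared variable to be
         * listed; factor and j are declared inside, so they are private. */
        #pragma omp parallel for default(none) shared(A, b, n, k) schedule(static)
        for (int i = k + 1; i < n; i++) {
            double factor = A[i * n + k] / A[k * n + k];
            for (int j = k; j < n; j++)
                A[i * n + j] -= factor * A[k * n + j];
            b[i] -= factor * b[k];
        }
    }

    /* Back substitution, left serial in this sketch. */
    for (int i = n - 1; i >= 0; i--) {
        double sum = b[i];
        for (int j = i + 1; j < n; j++)
            sum -= A[i * n + j] * b[j];
        b[i] = sum / A[i * n + i];
    }
    return 0;
}
```

Compiled with something like gcc -O2 -fopenmp, calling dls_solve with
nthreads set to 1, 2, 4, and so on and timing each run gives the data
points for a speedup curve. Declaring factor outside the parallel loop
and leaving it shared is exactly the kind of mistake that produces the
garbage results described in point 3.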
--
Andrew J. Pounds, Ph.D. (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Mercer University, Macon, GA 31207 (478) 301-5627
http://faculty.mercer.edu/pounds_aj