[CSC 435] Working on DLS Parallelization

Sat Apr 2 12:49:05 EDT 2016

I know that some of you are starting to hammer away at your DLS and ILS 
parallelization this weekend.  Since I sent you some pointers on the 
iterative linear solver, let me do the same for the direct linear solver.

1.  DO NOT just drop a single #pragma omp directive on top of the main 
loop and expect it to work.

2.  DO use your skill at determining where the hotspots are in the code 
(like doubly nested loops) and focus on making the outermost loop of 
those hotspots parallel.  That way the work is split across threads 
while the innermost loops can still make good use of the vector 
instructions that are present on each core of the processor.  There are 
two distinct regions in the DLS code where this is beneficial.

3.  Pay very close attention to which variables you make shared and 
which you make private -- otherwise you will get ludicrous results or 
lots and lots of NaNs.

4. Don't forget the call to omp_set_num_threads in your code to set 
how many threads to use.  Otherwise you can't create the Amdahl's law 
speedup curves.

5. I recommend that you get the OpenMP parallelization done first and 
then use that to help guide how you write your Pthreads code.
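
As a rough illustration of points 2 through 4, here is a minimal sketch 
in C.  It assumes the direct solver is plain Gaussian elimination without 
pivoting followed by back substitution, which may not match the course 
DLS code.  The function name dls_solve and its arguments are 
hypothetical, made up for this example.  The pivot loop stays sequential 
because each step depends on the previous one; only the row loop of the 
doubly nested hotspot is made parallel.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not from the course code): solve A x = b in place
 * by Gaussian elimination without pivoting.  a is n x n in row-major
 * order; on return b holds the solution x.  Assumes every pivot is
 * nonzero, e.g. a diagonally dominant matrix. */
void dls_solve(double *a, double *b, int n, int nthreads)
{
    assert(n > 0 && nthreads > 0);

#ifdef _OPENMP
    /* Set the thread count so runs at 1, 2, 4, ... threads can be timed. */
    omp_set_num_threads(nthreads);
#else
    (void)nthreads;  /* compiled without OpenMP: runs serially */
#endif

    /* Forward elimination.  The loop over pivots k is sequential. */
    for (int k = 0; k < n - 1; k++) {
        /* Rows below the pivot are independent of one another, so the
         * row loop is the one to thread.  a, b, n and k are shared;
         * i, j and factor are private to each thread. */
        #pragma omp parallel for default(none) shared(a, b, n, k) schedule(static)
        for (int i = k + 1; i < n; i++) {
            double *row_i = a + (size_t)i * n;
            const double *row_k = a + (size_t)k * n;
            double factor = row_i[k] / row_k[k];

            /* Unit-stride inner loop, left for the compiler to vectorize. */
            for (int j = k; j < n; j++)
                row_i[j] -= factor * row_k[j];
            b[i] -= factor * b[k];
        }
    }

    /* Back substitution, kept serial in this sketch. */
    for (int i = n - 1; i >= 0; i--) {
        const double *row_i = a + (size_t)i * n;
        double sum = b[i];
        for (int j = i + 1; j < n; j++)
            sum -= row_i[j] * b[j];
        b[i] = sum / row_i[i];
    }
}
```

If factor were declared outside the parallel loop and left shared, 
threads would overwrite each other's value, which is exactly the kind of 
mistake that produces garbage or NaNs.  Compile with your compiler's 
OpenMP flag (for example -fopenmp with gcc); without it the pragma is 
ignored and the code runs on one thread.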

-- 
Andrew J. Pounds, Ph.D.  (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Mercer University,  Macon, GA 31207   (478) 301-5627
http://faculty.mercer.edu/pounds_aj
