<!DOCTYPE html>

<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">Your task in the current exercise is
      figuring out how to optimally use OpenMP in EACH of those
      running processes.  Remember -- all you have to get done for this
      exercise is to demonstrate that your MPI code can be sped up by
      using it in combination with OpenMP.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">You will need to compile using mpicc,

      but because I built MPI off of the newest GCC compiler, now you

      can also turn on the OpenMP stuff...</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">mpicc  -fopenmp mmm_mpi.c -o mmm_mpi

      -lm -lgomp -lopenblas<br>

    </div>
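    <div class="moz-cite-prefix">As a rough illustration of what "OpenMP
      inside each MPI process" can look like, here is a minimal sketch of
      a per-process kernel.  It assumes each rank already holds its own
      block of rows of A and a full copy of B, all stored row-major; the
      function name local_mmm and that data layout are hypothetical and
      are not taken from mmm_mpi.c.  If your code calls OpenBLAS for the
      multiply, the same idea applies but the threading may already
      happen inside the library.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Multiply a block of `rows` rows of A (rows x n) by B (n x n), storing the
 * result in C (rows x n). All matrices are row-major.
 * In a hybrid code each MPI rank would call this on its own row block; the
 * pragma splits those rows across the OpenMP threads of that rank.
 * Without -fopenmp the pragma is ignored and the loop simply runs serially. */
void local_mmm(const double *A, const double *B, double *C, int rows, int n)
{
    assert(A != NULL && B != NULL && C != NULL);

    #pragma omp parallel for
    for (int i = 0; i < rows; i++) {
        /* Each thread owns whole rows of C, so no two threads write the same element. */
        for (int j = 0; j < n; j++)
            C[(size_t)i * n + j] = 0.0;
        for (int k = 0; k < n; k++) {
            double a = A[(size_t)i * n + k];
            for (int j = 0; j < n; j++)
                C[(size_t)i * n + j] += a * B[(size_t)k * n + j];
        }
    }
}
```

    <div class="moz-cite-prefix">If only the main thread of each rank
      makes MPI calls, as in this sketch, initializing with
      MPI_Init_thread and requesting MPI_THREAD_FUNNELED is the usual
      way to tell the MPI library that threads are in use.</div>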

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">You can then specify the number of
      threads per system via a command line argument OR via the
      OMP_NUM_THREADS environment variable.</div>
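    <div class="moz-cite-prefix">One possible way to support both
      options is sketched below.  It assumes the thread count, when
      given, is the first program argument; the helper name
      threads_from_args and the upper limit of 1024 are arbitrary
      choices for illustration only.</div>

```c
#include <assert.h>
#include <stdlib.h>

/* Return the thread count requested in argv[1], or 0 if no usable value
 * was given. A return of 0 means "do not override": the OpenMP runtime
 * then falls back to OMP_NUM_THREADS or its own default. */
int threads_from_args(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 2)
        return 0;

    char *end = NULL;
    long n = strtol(argv[1], &end, 10);

    /* Reject empty input, trailing junk, non-positive values, and
     * implausibly large values (the 1024 cap is an arbitrary assumption). */
    if (end == argv[1] || *end != '\0' || n <= 0 || n > 1024)
        return 0;
    return (int)n;
}
```

    <div class="moz-cite-prefix">In main you would call
      omp_set_num_threads(n) only when the helper returns n greater than
      zero.  Otherwise leave the runtime alone so that OMP_NUM_THREADS
      takes effect.</div>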

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">My recommendations...  <br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <blockquote>

      <div class="moz-cite-prefix">As you saw from my graph last week,
        you should be able to see speedups on fewer than 10 systems using
        MPI alone.   There is no need at this point to try to allocate
        all of the nodes of the cluster (that comes in the next
        assignment).  There is also no need, at this point, to try to
        run scripts that exhaustively run all possible combinations of
        nodes and processes per node.  A handful of well-thought-out
        PBS/Torque jobs should be sufficient -- but you need to run
        enough to prove your claim graphically.<br>

      </div>
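      <div class="moz-cite-prefix">For the graph, the quantities usually
        plotted are speedup and parallel efficiency.  A minimal sketch of
        those two calculations follows; it assumes you have wall-clock
        times for a baseline run (for example MPI alone) and for the
        hybrid run, and the function names are hypothetical.</div>

```c
#include <assert.h>

/* Speedup of a parallel run relative to a baseline run.
 * Both arguments are wall-clock times in the same units. */
double speedup(double t_baseline, double t_parallel)
{
    assert(t_baseline > 0.0 && t_parallel > 0.0);
    return t_baseline / t_parallel;
}

/* Parallel efficiency: speedup divided by the total number of workers,
 * counted here as MPI ranks times OpenMP threads per rank. */
double efficiency(double t_baseline, double t_parallel,
                  int ranks, int threads_per_rank)
{
    assert(ranks > 0 && threads_per_rank > 0);
    return speedup(t_baseline, t_parallel) / (ranks * threads_per_rank);
}
```

      <div class="moz-cite-prefix">Plotting speedup against the number
        of threads per process, with one curve per node count, is one
        reasonable way to show whether adding OpenMP helped.</div>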

      <div class="moz-cite-prefix"><br>

      </div>

      <div class="moz-cite-prefix">There are LOTS of flags that you can
        play with both for MPI and OpenMP.  Keep it simple and follow my
        examples.  However, one that you will most likely have to use is
        the one we saw when we tried to schedule more threads than cores
        -- the "oversubscribe" flag on the mpirun command line.  MPI may
        need this to allow more than one OpenMP process to run
        concurrently.   <br>

      </div>
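      <div class="moz-cite-prefix">Before submitting a job it can help
        to check whether a given layout asks for more threads than a
        node has cores.  The sketch below is only that sanity check; the
        function name is hypothetical, and whether mpirun actually
        requires the oversubscribe flag depends on how your MPI
        installation and the scheduler count slots.</div>

```c
#include <assert.h>

/* Return 1 if the requested layout puts more threads on a node than it
 * has cores (ranks_per_node * threads_per_rank > cores_per_node), else 0. */
int oversubscribed(int ranks_per_node, int threads_per_rank, int cores_per_node)
{
    assert(ranks_per_node > 0 && threads_per_rank > 0 && cores_per_node > 0);
    /* Widen before multiplying so large inputs cannot overflow int. */
    return (long)ranks_per_node * threads_per_rank > (long)cores_per_node;
}
```

      <div class="moz-cite-prefix">A layout that returns 1 here will
        generally run slower per thread, so it is usually worth timing a
        layout that just fits on the node as well.</div>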

      <div class="moz-cite-prefix"><br>

      </div>

      <div class="moz-cite-prefix">As you have already seen - contention

        for the nodes could get to be interesting as we move toward the

        final project.  Don't put this off.<br>

      </div>

    </blockquote>

    <div class="moz-cite-prefix"><br>

    </div>


    <p></p>

    <div class="moz-signature">-- <br>

      <b><i>Andrew J. Pounds, Ph.D.</i></b><br>

      <i>Professor of Chemistry and Computer Science</i><br>

      <i>Director of the Computational Science Program</i><br>

      <i>Mercer University, Macon, GA 31207 (478) 301-5627</i></div>

  </body>

</html>