<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html;

      charset=windows-1252">

  </head>

  <body>

    <div class="moz-cite-prefix">Will -- are you working on hammer? 

      Hammer should show a degradation after 24 processors.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">I wish it were that simple.  Hammer has
      three Intel Xeon E5-2650 CPUs.  Each of these has 8 cores and
      can accommodate up to 16 threads (2 per core).  There are in-processor
      algorithms and OS-based algorithms (as well as your own code) that
      determine when to allow for threading and when to spread the work
      to another set of cores to avoid threading.  On a single CPU the 8
      cores should each have their own cache lines (I have not looked at the
      technical drawings for the E5-2650).  When the cores on a single CPU
      start threading, the two threads on a core have to share that core's
      cache, which can degrade performance.   <br>

    </div>
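    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">A rough sketch of that core and thread
      arithmetic, in Python.  The counts come from the description above
      (3 CPUs, 8 cores each, 2 threads per core).  The function name
      cores_sharing is made up for illustration, and it assumes an
      idealized scheduler that gives every thread its own core until the
      cores run out, which the real OS does not guarantee.</div>
<pre>
```python
# Sketch only: counts taken from the description above
# (3 CPUs, 8 cores each, up to 2 hardware threads per core).
# Assumes an idealized scheduler that fills every physical core
# before placing a second thread on any core.

CPUS = 3
CORES_PER_CPU = 8
THREADS_PER_CORE = 2

PHYSICAL_CORES = CPUS * CORES_PER_CPU              # 24
LOGICAL_CPUS = PHYSICAL_CORES * THREADS_PER_CORE   # 48


def cores_sharing(n_threads):
    """Return how many physical cores run two threads at once."""
    if n_threads not in range(1, LOGICAL_CPUS + 1):
        raise ValueError("thread count must be between 1 and 48")
    # Threads beyond the physical core count must double up on a core.
    return max(0, n_threads - PHYSICAL_CORES)


if __name__ == "__main__":
    for n in (8, 20, 24, 25, 32, 48):
        print(n, "threads:", cores_sharing(n), "cores running two threads")
```
</pre>
    <div class="moz-cite-prefix">Under those assumptions, 24 threads or
      fewer leave every thread with its own core, and the 25th thread is
      the first one that has to share.</div>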

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Competing for memory is a different

      issue altogether, but is tied to the ability of the cache lines to

      pull from memory quickly and align the data for processing. If you

      start to thread then you could have two threads pulling from

      different sections of memory that are not at all aligned.  This

      would cause a performance hit.  Depending on how the memory is

      constructed there will be varying degrees of a performance hit as

      well.   For example, if you had a memory structure that was

      completely random access and every single memory element was

      directly accessible, there would be less of a hit than if you had

      a system in which your code had to jump to a certain section of

      memory and then reference from the pointer to that section of

      memory.  Think about it this way -- if you had 64 GB of memory
      you could have all of that on one memory stick or you could spread it
      across 4 sticks with 16 GB each.  The first option would be
      FASTER because all of your memory is on a single chip.  The second
      would require going through the computer bus to access portions
      and would therefore be slower.  So why don't we all just use the

      first method?  Because it is MUCH more expensive.</div>
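    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">If you want to see the effect of access
      pattern for yourself, here is a small Python sketch.  It walks the
      same array once in order and once in a shuffled order and times
      both.  The array size, the function names, and the shuffled order
      are all assumptions chosen for illustration, and this is not how
      your code on Hammer accesses memory.  Python's interpreter overhead
      hides much of the cache effect, so the gap may be small and will
      vary from machine to machine.</div>
<pre>
```python
import random
import time
from array import array

N = 200_000  # hypothetical problem size, chosen only for illustration


def traverse(data, order):
    """Sum the elements of data, visiting them in the given index order."""
    total = 0.0
    for i in order:
        total += data[i]
    return total


def compare(n=N, seed=0):
    """Time an in-order walk against a shuffled walk over the same data."""
    data = array("d", (float(i % 7) for i in range(n)))
    sequential = list(range(n))
    scattered = sequential[:]
    random.Random(seed).shuffle(scattered)  # same indices, scattered order

    t0 = time.perf_counter()
    sum_sequential = traverse(data, sequential)
    t1 = time.perf_counter()
    sum_scattered = traverse(data, scattered)
    t2 = time.perf_counter()

    # Both walks touch every element once, so the sums must agree.
    return sum_sequential, sum_scattered, t1 - t0, t2 - t1


if __name__ == "__main__":
    s1, s2, in_order, shuffled = compare()
    print("sums agree:", s1 == s2)
    print("in-order seconds:", in_order)
    print("shuffled seconds:", shuffled)
```
</pre>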

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Your basic premise is correct -- but I

      wanted to clarify.</div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">Hope that helps.<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>


    <div class="moz-cite-prefix">On 4/16/20 10:06 PM, William Carl

      Baglivio wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:dc41acb3a3794849a9c0a60813dd368f@BN6PR01MB2228.prod.exchangelabs.com">


      <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">Hey Dr. Pounds,</div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        I want to get the semantics right on this: the reason why the

        speedup declines after 20 processors is that there are 2 threads

        per core, so when there is more than 1 thread per core, the threads

        have to compete for memory. Am I getting that right? Anything I

        missed out on?</div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        <br>

      </div>

      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;

        font-size: 12pt; color: rgb(0, 0, 0);">

        ~Will B.</div>

    </blockquote>

    <p><br>

    </p>

    <pre class="moz-signature" cols="72">-- 

Andrew J. Pounds, Ph.D.  (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)

Professor of Chemistry and Computer Science

Director of the Computational Science Program

Mercer University,  Macon, GA 31207   (478) 301-5627

</pre>

  </body>

</html>