<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<div class="moz-cite-prefix">Will -- are you working on hammer?
Hammer should show a degradation after 24 processors.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">I wish it were that simple. Hammer has
three Intel Xeon E5-2650 CPUs. Each of these has 8 cores and
can accommodate up to 16 threads. There are in-processor
algorithms and OS-based algorithms (as well as your own code) that
determine when to allow for threading and when to spread the work
to another set of cores to avoid threading. On a single CPU the 8
cores should have their own cache lines (I have not looked at the
technical drawings for the E5-2650). When the cores on a single CPU
start threading, they have to share the cache, which can
degrade performance. <br>
</div>
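<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">If it helps to see the arithmetic, here
is a minimal Python sketch. The counts are just the figures I quoted
above (I have not checked them against the machine), and
must_share_cores is a name I made up for illustration. It only tells
you when there are more workers than physical cores; it says nothing
about where the OS actually places them.</div>
<pre>
```python
import os

# Figures quoted in this message; assumptions until checked on Hammer.
CPUS = 3
CORES_PER_CPU = 8
THREADS_PER_CORE = 2  # 16 threads per CPU / 8 cores per CPU

physical_cores = CPUS * CORES_PER_CPU
hardware_threads = physical_cores * THREADS_PER_CORE


def must_share_cores(workers):
    """Return True when workers outnumber physical cores.

    In that case at least one core has to run two threads, and those
    two threads share that core's cache.
    """
    return workers > physical_cores


if __name__ == "__main__":
    print("physical cores:", physical_cores)
    print("hardware threads:", hardware_threads)
    # os.cpu_count() reports logical CPUs on whatever machine runs this.
    print("logical CPUs seen by the OS here:", os.cpu_count())
    for workers in (16, 24, 25, 48):
        print(workers, "workers, sharing forced:", must_share_cores(workers))
```
</pre>
<div class="moz-cite-prefix">Going by those figures, anything past 24
workers forces at least one core to carry two threads. Keep in mind
that os.cpu_count() reports logical CPUs, so it counts hardware
threads rather than physical cores.</div>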
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Competing for memory is a different
issue altogether, but is tied to the ability of the cache lines to
pull from memory quickly and align the data for processing. If you
start to thread then you could have two threads pulling from
different sections of memory that are not at all aligned. This
would cause a performance hit. Depending on how the memory is
constructed there will be varying degrees of a performance hit as
well. For example, if you had a memory structure that was
completely random access and every single memory element was
directly accessible, there would be less of a hit than if you had
a system in which your code had to jump to a certain section of
memory and then reference from the pointer to that section of
memory. Think about it this way -- if you had 64 GB of memory
you could have all that on one memory stick or you could spread it
across 4 sticks with 16 GB each. The first option would be
FASTER because all of your memory is on a single chip. The second
would require going through the computer bus to access portions
and would therefore be slower. So why don't we just all use the
first method? Because it is MUCH more expensive.</div>
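<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Here is a rough Python sketch of the
"jumping around in memory" point. It adds up the same array twice,
once walking it in order and once in a shuffled order. The names
timed_sum and compare are mine, made up for this illustration. It is
single-threaded, so it only illustrates the access-pattern side of
things, not two threads fighting over a cache, and Python's
interpreter overhead hides much of the gap that a compiled C or
Fortran loop would show.</div>
<pre>
```python
import random
import time
from array import array


def timed_sum(data, order):
    """Add up data[i] for each i in order; return (total, seconds)."""
    start = time.perf_counter()
    total = 0.0
    for i in order:
        total += data[i]
    return total, time.perf_counter() - start


def compare(n=1_000_000, seed=0):
    """Sum the same array in index order and in shuffled order."""
    # Small whole-number values, so both sums are exact and must match.
    data = array("d", (float(i % 7) for i in range(n)))
    in_order = list(range(n))
    scattered = in_order[:]
    random.Random(seed).shuffle(scattered)
    return timed_sum(data, in_order), timed_sum(data, scattered)


if __name__ == "__main__":
    (total_a, secs_a), (total_b, secs_b) = compare()
    print(f"in order:  {secs_a:.3f} s  total {total_a:.0f}")
    print(f"scattered: {secs_b:.3f} s  total {total_b:.0f}")
```
</pre>
<div class="moz-cite-prefix">The two totals always agree; the timings
are what to look at, and they will vary from run to run and from
machine to machine.</div>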
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Your basic premise is correct -- but I
wanted to clarify.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Hope that helps.<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 4/16/20 10:06 PM, William Carl
Baglivio wrote:<br>
</div>
<blockquote type="cite"
cite="mid:dc41acb3a3794849a9c0a60813dd368f@BN6PR01MB2228.prod.exchangelabs.com">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">Hey Dr. Pounds,</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
I want to get the semantics right on this: the reason why the
speedup declines after 20 processors is that there are 2 threads
per core, so when there is more than 1 thread per core, they
have to compete for memory. Am I getting that right? Anything I
missed out on?</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif;
font-size: 12pt; color: rgb(0, 0, 0);">
~Will B.</div>
</blockquote>
<p><br>
</p>
<pre class="moz-signature" cols="72">--
Andrew J. Pounds, Ph.D. (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)
Professor of Chemistry and Computer Science
Director of the Computational Science Program
Mercer University, Macon, GA 31207 (478) 301-5627
</pre>
</body>
</html>