<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font face="serif">Well, apparently J.T. is the only one who found
the small bug that I introduced into the MPI matrix multiplication
code. I was really confused in class on Thursday because
Steve and Tanner were getting the correct results, but
J.T. was getting the error. If you got the code to run, you should
have noticed the following.<br>
<br>
</font>
<ol>
<li><font face="serif">The initial code (with the symmetric
matrices) ran great on a</font></li>
<ul>
<li><font face="serif">Single processor</font></li>
<li><font face="serif">Single node, multiple processors</font></li>
<li><font face="serif">Multiple nodes, single process per node</font></li>
</ul>
<li><font face="serif">When you used the "accuracy check" matrices,
you should have found that</font></li>
<ul>
<li><font face="serif">The code runs fine on a single processor</font></li>
<li><font face="serif">The code does not run correctly on a
single node with multiple processes</font></li>
<li><font face="serif">The code does not run correctly across
multiple nodes</font></li>
</ul>
</ol>
<font face="serif"><br>
The fact that you get a correct result on a single processor with
either set of matrices, but a broken product when you use the
non-symmetric matrices across multiple processes, should
make you question the correctness of your WORKER PROCESS.<br>
<br>
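To see why symmetric test matrices can hide a mistake, here is a minimal sketch in plain Python (not the class MPI code). The function <tt>matmul_swapped</tt> and all of the sample matrices are made up for illustration; the swapped index is only one hypothetical kind of bug, not necessarily the one in the assignment.<br>

```python
def matmul(a, b):
    """Reference product: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_swapped(a, b):
    """Hypothetical buggy kernel for square matrices: reads b[j][k]
    instead of b[k][j], so it really computes A times B-transpose."""
    n = len(a)
    return [[sum(a[i][k] * b[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Made-up test data: one symmetric pair, one non-symmetric pair.
sym_a = [[2, 1], [1, 3]]
sym_b = [[4, 5], [5, 6]]
gen_a = [[1, 2], [3, 4]]
gen_b = [[5, 6], [7, 8]]

sym_agrees = matmul(sym_a, sym_b) == matmul_swapped(sym_a, sym_b)
gen_agrees = matmul(gen_a, gen_b) == matmul_swapped(gen_a, gen_b)
print("symmetric inputs agree:", sym_agrees)
print("non-symmetric inputs agree:", gen_agrees)
```

With symmetric inputs the two kernels cannot be told apart, because B and its transpose are the same matrix. A non-symmetric pair exposes the difference right away.<br>
<br>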
Now, there are lots of possible causes for this. There could be a problem
transferring data to the worker process,
there could be memory issues or computational problems in the
worker process, or there could even be
issues with retrieving the data and putting it back in the matrix
on the master node. However, since
the problem only occurs when worker processes are involved, I would
start by checking the matrix multiplication code in the worker process.<br>
<br>
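One way to narrow down which of those causes you are facing is to compare the parallel product against a serial reference, one row block at a time. The sketch below is plain Python, not MPI, and it assumes the rows are split into contiguous blocks, one per process; the names <tt>split_rows</tt> and <tt>blocks_that_differ</tt> and the sample data are hypothetical, and your code may divide the work differently.<br>

```python
def split_rows(n_rows, n_workers):
    """Return (start, stop) row ranges, one per worker, covering all rows."""
    base, extra = divmod(n_rows, n_workers)
    ranges, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers take one additional row each.
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def blocks_that_differ(reference, result, n_workers):
    """List the worker indices whose row block does not match the reference."""
    bad = []
    for w, (start, stop) in enumerate(split_rows(len(reference), n_workers)):
        if reference[start:stop] != result[start:stop]:
            bad.append(w)
    return bad


# Made-up data: the "parallel" result has one corrupted row.
reference = [[1, 2], [3, 4], [5, 6], [7, 8]]
result = [[1, 2], [3, 4], [5, 0], [7, 8]]
print(blocks_that_differ(reference, result, 2))
```

The comparison here is exact, which is fine for small integer test matrices; for floating-point data you would probably want a tolerance instead. If every block handled by a worker is wrong, the computation in the worker is the likely suspect; if only some blocks are wrong, look harder at how the data is sent and gathered.<br>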
<br>
See if you can spot the error, fix it, and start benchmarking.
When I checked a few minutes ago, all but one of the nodes in
lab 100 were up. I'll try to swing by later today and get machine
2 up.<br>
<br>
<br>
Let me know if you need help.<br>
<br>
</font>
<pre class="moz-signature" cols="72">--
Andrew J. Pounds, Ph.D. (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)
Professor of Chemistry and Computer Science
Mercer University, Macon, GA 31207 (478) 301-5627
<a class="moz-txt-link-freetext" href="http://faculty.mercer.edu/pounds_aj">http://faculty.mercer.edu/pounds_aj</a>
</pre>
</body>
</html>