<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><font face="serif">I did not give you a dot.c function in the
        initial repo. Here is a serial dot product code:</font></p>

    <p><font face="serif"><br>

      </font></p>

    <p><tt>#ifdef __cplusplus</tt><tt><br>

      </tt><tt>extern "C" {</tt><tt><br>

      </tt><tt>#endif</tt><tt><br>

      </tt><tt>    double dot_( int *threads, int *len, double *vec1,

        double *vec2);</tt><tt><br>

      </tt><tt>#ifdef __cplusplus</tt><tt><br>

      </tt><tt>    }</tt><tt><br>

      </tt><tt>#endif</tt><tt><br>

      </tt><tt><br>

      </tt><tt>/*  S E R I A L   C O D E  */</tt><tt><br>

      </tt><tt><br>

      </tt><tt>double dot_( int *threads, int *len, double *vec1, double

        *vec2) {</tt><tt><br>

      </tt><tt><br>

      </tt><tt>    int k;</tt><tt><br>

      </tt><tt>    int veclen = *len;</tt><tt><br>

      </tt><tt>    int mod;</tt><tt><br>

      </tt><tt>    double product;</tt><tt><br>

      </tt><tt><br>

      </tt><tt>    // Compute the dot product</tt><tt><br>

      </tt><tt><br>

      </tt><tt>    product = 0.0;</tt><tt><br>

      </tt><tt>    for (k=0;k&lt;veclen;k++){</tt><tt><br>

      </tt><tt>        product += *(vec1+k) * *(vec2+k);</tt><tt><br>

      </tt><tt>    }</tt><tt><br>

      </tt><tt>    return(product);</tt><tt><br>

      </tt><tt>}</tt><br>

    </p>

    <p><br>

    </p>

    <p>If you want to test this independently of your other codes, then I
      recommend that you make a second driver.  I am including a
      "dotdriver.f90" piece of code so you can see how to actually call
      the dot product function in Fortran. Because it returns a value, as
      cputime and walltime do, it is invoked as a function rather than
      with a call statement.</p>

    <p><tt>program dotdriver </tt><tt><br>

      </tt><tt><br>

      </tt><tt>integer :: NDIM</tt><tt><br>

      </tt><tt><br>

      </tt><tt>real (kind=8) :: wall_start, wall_end</tt><tt><br>

      </tt><tt>real (kind=8) :: cpu_start, cpu_end</tt><tt><br>

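      </tt><tt>real (kind=8) :: trace</tt><tt><br>
      </tt><tt>! declared real; implicit typing would make these integers</tt><tt><br>
      </tt><tt>real (kind=8) :: mflops, mflops2</tt><tt><br>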

      </tt><tt><br>

      </tt><tt><br>

      </tt><tt>integer :: startval, stopval, stepval</tt><tt><br>

      </tt><tt>real (kind=8) :: walltime</tt><tt><br>

      </tt><tt>real (kind=8) :: cputime </tt><tt><br>

      </tt><tt>real (kind=8) :: dot </tt><tt><br>

      </tt><tt>external walltime, cputime, dot</tt><tt><br>

      </tt><tt><br>

      </tt><tt>! len=12 leaves room for ten-digit lengths such as 1000000000</tt><tt><br>
      </tt><tt>character (len=12) :: carg1, carg2, carg3</tt><tt><br>
      </tt><tt><br>
      </tt><tt>real (kind=8), dimension(:), allocatable :: veca, vecb</tt><tt><br>
      </tt><tt>real (kind=8), dimension(:,:), allocatable :: matrixa, matrixb, matrixc</tt><tt><br>
      </tt><tt><br>
      </tt><tt>!modified to use command line arguments</tt><tt><br>
      </tt><tt><br>
      </tt><tt>call get_command_argument(1, carg1)</tt><tt><br>
      </tt><tt>call get_command_argument(2, carg2)</tt><tt><br>
      </tt><tt>call get_command_argument(3, carg3)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>! Use Fortran internal files to convert command line arguments to ints</tt><tt><br>
      </tt><tt>! (list-directed reads accept any field width)</tt><tt><br>
      </tt><tt><br>
      </tt><tt>read (carg1,*) startval</tt><tt><br>
      </tt><tt>read (carg2,*) stopval</tt><tt><br>
      </tt><tt>read (carg3,*) stepval</tt><tt><br>

      </tt><tt> </tt><tt><br>

      </tt><tt>do iter = startval, stopval, stepval</tt><tt><br>

      </tt><tt>  </tt><tt><br>

      </tt><tt><br>

      </tt><tt>NDIM = iter</tt><tt><br>

      </tt><tt><br>

      </tt><tt>allocate ( veca(NDIM), stat=ierr)</tt><tt><br>

      </tt><tt>allocate ( vecb(NDIM), stat=ierr)</tt><tt><br>

      </tt><tt><br>

      </tt><tt><br>

      </tt><tt>do i = 1, NDIM </tt><tt><br>

      </tt><tt>     veca(i) = sqrt(dble(NDIM)) </tt><tt><br>

      </tt><tt>     vecb(i) = 1.0 / sqrt( dble(NDIM))</tt><tt><br>

      </tt><tt>enddo</tt><tt><br>

      </tt><tt><br>

      </tt><tt>wall_start = walltime()</tt><tt><br>

      </tt><tt>cpu_start = cputime()</tt><tt><br>

      </tt><tt><br>

      </tt><tt>trace = dot(1, NDIM, veca, vecb)</tt><tt><br>

      </tt><tt><br>

      </tt><tt>cpu_end = cputime()</tt><tt><br>

      </tt><tt>wall_end = walltime()</tt><tt><br>

      </tt><tt><br>

      </tt><tt>mflops  = 2*dble(NDIM)/ (cpu_end-cpu_start) / 1.0e6</tt><tt><br>

      </tt><tt>mflops2 = 2*dble(NDIM)/ (wall_end-wall_start)/ 1.0e6</tt><tt><br>

      </tt><tt> </tt><tt><br>

      </tt><tt>print *, NDIM, trace, cpu_end-cpu_start,

        wall_end-wall_start,  mflops, mflops2</tt><tt><br>

      </tt><tt><br>

      </tt><tt><br>

      </tt><tt>deallocate(veca)</tt><tt><br>

      </tt><tt>deallocate(vecb)</tt><tt><br>

      </tt><tt><br>

      </tt><tt>enddo</tt><tt><br>

      </tt><tt><br>

      </tt><tt><br>

      </tt><tt>end program dotdriver </tt><tt><br>

      </tt><tt> </tt><br>

    </p>

    <p>The dot product is SO FAST that it will be difficult to get
      reasonable timing results unless you use extremely long vectors.  I ran
      on hammer with vector lengths of 1 billion and it completed in
      milliseconds. This led to performance in the gigaflops range,
      which is what we expect.</p>
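    <p>If a single call finishes too quickly to time, one option is to
      repeat it many times and divide the elapsed time by the number of
      repetitions. Here is a minimal sketch of that idea in C. The helper
      name time_dot is hypothetical (it is not in the repo), it uses the
      standard C clock() routine rather than our cputime and walltime
      functions, and the kernel is repeated only so the sketch compiles
      on its own.</p>

```c
#include <assert.h>
#include <time.h>

/* Same serial kernel as dot_ in dot.c, repeated so this sketch stands alone. */
double dot_(int *threads, int *len, double *vec1, double *vec2) {
    double product = 0.0;
    (void)threads;                      /* unused in the serial version */
    for (int k = 0; k < *len; k++) {
        product += vec1[k] * vec2[k];
    }
    return product;
}

/* Hypothetical helper: call dot_ reps times and return the average
   CPU seconds per call. The last dot product is stored in *result. */
double time_dot(int reps, int len, double *vec1, double *vec2, double *result) {
    int threads = 1;
    volatile double sink = 0.0;         /* volatile: every result must be stored */
    assert(reps > 0);

    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        sink = dot_(&threads, &len, vec1, vec2);
    }
    clock_t end = clock();

    *result = sink;
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}
```

    <p>The rate in megaflops is then 2*len divided by the returned value
      and by 1.0e6, the same formula the driver uses. An aggressive
      optimizer may still restructure the repeated calls, so it is worth
      comparing timings at more than one optimization level.</p>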

    <p>Be careful just going to extremely large integers for your
      dimensions -- remember, these are discrete machines and there is a
      limit on the size of the integer values.  Normal ints, according
      to /usr/include/limits.h, have a maximum of 2147483647.</p>

    <p>You can go larger -- but you will have to change the integer

      variable types.</p>
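    <p>As one way to go larger, here is a minimal sketch of a dot product
      whose length argument is a 64-bit integer. The names dot64_ and
      fits_in_int are hypothetical and not part of the repo, and the
      Fortran caller would have to pass a matching 8-byte integer (for
      example integer (kind=8) with gfortran, which is an assumption
      about your compiler).</p>

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical variant of dot_ that takes a 64-bit vector length. */
double dot64_(int *threads, int64_t *len, double *vec1, double *vec2) {
    int64_t veclen = *len;
    double product = 0.0;
    (void)threads;                      /* unused in the serial version */
    assert(veclen >= 0);

    for (int64_t k = 0; k < veclen; k++) {
        product += vec1[k] * vec2[k];
    }
    return product;
}

/* Returns 1 if n can be stored in a normal int, 0 otherwise. */
int fits_in_int(int64_t n) {
    return n >= INT_MIN && n <= INT_MAX;
}
```

    <p>A check like fits_in_int can be used to decide whether a requested
      length is still safe to hand to the original dot_ routine.</p>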

    <p><br>

    </p>

    <p>As always, let me know if you have any questions.</p>

    <p><br>

    </p>

    <p><br>

    </p>

    <pre class="moz-signature" cols="72">-- 

Andrew J. Pounds, Ph.D.  (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)

Professor of Chemistry and Computer Science

Director of the Computational Science Program

Mercer University,  Macon, GA 31207   (478) 301-5627

</pre>

  </body>

</html>