[CSC 435] DOT product function example

Andrew J. Pounds pounds_aj at mercer.edu
Wed Mar 18 11:11:18 EDT 2020


I did not give you a dot.c function in the initial repo, so here is a
serial dot product code:


#ifdef __cplusplus
extern "C" {
#endif
    double dot_( int *threads, int *len, double *vec1, double *vec2);
#ifdef __cplusplus
    }
#endif

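/*  S E R I A L   C O D E  */

/* Every argument is a pointer because Fortran passes arguments by
   reference; threads is accepted but not used by the serial version. */
double dot_( int *threads, int *len, double *vec1, double *vec2) {

    int k;
    int veclen = *len;
    double product;

    (void) threads;   // silence the unused-parameter warning

    // Compute the dot product

    product = 0.0;
    for (k=0;k<veclen;k++){
        product += vec1[k] * vec2[k];
    }
    return product;
}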
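
If you want to sanity-check the routine from C before calling it from
Fortran, here is a minimal sketch.  The helper name check_dot and the
fill values (2.0 and 0.5) are my own choices for illustration, not part
of the repo.  Each product is exactly 1.0, so the result should equal
the vector length.  Note that every argument goes in by address, the
same way Fortran will pass it.

```c
#include <stdlib.h>

/* Same serial routine as above, repeated so this sketch compiles on its own. */
double dot_(int *threads, int *len, double *vec1, double *vec2) {
    double product = 0.0;
    (void) threads;                  /* unused in the serial version */
    for (int k = 0; k < *len; k++) {
        product += vec1[k] * vec2[k];
    }
    return product;
}

/* Hypothetical helper: build two vectors of length n whose elementwise
   products are all 1.0 and return their dot product.
   Returns -1.0 if the allocation fails. */
double check_dot(int n) {
    int threads = 1;
    double result = -1.0;
    double *a = malloc((size_t) n * sizeof *a);
    double *b = malloc((size_t) n * sizeof *b);

    if (a != NULL && b != NULL) {
        for (int i = 0; i < n; i++) {
            a[i] = 2.0;
            b[i] = 0.5;
        }
        /* Pass addresses, exactly as a Fortran caller would. */
        result = dot_(&threads, &n, a, b);
    }
    free(a);
    free(b);
    return result;
}
```

Calling check_dot(1000) from a small main() should give back 1000.0.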


If you want to test this independently of your other codes, then I
recommend that you make a second driver.  I am including a
"dotdriver.f90" piece of code so you can see how to actually call the
dot product function in Fortran (it returns a value, like cputime and
walltime).

program dotdriver

integer :: NDIM

real (kind=8) :: wall_start, wall_end
real (kind=8) :: cpu_start, cpu_end
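real (kind=8) :: trace
! Declared explicitly: names starting with m are implicitly INTEGER
real (kind=8) :: mflops, mflops2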


integer :: startval, stopval, stepval
real (kind=8) :: walltime
real (kind=8) :: cputime
real (kind=8) :: dot
external walltime, cputime, dot

character (len=8) :: carg1, carg2, carg3

real (kind=8), dimension(:), allocatable :: veca, vecb
real (kind=8), dimension(:,:), allocatable :: matrixa, matrixb, matrixc

!modified to use command line arguments

call get_command_argument(1, carg1)
call get_command_argument(2, carg2)
call get_command_argument(3, carg3)

! Use Fortran internal files to convert command line arguments to ints

read (carg1,'(i8)') startval
read (carg2,'(i8)') stopval
read (carg3,'(i8)') stepval
 
do iter = startval, stopval, stepval
 

NDIM = iter

allocate ( veca(NDIM), stat=ierr)
allocate ( vecb(NDIM), stat=ierr)


do i = 1, NDIM
     veca(i) = sqrt(dble(NDIM))
     vecb(i) = 1.0 / sqrt( dble(NDIM))
enddo

wall_start = walltime()
cpu_start = cputime()

trace = dot(1, NDIM, veca, vecb)

cpu_end = cputime()
wall_end = walltime()

mflops  = 2*dble(NDIM)/ (cpu_end-cpu_start) / 1.0e6
mflops2 = 2*dble(NDIM)/ (wall_end-wall_start)/ 1.0e6
 
print *, NDIM, trace, cpu_end-cpu_start, wall_end-wall_start,  mflops, &
         mflops2


deallocate(veca)
deallocate(vecb)

enddo


end program dotdriver
 

The dot product is SO FAST that it will be difficult to get reasonable
results unless you use extremely long vectors.  I ran on hammer with
vector lengths of 1 billion and it completed in milliseconds. This led
to performance in the gigaflops range, which is what we expect.
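
The performance figure the driver prints is just 2*N floating point
operations (one multiply and one add per element) divided by the elapsed
time.  Here is a minimal C sketch of that arithmetic.  The function names
are hypothetical, and the zero check is my own addition because a very
short run can report an elapsed time of zero.

```c
#include <time.h>

/* Hypothetical helper: MFLOPS for a dot product of length n that took
   `seconds` to run.  Returns 0.0 when the timer did not register any
   elapsed time, instead of dividing by zero. */
double mflops(long long n, double seconds) {
    if (seconds <= 0.0) {
        return 0.0;
    }
    return 2.0 * (double) n / seconds / 1.0e6;
}

/* Hypothetical helper: convert two clock() readings to CPU seconds. */
double elapsed_cpu_seconds(clock_t start, clock_t end) {
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

For example, mflops(1000000, 0.5) is 4.0, and a measured time of zero
gives 0.0, which is a hint that the vectors are too short to time.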

Be careful just going to extremely large integers for your dimensions --
remember, these are discrete machines and there is a limit on the size
of the integer values.  Normal ints, according to /usr/include/limits.h,
have a maximum (INT_MAX) of 2147483647.

You can go larger -- but you will have to change the integer variable types.
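
As a rough sketch of what changing the types could look like on the C
side: the names fits_in_int and dot_long below are hypothetical, and this
is one possible approach, not the version in the repo.  If you call
something like this from Fortran, the length on the Fortran side would
also have to become a 64-bit integer, and the argument would go back to
being a pointer.

```c
#include <limits.h>

/* Hypothetical helper: 1 if n can be stored in a plain int, else 0. */
int fits_in_int(long long n) {
    return (n >= INT_MIN && n <= INT_MAX) ? 1 : 0;
}

/* Hypothetical variant of the serial dot product that takes a 64-bit
   length, so the loop counter cannot overflow past INT_MAX. */
double dot_long(long long len, const double *vec1, const double *vec2) {
    double product = 0.0;
    for (long long k = 0; k < len; k++) {
        product += vec1[k] * vec2[k];
    }
    return product;
}
```

Here fits_in_int(2147483647LL) is 1 and fits_in_int(2147483648LL) is 0.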


As always, let me know if you have any questions.



-- 
Andrew J. Pounds, Ph.D.  (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Director of the Computational Science Program
Mercer University,  Macon, GA 31207   (478) 301-5627
