<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Last night I wrote a little Perl program to run through and
      test all of the optimization options for the gcc 6.4.0 compiler
      suite.  In the output below, the first line is the result
      (dimension, time, and the two values that we check for
      consistency) using optimization flag -O0.  After that, each option
      I turned on in the makefile is listed, followed by the results from
      running the code with only that optimization option.  Some options
      help a little, some do not.  At least this way you can see which
      options might help you.  I did not try combinations of options.</p>
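    <p>The sweep described above could be sketched roughly as follows.
      This is Python rather than the original Perl, and it is not the
      actual testoptions.pl: the OPT makefile variable and the ./a.out
      binary name are assumptions made for illustration.  The GCC manual
      also notes that most optimizations are only enabled if an -O level
      is set, even if individual flags are specified, so the base
      parameter is there to make it easy to repeat the sweep on top of
      -O1.</p>
    <pre>
```python
import subprocess


def time_flag(flag, run=subprocess.run, base="-O0", binary="./a.out"):
    """Rebuild with one extra flag and return the program's stdout."""
    # OPT is a hypothetical makefile variable holding the optimization flags.
    opt = f"OPT={base} {flag}".strip()
    # "make clean" may fail when no object files exist yet, so do not check it.
    run(["make", "clean"], check=False)
    run(["make", opt], check=True)
    # ./a.out is an assumed binary name; substitute the real one.
    done = run([binary], check=True, capture_output=True, text=True)
    return done.stdout.strip()


def sweep(flags, **kwargs):
    """Return [(label, output)] for the baseline followed by each flag."""
    results = [("baseline", time_flag("", **kwargs))]
    for flag in flags:
        results.append((flag, time_flag(flag, **kwargs)))
    return results
```
    </pre>
    <p>Passing the runner in as an argument keeps the loop easy to try
      out without actually compiling anything.</p>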
    <p>I ran these on a "quiet" machine, so there was little competition
      from other users for processor time.  Only a few options help by
      themselves, and then only slightly: every timing below falls
      between 61.47 and 62.88, against 61.75 for -O0.  You may want to
      look for combinations of options to get a better speedup.<br>
    </p>
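    <p>One way to enumerate combinations of options is sketched below in
      Python.  The function name and the choice to join each group into
      a single string (ready to hand to the makefile) are assumptions
      for illustration, not part of the test program used here.</p>
    <pre>
```python
from itertools import combinations


def flag_combinations(flags, size=2):
    """Yield each unordered group of `size` distinct flags as one string."""
    for group in combinations(flags, size):
        yield " ".join(group)
```
    </pre>
    <p>The number of runs grows quickly: the 96 options listed below
      already give 4560 pairs, so it is probably worth restricting the
      search to a short list of promising flags first.</p>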
    <p><br>
    </p>
    <p><font face="Liberation Mono">dock:/tmp/pounds/condusty %
        ./testoptions.pl | tee dump | pace <br>
        rm: cannot remove `*.o': No such file or directory<br>
        make: *** [clean] Error 1<br>
                 100   61.750000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -fauto-inc-dec <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fbranch-count-reg <br>
                 100   61.930000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -fcombine-stack-adjustments <br>
                 100   61.729999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fcompare-elim <br>
                 100   61.840000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fcprop-registers <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fdce <br>
                 100   61.859999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fdefer-pop <br>
        f951: Warning: this target machine does not have
        delayed branches<br>
        f951: Warning: this target machine does not have
        delayed branches<br>
        cc1: warning: this target machine does not have
        delayed branches<br>
                 100   61.820000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -fdelayed-branch <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fdse <br>
                 100   61.890000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fforward-propagate <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fguess-branch-probability <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fif-conversion2 <br>
                 100   61.840000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fif-conversion <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -finline-functions-called-once <br>
                 100   61.789999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-pure-const <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-profile <br>
                 100   61.890000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-reference <br>
                 100   61.600000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fmerge-constants <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fmove-loop-invariants <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -freorder-blocks <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fshrink-wrap <br>
                 100   61.579999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -fsplit-wide-types <br>
                 100   61.820000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -fssa-backprop <br>
                 100   62.020000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fssa-phiopt <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-bit-ccp <br>
                 100   61.880000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-ccp <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-ch <br>
                 100   61.820000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-coalesce-vars <br>
                 100   61.469999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-copy-prop <br>
                 100   61.740000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-dce <br>
                 100   61.880000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-dominator-opts <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-dse <br>
                 100   61.890000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-forwprop <br>
                 100   61.759999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-fre <br>
                 100   61.930000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-phiprop <br>
                 100   61.820000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-sink <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-slsr <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-sra <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-pta <br>
                 100   61.890000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-ter <br>
                 100   62.880000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -funit-at-a-time <br>
                 100   61.840000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fthread-jumps <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -falign-functions  <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -falign-jumps <br>
                 100   61.859999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -falign-loops  <br>
                 100   61.880000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -falign-labels <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fcaller-saves <br>
                 100   61.960000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fcrossjumping <br>
                 100   61.829999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -fcse-follow-jumps  <br>
                 100   61.740000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -fcse-skip-blocks <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fdelete-null-pointer-checks <br>
                 100   62.140000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fdevirtualize <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fdevirtualize-speculatively <br>
                 100   61.710000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fexpensive-optimizations <br>
                 100   61.469999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fgcse <br>
                 100   61.759999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -fgcse-lm <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fhoist-adjacent-loads <br>
                 100   61.930000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -finline-small-functions <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -findirect-inlining <br>
                 100   61.759999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-cp <br>
                 100   61.969999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-cp-alignment <br>
                 100   61.810000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-sra <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-icf <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -fisolate-erroneous-paths-dereference <br>
                 100   61.840000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -flra-remat <br>
                 100   61.780000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -foptimize-sibling-calls <br>
                 100   61.880000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -foptimize-strlen <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fpartial-inlining <br>
                 100   61.590000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fpeephole2 <br>
                 100   61.759999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -freorder-blocks-algorithm=stc <br>
                 100   61.950000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -freorder-blocks-and-partition <br>
                 100   61.799999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -freorder-functions <br>
                 100   61.890000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -frerun-cse-after-loop <br>
                 100   61.950000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fsched-interblock  <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -fsched-spec <br>
                 100   61.899999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fschedule-insns  <br>
                 100   61.829999999999998       
        386743.81287443032       0.81989322403772058     <br>
        -fschedule-insns2 <br>
                 100   61.909999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fstrict-aliasing <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -fstrict-overflow <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-builtin-call-dce <br>
                 100   61.810000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-switch-conversion <br>
                 100   61.770000000000003       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-tail-merge <br>
                 100   61.750000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-pre <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-vrp <br>
                 100   61.820000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-ra <br>
                 100   61.930000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -finline-functions <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        -funswitch-loops <br>
                 100   61.789999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fpredictive-commoning <br>
                 100   61.909999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fgcse-after-reload <br>
                 100   61.850000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-loop-vectorize <br>
                 100   61.859999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-loop-distribute-patterns <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fsplit-paths <br>
                 100   61.750000000000000       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-slp-vectorize <br>
                 100   61.869999999999997       
        386743.81287443032       0.81989322403772058     <br>
        -fvect-cost-model <br>
                 100   61.780000000000001       
        386743.81287443032       0.81989322403772058     <br>
        -ftree-partial-pre <br>
                 100   61.789999999999999       
        386743.81287443032       0.81989322403772058     <br>
        -fipa-cp-clone <br>
                 100   61.920000000000002       
        386743.81287443032       0.81989322403772058     <br>
        dock:/tmp/pounds/condusty % <br>
      </font><br>
    </p>
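    <p>To compare the runs without reading the list by eye, the saved
      output (the dump file written by tee) could be parsed and ranked
      against the first, baseline line.  The Python sketch below assumes
      that each result is a single line of four whitespace-separated
      fields and that each option name comes before its result, as
      described at the top; the function names are hypothetical.</p>
    <pre>
```python
def parse_dump(text):
    """Return [(label, time)] from sweep output; other lines are skipped."""
    label, rows = "baseline", []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 1 and fields[0].startswith("-f"):
            # An option name: it labels the next result line.
            label = fields[0]
        elif len(fields) == 4:
            try:
                # Fields are: dimension, time, and the two check values.
                rows.append((label, float(fields[1])))
            except ValueError:
                continue  # four words that are not a result line
    return rows


def relative_change(rows):
    """Return [(percent faster than baseline, label)], best first."""
    base = rows[0][1]
    changes = [(100.0 * (base - t) / base, label) for label, t in rows[1:]]
    return sorted(changes, reverse=True)
```
    </pre>
    <p>A positive percentage means the run was faster than the baseline.
      Because stderr and stdout are interleaved in the output, warning
      lines are simply ignored rather than matched to an option.</p>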
    <pre class="moz-signature" cols="72">-- 
Andrew J. Pounds, Ph.D.  (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)
Professor of Chemistry and Computer Science
Mercer University,  Macon, GA 31207   (478) 301-5627
<a class="moz-txt-link-freetext" href="http://faculty.mercer.edu/pounds_aj">http://faculty.mercer.edu/pounds_aj</a>
</pre>
  </body>
</html>