<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Last night I wrote a small Perl program to run through and test
ALL of the optimization options for the GCC 6.4.0 compiler suite.
In the output below, the first result line (dimension, time, and
the two values that we check for consistency) is the baseline
built with -O0. After that, each option I turned on in the
makefile is listed, followed by the result from running the code
with only that option enabled. Some things help a little, some
things do not. At least this way you can see which options might
help you. I did not try combinations of options.</p>
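<p>For anyone who wants to repeat this, here is a rough sketch of
that kind of loop. It is written in Python rather than Perl and is
not the script I ran. It assumes a makefile that accepts a
hypothetical OPTFLAGS variable, a hypothetical executable named
./benchmark, and that each trial builds with -O0 plus the one
flag under test.</p>
<pre>
```python
import subprocess


def run_one(flag, run=subprocess.run):
    """Rebuild with -O0 plus one optional flag, run the code, return its output."""
    # OPTFLAGS and ./benchmark are hypothetical names used for illustration.
    opt = "-O0" if flag is None else "-O0 " + flag
    run(["make", "clean"], check=False)  # a failed clean is harmless
    run(["make", "OPTFLAGS=" + opt], check=True)
    done = run(["./benchmark"], check=True, capture_output=True, text=True)
    return done.stdout.strip()


def sweep(flags, run=subprocess.run):
    """Yield (label, output) for the -O0 baseline, then for each flag alone."""
    yield "-O0", run_one(None, run)
    for flag in flags:
        yield flag, run_one(flag, run)
```
</pre>
<p>Looping over sweep(["-fauto-inc-dec", "-fdce"]) and printing each
label and output would give a listing shaped like the one below.
The run argument is there only so the build and run commands can
be swapped out when testing the loop itself.</p>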
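<p>I ran these on a "quiet" machine, so there was little competition
for processor time from other users. Only a few options help by
themselves, and not by much: the fastest single-option runs took
61.47 against the -O0 baseline of 61.75, a gain of under half a
percent, and every run landed within about 2% of the baseline.
That is probably because, according to the GCC manual, most
optimizations are disabled entirely at -O0 (or when no -O level is
given), even if the individual -f flags are specified. You may
want to look for combinations of options, ideally on top of -O1 or
higher, to get better speedup.</p>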
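<p>If you do want to try combinations, a small helper like the one
below can generate the flag strings to feed to the makefile. This
is only a sketch. The base level of -O1 is my assumption (the GCC
manual says most -f optimization flags are inert without an -O
level), and the function name is made up for illustration.</p>
<pre>
```python
from itertools import combinations


def flag_sets(flags, size=2, base="-O1"):
    """Return one flag string per combination of `size` flags, each on top of `base`."""
    # base="-O1" is an assumption; pass a different level to change it.
    return [" ".join((base,) + combo) for combo in combinations(flags, size)]
```
</pre>
<p>For example, flag_sets(["-fgcse", "-fdce", "-ftree-pre"]) gives
three strings, one for each pair of flags. The number of
combinations grows quickly with the size, so it is worth narrowing
the list of flags first.</p>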
<p><br>
</p>
<p><font face="Liberation Mono">dock:/tmp/pounds/condusty %
./testoptions.pl | tee dump | pace <br>
rm: cannot remove `*.o': No such file or directory<br>
make: *** [clean] Error 1<br>
100 61.750000000000000
386743.81287443032 0.81989322403772058 <br>
-fauto-inc-dec <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fbranch-count-reg <br>
100 61.930000000000000
386743.81287443032 0.81989322403772058 <br>
-fcombine-stack-adjustments <br>
100 61.729999999999997
386743.81287443032 0.81989322403772058 <br>
-fcompare-elim <br>
100 61.840000000000003
386743.81287443032 0.81989322403772058 <br>
-fcprop-registers <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fdce <br>
100 61.859999999999999
386743.81287443032 0.81989322403772058 <br>
-fdefer-pop <br>
100 61.820000000000000
386743.81287443032 0.81989322403772058 <br>
f951: Warning: this target machine does not have delayed branches<br>
f951: Warning: this target machine does not have delayed branches<br>
cc1: warning: this target machine does not have delayed branches<br>
-fdelayed-branch <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fdse <br>
100 61.890000000000001
386743.81287443032 0.81989322403772058 <br>
-fforward-propagate <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-fguess-branch-probability <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-fif-conversion2 <br>
100 61.840000000000003
386743.81287443032 0.81989322403772058 <br>
-fif-conversion <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
-finline-functions-called-once <br>
100 61.789999999999999
386743.81287443032 0.81989322403772058 <br>
-fipa-pure-const <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
-fipa-profile <br>
100 61.890000000000001
386743.81287443032 0.81989322403772058 <br>
-fipa-reference <br>
100 61.600000000000001
386743.81287443032 0.81989322403772058 <br>
-fmerge-constants <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-fmove-loop-invariants <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-freorder-blocks <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-fshrink-wrap <br>
100 61.579999999999998
386743.81287443032 0.81989322403772058 <br>
-fsplit-wide-types <br>
100 61.820000000000000
386743.81287443032 0.81989322403772058 <br>
-fssa-backprop <br>
100 62.020000000000003
386743.81287443032 0.81989322403772058 <br>
-fssa-phiopt <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-ftree-bit-ccp <br>
100 61.880000000000003
386743.81287443032 0.81989322403772058 <br>
-ftree-ccp <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-ftree-ch <br>
100 61.820000000000000
386743.81287443032 0.81989322403772058 <br>
-ftree-coalesce-vars <br>
100 61.469999999999999
386743.81287443032 0.81989322403772058 <br>
-ftree-copy-prop <br>
100 61.740000000000002
386743.81287443032 0.81989322403772058 <br>
-ftree-dce <br>
100 61.880000000000003
386743.81287443032 0.81989322403772058 <br>
-ftree-dominator-opts <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-dse <br>
100 61.890000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-forwprop <br>
100 61.759999999999998
386743.81287443032 0.81989322403772058 <br>
-ftree-fre <br>
100 61.930000000000000
386743.81287443032 0.81989322403772058 <br>
-ftree-phiprop <br>
100 61.820000000000000
386743.81287443032 0.81989322403772058 <br>
-ftree-sink <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-ftree-slsr <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-ftree-sra <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-ftree-pta <br>
100 61.890000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-ter <br>
100 62.880000000000003
386743.81287443032 0.81989322403772058 <br>
-funit-at-a-time <br>
100 61.840000000000003
386743.81287443032 0.81989322403772058 <br>
-fthread-jumps <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-falign-functions <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
-falign-jumps <br>
100 61.859999999999999
386743.81287443032 0.81989322403772058 <br>
-falign-loops <br>
100 61.880000000000003
386743.81287443032 0.81989322403772058 <br>
-falign-labels <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-fcaller-saves <br>
100 61.960000000000001
386743.81287443032 0.81989322403772058 <br>
-fcrossjumping <br>
100 61.829999999999998
386743.81287443032 0.81989322403772058 <br>
-fcse-follow-jumps <br>
100 61.740000000000002
386743.81287443032 0.81989322403772058 <br>
-fcse-skip-blocks <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-fdelete-null-pointer-checks <br>
100 62.140000000000001
386743.81287443032 0.81989322403772058 <br>
-fdevirtualize <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fdevirtualize-speculatively <br>
100 61.710000000000001
386743.81287443032 0.81989322403772058 <br>
-fexpensive-optimizations <br>
100 61.469999999999999
386743.81287443032 0.81989322403772058 <br>
-fgcse <br>
100 61.759999999999998
386743.81287443032 0.81989322403772058 <br>
-fgcse-lm <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-fhoist-adjacent-loads <br>
100 61.930000000000000
386743.81287443032 0.81989322403772058 <br>
-finline-small-functions <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-findirect-inlining <br>
100 61.759999999999998
386743.81287443032 0.81989322403772058 <br>
-fipa-cp <br>
100 61.969999999999999
386743.81287443032 0.81989322403772058 <br>
-fipa-cp-alignment <br>
100 61.810000000000002
386743.81287443032 0.81989322403772058 <br>
-fipa-sra <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-fipa-icf <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
-fisolate-erroneous-paths-dereference <br>
100 61.840000000000003
386743.81287443032 0.81989322403772058 <br>
-flra-remat <br>
100 61.780000000000001
386743.81287443032 0.81989322403772058 <br>
-foptimize-sibling-calls <br>
100 61.880000000000003
386743.81287443032 0.81989322403772058 <br>
-foptimize-strlen <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fpartial-inlining <br>
100 61.590000000000003
386743.81287443032 0.81989322403772058 <br>
-fpeephole2 <br>
100 61.759999999999998
386743.81287443032 0.81989322403772058 <br>
-freorder-blocks-algorithm=stc <br>
100 61.950000000000003
386743.81287443032 0.81989322403772058 <br>
-freorder-blocks-and-partition <br>
100 61.799999999999997
386743.81287443032 0.81989322403772058 <br>
-freorder-functions <br>
100 61.890000000000001
386743.81287443032 0.81989322403772058 <br>
-frerun-cse-after-loop <br>
100 61.950000000000003
386743.81287443032 0.81989322403772058 <br>
-fsched-interblock <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-fsched-spec <br>
100 61.899999999999999
386743.81287443032 0.81989322403772058 <br>
-fschedule-insns <br>
100 61.829999999999998
386743.81287443032 0.81989322403772058 <br>
-fschedule-insns2 <br>
100 61.909999999999997
386743.81287443032 0.81989322403772058 <br>
-fstrict-aliasing <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-fstrict-overflow <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-builtin-call-dce <br>
100 61.810000000000002
386743.81287443032 0.81989322403772058 <br>
-ftree-switch-conversion <br>
100 61.770000000000003
386743.81287443032 0.81989322403772058 <br>
-ftree-tail-merge <br>
100 61.750000000000000
386743.81287443032 0.81989322403772058 <br>
-ftree-pre <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-ftree-vrp <br>
100 61.820000000000000
386743.81287443032 0.81989322403772058 <br>
-fipa-ra <br>
100 61.930000000000000
386743.81287443032 0.81989322403772058 <br>
-finline-functions <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
-funswitch-loops <br>
100 61.789999999999999
386743.81287443032 0.81989322403772058 <br>
-fpredictive-commoning <br>
100 61.909999999999997
386743.81287443032 0.81989322403772058 <br>
-fgcse-after-reload <br>
100 61.850000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-loop-vectorize <br>
100 61.859999999999999
386743.81287443032 0.81989322403772058 <br>
-ftree-loop-distribute-patterns <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fsplit-paths <br>
100 61.750000000000000
386743.81287443032 0.81989322403772058 <br>
-ftree-slp-vectorize <br>
100 61.869999999999997
386743.81287443032 0.81989322403772058 <br>
-fvect-cost-model <br>
100 61.780000000000001
386743.81287443032 0.81989322403772058 <br>
-ftree-partial-pre <br>
100 61.789999999999999
386743.81287443032 0.81989322403772058 <br>
-fipa-cp-clone <br>
100 61.920000000000002
386743.81287443032 0.81989322403772058 <br>
dock:/tmp/pounds/condusty % <br>
</font><br>
</p>
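<p>To pick out the options that moved the time at all, the dump
file can be parsed and ranked against the baseline. The sketch
below assumes the dump holds only what went to standard output:
four numeric fields per result (dimension, time, two check values)
and option names starting with -f. Compiler warnings and make
errors go to standard error, so they should not be in it. The
function names are made up for illustration.</p>
<pre>
```python
def parse_dump(text):
    """Map each label to its time; the first result is taken as the -O0 baseline."""
    times = {}
    label = "-O0"
    fields = []
    for token in text.split():
        if token.startswith("-f"):
            label = token  # an option name starts a new trial
            fields = []
        else:
            fields.append(float(token))
            if len(fields) == 4:  # dimension, time, two check values
                times[label] = fields[1]
                fields = []
    return times


def rank(times):
    """Return (label, percent change from -O0) pairs, fastest first."""
    base = times["-O0"]
    rows = [(k, 100.0 * (t - base) / base) for k, t in times.items() if k != "-O0"]
    return sorted(rows, key=lambda row: row[1])
```
</pre>
<p>Calling rank(parse_dump(open("dump").read())) lists the options
from most negative (fastest) to most positive percent change.
Because the parser works on whitespace-separated tokens, it does
not matter whether a result sits on one line or is wrapped across
two.</p>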
<pre class="moz-signature" cols="72">--
Andrew J. Pounds, Ph.D. (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)
Professor of Chemistry and Computer Science
Mercer University, Macon, GA 31207 (478) 301-5627
<a class="moz-txt-link-freetext" href="http://faculty.mercer.edu/pounds_aj">http://faculty.mercer.edu/pounds_aj</a>
</pre>
</body>
</html>