<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
<br>
On 11/30/14 17:28, wrote:<br>
</div>
<blockquote
cite="mid:C40B2F181831EF44A88CD73525827803130CFB7EF9@MERCERMAIL.MercerU.local"
type="cite">
<div style="color:#000; background-color:#fff;
font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial,
Lucida Grande, sans-serif;font-size:16px">
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Just to be
sure of everything we're supposed to do, we're supposed to
get:</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr"><br>
</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">100 most
common non-special words, alphabetized</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Number of
times those words occur</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Connective
word percentage of the special words only</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Word
distribution of all words excluding proper nouns</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr"><br>
</div>
<div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Is that
everything? </div>
</div>
</blockquote>
<br>
That sounds about right. I have mine formatted like this:<br>
<br>
<tt>*** DOCUMENT ANALYZER ***</tt><tt><br>
</tt><tt><br>
</tt><tt>25 lines of text processed.</tt><tt><br>
</tt><tt><br>
</tt><tt>The 100 most used words</tt><tt><br>
</tt><tt> Word Occurrences Word Distribution
Index</tt><tt><br>
</tt><tt> are 3 18.67</tt><tt><br>
</tt><tt> be 2 72.00</tt><tt><br>
</tt><tt> can 5 47.20</tt><br>
.<br>
.<br>
.<br>
<br>
<tt>The connecting word index is 16.786</tt><tt><br>
</tt><br>
<br>
Of course, for the KJV.txt file the words and numbers are very
different. Using the internet connection at my house and a
six-year-old computer, my code processes the KJV file in just under
12 seconds. Don't stress over the time -- that number is simply
meant to let you know that it should not take FOREVER. <br>
<br>
<br>
Now, one thing that may differ between your code and mine is the
actual connecting word index. The differences I have seen in the
past came from people including cardinal numbers in their lists of
words. I do not include cardinal numbers in my list of words
because, especially in the case of Biblical translations, the verse
numbers could really throw off the word count.<br>
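<br>
To illustrate leaving numbers out of the word list, here is a minimal Python sketch. It is not the code behind the timing or the sample output above. The names <tt>count_words</tt> and <tt>top_words_alphabetized</tt>, the sample text, and the small special-word set are all made up for the example; the actual special-word list for the assignment is not given here.<br>

```python
import re
from collections import Counter

# A word is a run of letters, optionally with internal apostrophes.
# Digits are never part of a match, so verse numbers such as "1:1"
# drop out of the word list entirely.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def count_words(text, special_words=frozenset()):
    """Count lowercase words, skipping numbers and any special words."""
    words = WORD_RE.findall(text.lower())
    return Counter(w for w in words if w not in special_words)


def top_words_alphabetized(counts, n=100):
    """Return the n most frequent (word, count) pairs, sorted by word."""
    return sorted(counts.most_common(n))


# Hypothetical sample text and special-word list, for illustration only.
sample = (
    "1:1 In the beginning God created the heaven and the earth. "
    "1:2 And the earth was without form."
)
special = frozenset({"the", "and"})

counts = count_words(sample, special)
for word, occurrences in top_words_alphabetized(counts, 5):
    print(word, occurrences)
```

Selecting the most frequent words first and alphabetizing afterwards keeps the two steps separate, so the verse numbers never enter the counts that the ranking is based on.<br>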
<br>
<br>
<pre class="moz-signature" cols="72">--
Andrew J. Pounds, Ph.D. (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)
Professor of Chemistry and Computer Science
Mercer University, Macon, GA 31207 (478) 301-5627
<a class="moz-txt-link-freetext" href="http://faculty.mercer.edu/pounds_aj">http://faculty.mercer.edu/pounds_aj</a>
</pre>
</body>
</html>