<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix"><br>

      <br>

      On 11/30/14 17:28,  wrote:<br>

    </div>

    <blockquote

cite="mid:C40B2F181831EF44A88CD73525827803130CFB7EF9@MERCERMAIL.MercerU.local"

      type="cite">

      <div style="color:#000; background-color:#fff;

        font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial,

        Lucida Grande, sans-serif;font-size:16px">

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Just to be

          sure of everything we're supposed to do, we're supposed to

          get:</div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr"><br>

        </div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">100 most

          common non-special words, alphabetized</div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Number of

          times those words occur</div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Connective

          word percentage of the special words only</div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Word

          distribution of all words excluding proper nouns</div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr"><br>

        </div>

        <div id="yui_3_16_0_1_1417385155597_6555" dir="ltr">Is that

          everything? </div>

      </div>

    </blockquote>

    <br>

    That sounds about right.  I have mine formatted like this....<br>

    <br>

    <tt>*** DOCUMENT ANALYZER ***</tt><tt><br>

    </tt><tt><br>

    </tt><tt>25 lines of text processed.</tt><tt><br>

    </tt><tt><br>

    </tt><tt>The 100 most used words</tt><tt><br>

    </tt><tt>                Word     Occurrences     Word Distribution

      Index</tt><tt><br>

    </tt><tt>                 are            3          18.67</tt><tt><br>

    </tt><tt>                  be            2          72.00</tt><tt><br>

    </tt><tt>                 can            5          47.20</tt><br>

    .<br>

    .<br>

    .<br>

    <br>

    <tt>The connecting word index is 16.786</tt><tt><br>

    </tt><br>

    <br>
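    A minimal sketch of how a report in this layout could be printed from Python, assuming the counts and both indices have already been computed elsewhere. The function name <tt>format_report</tt>, its parameters, and the column widths are illustrative guesses taken from the sample above, not the actual program.<br>
    <br>

```python
def format_report(lines_processed, rows, connecting_index):
    """Build the report text from values that were computed elsewhere.

    rows is a list of (word, occurrences, distribution_index) tuples.
    The column widths (20, 13/16, 15/28) are assumptions read off the sample.
    """
    out = [
        "*** DOCUMENT ANALYZER ***",
        "",
        f"{lines_processed} lines of text processed.",
        "",
        f"The {len(rows)} most used words",
        # Header row, each label right-aligned in its column.
        "Word".rjust(20) + "Occurrences".rjust(16) + "Word Distribution Index".rjust(28),
    ]
    for word, occurrences, distribution_index in rows:
        out.append(
            word.rjust(20)
            + str(occurrences).rjust(13)
            + format(distribution_index, ".2f").rjust(15)
        )
    out.append("")
    out.append(f"The connecting word index is {connecting_index:.3f}")
    return "\n".join(out)


# The three sample rows shown in the message, passed in as ready-made values.
sample_rows = [("are", 3, 18.67), ("be", 2, 72.00), ("can", 5, 47.20)]
report = format_report(25, sample_rows, 16.786)
print(report)
```

    Keeping the formatting separate from the counting means the same function works for a short test file and for KJV.txt.<br>
    <br>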

    Of course, for the KJV.txt file the words and numbers are very
    different.  Using the internet connection at my house and a
    six-year-old computer, my code processes the KJV file in just under
    12 seconds.  Don't stress over the time -- that number is simply
    meant to let you know that it should not take FOREVER.<br>

    <br>

    <br>

    Now, one thing that may differ between your code and mine is the
    actual connecting word index.  The differences I have seen in the
    past came from people including cardinal numbers in their lists
    of words.  I do not include cardinal numbers in my list of words
    because, especially in the case of Biblical translations, the verse
    numbers could really throw off the word count.<br>
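    <br>
    A minimal sketch of one way to leave cardinal numbers out of the word list and then pull the 100 most common non-special words in alphabetical order. This is an illustration under assumptions, not the actual program: <tt>SPECIAL_WORDS</tt> is a made-up placeholder for the assignment's real list, and the function names are hypothetical.<br>
    <br>

```python
import re
from collections import Counter

# Hypothetical placeholder: the assignment's real special-word list goes here.
SPECIAL_WORDS = {"the", "and", "of", "a", "to", "in"}

# A word is a run of letters, optionally with internal apostrophes (don't).
# Tokens made of digits, such as verse numbers, never match, so they are skipped.
WORD_PATTERN = re.compile(r"\b[a-z]+(?:'[a-z]+)*\b")


def count_words(text):
    """Count every alphabetic word in the text, ignoring case and bare numbers."""
    return Counter(WORD_PATTERN.findall(text.lower()))


def top_words_alphabetized(counts, special_words=SPECIAL_WORDS, limit=100):
    """Return the `limit` most common non-special words as (word, count) pairs, A to Z."""
    candidates = Counter(
        {word: n for word, n in counts.items() if word not in special_words}
    )
    # most_common picks the top entries by count; sorted then alphabetizes them.
    return sorted(candidates.most_common(limit))


sample = (
    "1 In the beginning God created the heaven and the earth. "
    "2 And the earth was without form"
)
counts = count_words(sample)
result = top_words_alphabetized(counts)
for word, n in result:
    print(word.rjust(20) + str(n).rjust(13))
```

    Words tied at the cutoff are kept in whatever order <tt>most_common</tt> returns them, so two programs can legitimately disagree about the last few entries.<br>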

      <br>

    <br>

    <pre class="moz-signature" cols="72">-- 

Andrew J. Pounds, Ph.D.  (<a class="moz-txt-link-abbreviated" href="mailto:pounds_aj@mercer.edu">pounds_aj@mercer.edu</a>)

Professor of Chemistry and Computer Science

Mercer University,  Macon, GA 31207   (478) 301-5627

<a class="moz-txt-link-freetext" href="http://faculty.mercer.edu/pounds_aj">http://faculty.mercer.edu/pounds_aj</a>

</pre>

  </body>

</html>