[CSC 204] Everything

Andrew J. Pounds pounds_aj at mercer.edu
Sun Nov 30 18:26:24 EST 2014



On 11/30/14 17:28,  wrote:
> Just to be sure of everything we're supposed to do, we're supposed to get:
>
> 100 most common non-special words, alphabetized
> Number of times those words occur
> Connective word percentage of the special words only
> Word distribution of all words excluding proper nouns
>
> Is that everything?

That sounds about right.  I have mine formatted like this:

*** DOCUMENT ANALYZER ***

25 lines of text processed.

The 100 most used words
                 Word     Occurrences     Word Distribution Index
                  are            3          18.67
                   be            2          72.00
                  can            5          47.20
.
.
.

The connecting word index is 16.786


Of course, for the KJV.txt file the words and numbers are very 
different.  Using the internet connection at my house and a six-year-old 
computer, the KJV file is processed in just under 12 seconds using my 
code.  Don't stress over the time -- that number is simply meant to let 
you know that it should not take FOREVER.


Now, one thing that may differ between your code and mine is the actual 
connecting word index.   The differences I have seen in the past 
came from people including cardinal numbers in their lists of 
words.  I do not include cardinal numbers in my list of words 
because, especially in the case of Biblical translations, the verse 
numbers could really throw off the word count.
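
To illustrate the point about cardinal numbers, here is a minimal sketch 
in Python (not the code used for the numbers above).  The function name 
tokenize, the keep_numbers flag, and the rule for what counts as a word 
are all assumptions made for this example; it only drops tokens written 
entirely in digits, such as verse numbers, and leaves spelled-out 
numbers like "three" alone.

```python
import re
from collections import Counter

# Assumption: a token is a run of letters/digits, optionally with
# internal apostrophes (e.g. "don't"). Everything else is a separator.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def tokenize(text, keep_numbers=False):
    """Split text into lowercase tokens.

    When keep_numbers is False (the default), tokens made up only of
    digits, such as the "1" and "2" in a verse label "1:2", are dropped.
    """
    tokens = TOKEN_RE.findall(text.lower())
    if keep_numbers:
        return tokens
    return [t for t in tokens if not t.isdigit()]


if __name__ == "__main__":
    sample = (
        "1:1 In the beginning God created the heaven and the earth.\n"
        "1:2 And the earth was without form"
    )
    with_numbers = tokenize(sample, keep_numbers=True)
    without_numbers = tokenize(sample)
    # The verse labels add extra tokens to the total word count.
    print(len(with_numbers), len(without_numbers))
    print(Counter(without_numbers).most_common(3))
```

Because any index computed as a percentage of the total word count 
depends on that total, leaving the verse numbers in changes the 
denominator, which is one way two otherwise similar programs can 
report different values.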


-- 
Andrew J. Pounds, Ph.D.  (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Mercer University,  Macon, GA 31207   (478) 301-5627
http://faculty.mercer.edu/pounds_aj


