[CSC 204] Everything
Andrew J. Pounds
pounds_aj at mercer.edu
Sun Nov 30 18:26:24 EST 2014
On 11/30/14 17:28, wrote:
> Just to be sure of everything we're supposed to do, we're supposed to get:
>
> 100 most common non-special words, alphabetized
> Number of times those words occur
> Connective word percentage of the special words only
> Word distribution of all words excluding proper nouns
>
> Is that everything?
That sounds about right. I have mine formatted like this....
*** DOCUMENT ANALYZER ***
25 lines of text processed.
The 100 most used words
Word        Occurrences   Word Distribution Index
are              3               18.67
be               2               72.00
can              5               47.20
.
.
.
The connecting word index is 16.786
Of course, for the KJV.txt file the words and numbers are very
different. Using the internet connection at my house and a
six-year-old computer, my code processes the KJV file in just under
12 seconds. Don't stress over the time; that number is simply meant
to let you know that it should not take FOREVER.
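
As a rough illustration of the first two items in the report above (the
100 most used words, listed alphabetically with their occurrence
counts), here is a minimal sketch. Python is an assumption, since the
course language is not stated in this message, and the function name
top_words_alphabetized and the sample word list are hypothetical. It
does not compute the word distribution index or the connecting word
index, whose definitions are not given here.

```python
from collections import Counter

def top_words_alphabetized(words, n=100):
    """Return the n most frequent words as (word, count) pairs, sorted A-Z.

    Hypothetical helper: ties at the cutoff are broken by first appearance,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(words)
    most_common = counts.most_common(n)   # pick the n most frequent words
    return sorted(most_common)            # then alphabetize by word

# Made-up sample input, chosen to echo the counts in the sample report.
sample = ["can", "be", "are", "can", "are", "can",
          "be", "are", "can", "can", "the"]

for word, count in top_words_alphabetized(sample, n=3):
    print(f"{word:<12}{count:>6}")
```

Selecting by frequency first and alphabetizing afterwards matters: sorting
the whole vocabulary alphabetically and then taking the first 100 entries
would give a different list.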
Now, one thing that may differ between your code and mine is the actual
connecting word index. The differences I have seen in the past came
from people including cardinal numbers in their lists of words. I do
not include cardinal numbers in my list of words because, especially
in the case of Biblical translations, the verse numbers could really
throw off the word count.
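
One way to keep numbers out of the word list is sketched below. This is
only an illustration under stated assumptions: Python is assumed, the
helper name extract_words is hypothetical, "cardinal numbers" is taken
to mean tokens containing digits (such as verse references), and the
"1:1" prefix on the sample line is a guess at how KJV.txt marks verses.

```python
import string

def extract_words(text):
    """Split text into lowercase words, skipping any token that has a digit.

    Hypothetical helper: a token such as "1:1" or "40" is dropped entirely
    so that verse numbers never reach the word counts.
    """
    words = []
    for raw in text.split():
        # Trim leading/trailing punctuation and normalize case.
        token = raw.strip(string.punctuation).lower()
        if not token:
            continue                      # token was pure punctuation
        if any(ch.isdigit() for ch in token):
            continue                      # skip verse numbers and other numerals
        words.append(token)
    return words

# Assumed line layout; the real KJV.txt format may differ.
line = "1:1 In the beginning God created the heaven and the earth."
print(extract_words(line))
```

Spelled-out numbers ("seven", "forty") still pass through this filter;
whether those should count as words is a separate decision.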
--
Andrew J. Pounds, Ph.D. (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Mercer University, Macon, GA 31207 (478) 301-5627
http://faculty.mercer.edu/pounds_aj