[CSC 204] Cleaning up words...

Andrew J. Pounds pounds_aj at mercer.edu
Fri Nov 21 18:50:00 EST 2014


You all looked troubled today when I started talking about having to 
clean up the words so that they contain no punctuation symbols, etc.   
Here is an example of a simple static method, although not exhaustive, 
that you could use in your program.  It first drops a trailing 's, then 
converts each punctuation character to a space and trims the leading 
and trailing whitespace from the result.


private static String cleanUp( String word ){
    // Drop a trailing possessive 's (e.g. "Bob's" becomes "Bob")
    if (word.endsWith("'s")) word = word.substring(0, word.length()-2);

    // Turn each punctuation mark into a space, then trim the ends.
    // Marks in the middle of a word are left behind as spaces.
    word = word.replace('\'', ' ').trim();
    word = word.replace('[', ' ').trim();
    word = word.replace(']', ' ').trim();
    word = word.replace('.', ' ').trim();
    word = word.replace(',', ' ').trim();
    word = word.replace(';', ' ').trim();
    word = word.replace(':', ' ').trim();
    word = word.replace('!', ' ').trim();
    word = word.replace('?', ' ').trim();
    word = word.replace('(', ' ').trim();
    word = word.replace(')', ' ').trim();
    word = word.replace('"', ' ').trim();
    word = word.replace("--", "  ").trim();
    word = word.replace("`", " ").trim();

    return word;
}
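
To see how the method behaves on real input, here is a small sketch of 
one way it might be called.  The class name CleanUpDemo and the helper 
cleanLine are made up for illustration and are not part of the 
assignment; cleanUp is copied from above (without the private modifier) 
so the example compiles on its own.  It assumes words are separated by 
whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name is an assumption for illustration only.
class CleanUpDemo {

    // Copy of the cleanUp method from the message, so this file stands alone.
    static String cleanUp(String word) {
        if (word.endsWith("'s")) word = word.substring(0, word.length() - 2);
        word = word.replace('\'', ' ').trim();
        word = word.replace('[', ' ').trim();
        word = word.replace(']', ' ').trim();
        word = word.replace('.', ' ').trim();
        word = word.replace(',', ' ').trim();
        word = word.replace(';', ' ').trim();
        word = word.replace(':', ' ').trim();
        word = word.replace('!', ' ').trim();
        word = word.replace('?', ' ').trim();
        word = word.replace('(', ' ').trim();
        word = word.replace(')', ' ').trim();
        word = word.replace('"', ' ').trim();
        word = word.replace("--", "  ").trim();
        word = word.replace("`", " ").trim();
        return word;
    }

    // Hypothetical helper: split a line on whitespace, clean each token,
    // and keep only the tokens that are not empty after cleaning.
    static List<String> cleanLine(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            String cleaned = cleanUp(token);
            if (!cleaned.isEmpty()) {
                words.add(cleaned);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Hello, world, It, Bob, old, book]
        System.out.println(cleanLine("Hello, world! It's Bob's (old) book."));

        // Punctuation inside a word is replaced but not removed:
        // prints "don t"
        System.out.println(cleanUp("don't"));
    }
}
```

Note the second call: because every mark becomes a space and trim() only 
touches the ends, a word like "don't" comes back as "don t".  That is 
part of what "not exhaustive" means here, so check for it if your 
program depends on exact word matches.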

-- 
Andrew J. Pounds, Ph.D.  (pounds_aj at mercer.edu)
Professor of Chemistry and Computer Science
Mercer University,  Macon, GA 31207   (478) 301-5627
http://faculty.mercer.edu/pounds_aj
