Hi. I have to read any number of strings from a text file, pick out all the words, and convert them to lowercase.
I think I can do the lowercase conversion with the String method toLowerCase(), and I can import the file without problems. My problem is selecting only the words: tokens such as integers should not be included, and the hardest part is distinguishing a word with a punctuation mark (such as "," or ".") directly after it from one with the punctuation in the middle of the word.
E.g.
the sunny day. //This would be 3 words
the su.nny day //This would only be 2.
I can't figure out the code or logic behind the example above. My code so far is as follows:
Code:
import java.io.*;
import java.util.*;

public class Example
{
    public static void main(String[] args) throws IOException
    {
        //set up input stream
        BufferedReader filein =
            new BufferedReader(new FileReader("C:\\Program Files\\farrago.txt"));
        //set up output stream
        PrintWriter fileout =
            new PrintWriter(new FileWriter("C:\\Program Files\\fileout.txt"));
        String nextLine; //a line read from file
        StringTokenizer t; //the words within the line
        ArrayList<String> a = new ArrayList<String>(); //store all words here
        String s; //next word
        String letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        boolean isAWord; //true if string represents a Word
        int countGood=0, countBad=0;
        char d; //a character of the current token
        System.out.println("\nGo through text\n");
        //start
        while(true) //potentially infinite loop
        {
            nextLine = filein.readLine(); //read from input file
            if (nextLine == null) break; //if no more lines, get out
            t = new StringTokenizer(nextLine); //split the line on whitespace
            while (t.hasMoreTokens())
            {
                a.add(t.nextToken()); //add next token to list
            }
        } //finished reading file
        filein.close();
        //go through list
        for (int i=0; i<a.size(); i++)
        {
            s = a.get(i); //get next token
            isAWord = true; //assume it represents a Word
            d = s.charAt(0); //first character
            if (letters.indexOf(d) == -1 && d != ',' && d != '.' && d != ';' && d != ':' && d != '!' && d != '?')
            {
                isAWord = false; //NOT SURE IF THIS LOOP IS CORRECT?
            }
            else
            {
                for (int j=1; j<s.length(); j++)
                {
                    d = s.charAt(j); //character at index j
                    if (letters.indexOf(d) == -1)
                    {
                        isAWord = false; //it is not a Word
                        break; //no need to look further
                    }
                }
            }
            if (isAWord)
            {
                fileout.println(s);
                countGood++;
            }
            else countBad++;
        } //all tokens processed
        fileout.close(); //flush the output, otherwise the file may end up empty
        System.out.println("\nA total of " + a.size() + " tokens appeared\n");
        System.out.println(countGood + " accepted as words, " + countBad + " rejected\n");
        System.out.println("\nFinished.\n");
    }
}
Any help really would be appreciated, as my book and lecture notes can't seem to help me. If anyone does help, could they try to explain it too, please?
Thanks
James
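
One possible way to express the rule from the example is sketched below. It is only a sketch, and it assumes that a token counts as a word when it consists entirely of letters once any punctuation directly after it (".", ",", ";", ":", "!", "?") has been stripped. Punctuation anywhere else in the token, or any digit, disqualifies it. The class name WordFilter and the helpers extractWord and extractWords are made up for illustration and are not part of the original program. Character.isLetter also accepts non-English letters, which may or may not be wanted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class WordFilter {
    // Hypothetical helper: returns the token as a lowercase word,
    // or null if the token does not qualify as a word.
    public static String extractWord(String token) {
        int end = token.length();
        // Ignore punctuation that directly follows the word, e.g. "day." -> "day"
        while (end > 0 && ".,;:!?".indexOf(token.charAt(end - 1)) != -1) {
            end--;
        }
        if (end == 0) {
            return null; // the token was nothing but punctuation
        }
        // Everything before the trailing punctuation must be a letter,
        // so "su.nny" and "42" are rejected here.
        for (int i = 0; i < end; i++) {
            if (!Character.isLetter(token.charAt(i))) {
                return null;
            }
        }
        return token.substring(0, end).toLowerCase();
    }

    // Hypothetical helper: splits a line on whitespace and keeps only the words.
    public static List<String> extractWords(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer t = new StringTokenizer(line);
        while (t.hasMoreTokens()) {
            String word = extractWord(t.nextToken());
            if (word != null) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extractWords("The sunny day."));    // prints [the, sunny, day]
        System.out.println(extractWords("the su.nny day 42")); // prints [the, day]
    }
}
```

The main difference from the posted loop is where punctuation is allowed: the posted code accepts a punctuation mark as the first character and rejects it everywhere else, so "day." is counted as bad. Stripping the trailing punctuation first and then checking the remaining characters gives 3 words for the first example line and 2 for the second.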