The Java Developers Almanac 1.4



e47. Tokenizing Java Source Code

StreamTokenizer (in java.io) can split a Java source file into tokens for simple parsing. It can be configured to recognize and skip Java-style comments (// and /* ... */), and it handles quoted strings, including backslash escapes.
    // Requires java.io.FileReader, java.io.IOException and java.io.StreamTokenizer
    try {
        // Create the tokenizer to read from a file
        FileReader rd = new FileReader("filename.java");
        StreamTokenizer st = new StreamTokenizer(rd);
    
        // Prepare the tokenizer for Java-style tokenizing rules
        st.parseNumbers();
        st.wordChars('_', '_');
        st.eolIsSignificant(true);
    
        // Optional: to have whitespace characters returned as tokens
        // instead of being discarded, make this call
        st.ordinaryChars(0, ' ');
    
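        // By default '/' starts a comment that runs to the end of the line;
        // make it ordinary so that a lone '/' is returned as a token
        st.ordinaryChar('/');
        // These calls cause Java-style comments to be discarded
        st.slashSlashComments(true);
        st.slashStarComments(true);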
    
        // Parse the file; read each token in the loop condition so that
        // the first token is processed as well
        int token;
        while ((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
            switch (token) {
            case StreamTokenizer.TT_NUMBER:
                // A number was found; the value is in nval
                double num = st.nval;
                break;
            case StreamTokenizer.TT_WORD:
                // A word was found; the value is in sval
                String word = st.sval;
                break;
            case '"':
                // A double-quoted string was found; sval contains the contents
                String dquoteVal = st.sval;
                break;
            case '\'':
                // A single-quoted string was found; sval contains the contents
                String squoteVal = st.sval;
                break;
            case StreamTokenizer.TT_EOL:
                // End of line character found
                break;
            case StreamTokenizer.TT_EOF:
                // End of file has been reached
                break;
            default:
                // A regular character was found; the value is the token itself
                char ch = (char)st.ttype;
                break;
            }
        }
        rd.close();
    } catch (IOException e) {
        // The file could not be opened or read; report or handle the error here
    }
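
The fragment above can be tried without a file by reading from a string. The following is a minimal sketch, not part of the original example: the class name JavaTokenDemo, the helper tokenize and the "WORD"/"NUMBER" labels are made up for illustration, and the tokenizer is configured the same way except that whitespace and line ends are discarded.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class JavaTokenDemo {
    // Hypothetical helper: returns one labeled string per token read from rd
    static List<String> tokenize(Reader rd) throws IOException {
        StreamTokenizer st = new StreamTokenizer(rd);
        st.wordChars('_', '_');        // allow underscores inside words
        st.ordinaryChar('/');          // '/' is a comment character by default
        st.slashSlashComments(true);   // skip // comments
        st.slashStarComments(true);    // skip /* */ comments

        List<String> tokens = new ArrayList<String>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
            case StreamTokenizer.TT_NUMBER:
                tokens.add("NUMBER " + st.nval);
                break;
            case StreamTokenizer.TT_WORD:
                tokens.add("WORD " + st.sval);
                break;
            case '"':
                tokens.add("STRING " + st.sval);
                break;
            case '\'':
                tokens.add("CHAR " + st.sval);
                break;
            default:
                // Any other character is returned as its own token
                tokens.add("SYMBOL " + (char) st.ttype);
                break;
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        String src = "int my_var = 42; // answer\n"
                   + "/* block */ String s = \"hi\";";
        for (String t : tokenize(new StringReader(src))) {
            System.out.println(t);
        }
    }
}
```

Running main prints one line per token. Neither comment appears in the output, and the number is reported as a double because nval is always a double.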

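One detail worth checking when comments are enabled is how a lone '/' is treated. A newly constructed StreamTokenizer treats '/' as a comment character, so a division operator can swallow the rest of its line unless '/' is first made ordinary. The sketch below illustrates the difference; the class SlashDemo and the helper countTokens are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class SlashDemo {
    // Hypothetical helper: counts the tokens in src, optionally making
    // '/' an ordinary character before enabling Java-style comments
    static int countTokens(String src, boolean resetSlash) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(src));
        if (resetSlash) {
            st.ordinaryChar('/');
        }
        st.slashSlashComments(true);
        st.slashStarComments(true);

        int count = 0;
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String src = "a / b;";
        System.out.println("default '/':  " + countTokens(src, false));
        System.out.println("ordinary '/': " + countTokens(src, true));
    }
}
```

With the default setting only the word before the slash is returned; with '/' made ordinary the word, the slash, the second word and the semicolon are all returned as tokens.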


© 2002 Addison-Wesley.