Tokenizer should return a token type?

Started by ollydbg, April 13, 2010, 04:40:37 PM

ollydbg

Several months ago I wrote this suggestion here: Re: New code completion remarks/issues

Today, I found more evidence, see here:

http://www.macs.hw.ac.uk/~alison/alg/lectures/l7.pdf

Here is the text from this PDF:
Quote: The tokeniser should extract both the text (lexeme) and the class of the item. So, a token for "35" should contain the text (35) and the type NUM.

Also:
Quote: So.. a suitable datatype for a token will be a struct or class such as the following:
struct token {
    char* text;
    tokentype type;
};
where tokentype is an enumerated type specifying possible types of token.

So, a token type is necessary. :D
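
To make the idea concrete, here is a minimal sketch of a tokenizer that returns both the lexeme and its type. It is only an illustration and not the Code::Blocks Tokenizer: the names TokenType, Token and tokenize, and the three token classes, are my own assumptions (the quoted slides only mention NUM).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token classes; the quoted slides only name NUM.
enum class TokenType { Num, Ident, Symbol };

// A token carries both the text (lexeme) and its class.
struct Token {
    std::string text;
    TokenType type;
};

// Splits the input into tokens, skipping whitespace.
// Assumption: digits form Num, letters/underscore start an Ident,
// and any other single character is a Symbol.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }

        std::size_t start = i;
        TokenType type;
        if (std::isdigit(c)) {
            while (i < src.size() &&
                   std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            type = TokenType::Num;
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) ++i;
            type = TokenType::Ident;
        } else {
            ++i;  // one-character symbol
            type = TokenType::Symbol;
        }
        out.push_back({src.substr(start, i - start), type});
    }
    return out;
}
```

With this, the parser can ask for token.type directly instead of inspecting the text again. For example, tokenize("x = 35;") gives four tokens, and the third one has text "35" and type TokenType::Num.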


By the way, the site The Mini C++ Interpreter gives a simple example of a C++ parser. There is a chapter describing this mini C++ interpreter in the book "The Art of C++".

If some pieces of memory should be reused, turn them into variables (or const variables).
If some operations should be reused, turn them into functions.
If they happen together, then turn them into classes.