bitwise operation on enum types

Started by ollydbg, November 15, 2013, 03:59:26 PM

ollydbg

We have this enum type definition:

enum TokenKind
{
    // changed in order to reflect the priority
    tkNamespace     = 0x0001,
    tkClass         = 0x0002,
    tkEnum          = 0x0004,
    tkTypedef       = 0x0008, // typedefs are stored as classes inheriting from the typedef'd type (taking advantage of existing inheritance code)
    tkConstructor   = 0x0010,
    tkDestructor    = 0x0020,
    tkFunction      = 0x0040,
    tkVariable      = 0x0080,
    tkEnumerator    = 0x0100,
    tkPreprocessor  = 0x0200,
    tkMacro         = 0x0400,

    // convenient masks
    tkAnyContainer  = tkClass    | tkNamespace   | tkTypedef,
    tkAnyFunction   = tkFunction | tkConstructor | tkDestructor,

    // undefined or just "all"
    tkUndefined     = 0xFFFF
};


But in some cases, we need a mask, like:

Token* TokenExists(const wxString& name, const Token* parent = 0, short int kindMask = 0xFFFF);


Note that the mask type here is short int. TokenKind cannot be used as the parameter type, because then a call such as:

TokenExists(, , tkTypedef | tkClass);

fails to compile with an error like: error: invalid conversion from 'int' to enum type TokenKind.
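
To illustrate why the call is rejected, here is a minimal sketch (C++11, for static_assert and decltype). It uses a cut-down TokenKind, and AcceptsMask is a hypothetical stand-in for a function taking the enum as its mask type, not anything from the real code base.

```cpp
#include <cassert>
#include <type_traits>

enum TokenKind
{
    tkClass   = 0x0002,
    tkTypedef = 0x0008
};

// Without a user-defined operator|, both operands are promoted to an integer
// type before the OR, so the result is not a TokenKind.
static_assert(!std::is_same<decltype(tkTypedef | tkClass), TokenKind>::value,
              "the OR of two enumerators is an integer, not a TokenKind");

// Hypothetical stand-in for a function that takes the enum as its mask type.
inline bool AcceptsMask(TokenKind kindMask)
{
    return (kindMask & tkClass) != 0;
}

inline void MaskDemo()
{
    // AcceptsMask(tkTypedef | tkClass);  // error: invalid conversion from 'int' to 'TokenKind'
    assert(AcceptsMask(static_cast<TokenKind>(tkTypedef | tkClass)));  // explicit cast compiles
}
```

The commented-out call is the one that triggers the error; the explicit cast is what the compiler wants to see instead.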

There are many discussions about this:
on enum and bitwise operation
How to use enums as flags in C++?
and more.

So, what's your opinion on this? I think using TokenKind for the function argument is better.
Do we use:
TokenExists(, , static_cast<TokenKind>(tkTypedef | tkClass));

Or we can define some bitwise operators, as in this post: http://stackoverflow.com/a/1448478/154911

enum AnimalFlags
{
    HasClaws = 1,
    CanFly = 2,
    EatsFish = 4,
    Endangered = 8
};

inline AnimalFlags operator|(AnimalFlags a, AnimalFlags b)
{return static_cast<AnimalFlags>(static_cast<int>(a) | static_cast<int>(b));}
...
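
A self-contained sketch of how such a set of operators could look. Only operator| is taken from the snippet above; operator&, operator|= and the HasAll helper are my own additions for illustration and are not from the linked answer.

```cpp
#include <cassert>

enum AnimalFlags
{
    HasClaws   = 1,
    CanFly     = 2,
    EatsFish   = 4,
    Endangered = 8
};

inline AnimalFlags operator|(AnimalFlags a, AnimalFlags b)
{
    return static_cast<AnimalFlags>(static_cast<int>(a) | static_cast<int>(b));
}

inline AnimalFlags operator&(AnimalFlags a, AnimalFlags b)
{
    return static_cast<AnimalFlags>(static_cast<int>(a) & static_cast<int>(b));
}

inline AnimalFlags& operator|=(AnimalFlags& a, AnimalFlags b)
{
    a = a | b;
    return a;
}

// Hypothetical helper: true if every bit of 'wanted' is set in 'flags'.
inline bool HasAll(AnimalFlags flags, AnimalFlags wanted)
{
    return (flags & wanted) == wanted;
}
```

With these in place, an expression like HasClaws | CanFly keeps the type AnimalFlags, so it can be passed to a function taking AnimalFlags without a cast at the call site.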


If some piece of memory should be reused, turn them to variables (or const variables).
If some piece of operations should be reused, turn them to functions.
If they happened together, then turn them to classes.

ollydbg

I see some code in CC use type conversion, like:
            else if (command == cmdSearchAll)
                tree->FindMatches(args, result, true, false, TokenKind(kindToSearch));
            else
                tree->FindMatches(args, result, true, false, TokenKind(tkAnyContainer|tkEnum));

The function prototype is:
   size_t FindMatches(const wxString& query, TokenIdxSet& result, bool caseSensitive, bool is_prefix, TokenKind kindMask = tkUndefined);

thomas

It is strictly legal for an enumeration value to be a value that is not part of the enumeration definition, as long as it fits the storage size. Therefore, I would just add either an operator| or a conversion operator. Preferably the former, as it is more type-safe (a conversion operator would totally annihilate the type system, since it converts any integer, not just one composed of enum values).

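Something like TokenKind operator|(TokenKind a, TokenKind b) { return (TokenKind) ((int) a | (int) b); } should work. (The operands have to be cast to int first, otherwise a|b inside the body would call this same operator again and recurse.)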
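
A minimal sketch of that idea applied to a shortened TokenKind. KindMatches is a hypothetical stand-in for a function such as FindMatches that takes a TokenKind mask; its body is made up for the example.

```cpp
#include <cassert>

enum TokenKind
{
    tkNamespace = 0x0001,
    tkClass     = 0x0002,
    tkEnum      = 0x0004,
    tkTypedef   = 0x0008,
    tkUndefined = 0xFFFF
};

// Cast to int before OR-ing: applying | directly to the two TokenKind
// operands in here would select this very operator again and recurse forever.
inline TokenKind operator|(TokenKind a, TokenKind b)
{
    return static_cast<TokenKind>(static_cast<int>(a) | static_cast<int>(b));
}

// Hypothetical stand-in for a function that takes a TokenKind mask.
inline bool KindMatches(TokenKind kind, TokenKind kindMask = tkUndefined)
{
    return (static_cast<int>(kind) & static_cast<int>(kindMask)) != 0;
}
```

A call such as KindMatches(tkClass, tkTypedef | tkClass) then compiles without any cast at the call site.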
"We should forget about small efficiencies, say about 97% of the time: Premature quotation is the root of public humiliation."

oBFusCATed

Quote from: thomas on November 15, 2013, 05:01:12 PM
It is strictly legal for an enumeration value to be a value that is not part of the enumeration definition, as long as it fits the storage size.
Can you quote the standard?

@ollydbg:
I don't see where the problem with using a short int parameter comes from.

This compiles without errors on GCC 4.4:

enum TokenKind
{
    // changed in order to reflect the priority
    tkNamespace     = 0x0001,
    tkClass         = 0x0002,
    tkEnum          = 0x0004,
    tkTypedef       = 0x0008, // typedefs are stored as classes inheriting from the typedef'd type (taking advantage of existing inheritance code)
    tkConstructor   = 0x0010,
    tkDestructor    = 0x0020,
    tkFunction      = 0x0040,
    tkVariable      = 0x0080,
    tkEnumerator    = 0x0100,
    tkPreprocessor  = 0x0200,
    tkMacro         = 0x0400,

    // convenient masks
    tkAnyContainer  = tkClass    | tkNamespace   | tkTypedef,
    tkAnyFunction   = tkFunction | tkConstructor | tkDestructor,

    // undefined or just "all"
    tkUndefined     = 0xFFFF
};

void func(int a, short int b=tkUndefined) {
    int c;
    c=a+b;
}
int main() {
    func(1, tkTypedef | tkClass);
    return 0;
}
(most of the time I ignore long posts)
[strangers don't send me private messages, I'll ignore them; post a topic in the forum, but first read the rules!]

ollydbg

@OBF
Yes, the code builds OK.
I don't know whether "short int" has the same size as TokenKind.
Currently, the largest enumerator in TokenKind is 0xFFFF (16 bits), but what about "short int"? People will be confused, whereas if we put TokenKind in the function parameter, they don't have to worry about different types.

oBFusCATed

Quote from: ollydbg on November 17, 2013, 06:49:36 AM
They don't worry about different types.
I worry, because as far as I know a variable of an enumeration type must only be assigned its own enumerators. If you need a cast to assign, then it is no different from using int.
You can test the size of TokenKind with sizeof. I suppose it will give you 4 bytes, because the underlying type is either int or unsigned int.
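
A small sketch of how the sizes could be checked, using a shortened TokenKind. The underlying type of such an enum is implementation-defined, so the exact sizeof result depends on the compiler and its flags. MaskFitsInShort is a hypothetical helper added for the example. Note that 0xFFFF does not fit into a 16-bit signed short, so a short int parameter initialised with it goes through an implementation-defined conversion (before C++20), typically ending up as -1.

```cpp
#include <cassert>
#include <climits>

enum TokenKind
{
    tkNamespace = 0x0001,
    tkMacro     = 0x0400,
    tkUndefined = 0xFFFF
};

// The underlying type may not be larger than int as long as every
// enumerator fits in an int or unsigned int, which is the case here.
static_assert(sizeof(TokenKind) <= sizeof(int),
              "TokenKind should not need more storage than int");

// Hypothetical helper: does the "all" mask fit into a signed short as-is?
// With a 16-bit short (SHRT_MAX == 32767) it does not.
inline bool MaskFitsInShort()
{
    return tkUndefined <= SHRT_MAX;
}
```

Printing sizeof(TokenKind) and sizeof(short) on the target compiler would settle the question for a given platform.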

thomas

Quote from: oBFusCATed on November 15, 2013, 05:46:13 PM
Quote from: thomas on November 15, 2013, 05:01:12 PM
It is strictly legal for an enumeration value to be a value that is not part of the enumeration definition, as long as it fits the storage size.
Can you quote the standard?
I could, but I won't. That would be 15 or 20 minutes wasted on searching, which is kind of pointless.

thomas

There you go, second-to-last sentence of §7.2 par 8:
Quote:
It is possible to define an enumeration that has values not defined by any of its enumerators.

oBFusCATed

Hm, if I understand §7.2 par 8 correctly, then you're wrong, because this paragraph explicitly forbids int to enum conversions.

Here is the example taken from the standard:

enum color { red, yellow, green=20, blue };
color col=red; // ok
color c=1; // error - type mismatch, no conversion from int to color
int i=yellow; //ok, yellow converted to integral value 1

thomas

That's an implicit conversion. It's illegal, and for a good reason (because it is almost certainly a programming error).
C++11 type-safe enumerations even go one step further and also forbid implicit conversions in the other direction.
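
To make the implicit/explicit distinction concrete, a short sketch built on the color example from the standard. The scoped enum shade and the function ConversionDemo are hypothetical names added only for comparison.

```cpp
#include <cassert>

enum color { red, yellow, green = 20, blue };

// C++11 scoped enumeration, added here only for comparison.
enum class shade { light, dark };

inline void ConversionDemo()
{
    // color c = 1;                    // error: no implicit conversion from int to color
    color c = static_cast<color>(1);   // explicit conversion is allowed
    assert(c == yellow);

    int i = blue;                      // implicit enum -> int is fine for unscoped enums
    assert(i == 21);

    // int j = shade::dark;            // error: scoped enums do not convert implicitly
    int j = static_cast<int>(shade::dark);
    assert(j == 1);
}
```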

The standard also states (one or two paragraphs earlier, I think) that the values of an enumeration are all values in the range between the smallest and largest value defined. In other words, in enum{a, b=500}; any value between 0 and 500 inclusive is well-defined. I seem to remember wording like "if representable by the underlying storage size" too, although I can't provide a reference for that from memory.

If one is pedantic, the wording on the min/max range doesn't include the bitwise-or of arbitrary enum values (though it does include the bitwise-or of any but the biggest-log2 values or, in the case of a bit-flag type of enum, any operation not involving the largest value).
That's obvious, since biggest|some_other_value >= biggest. But if one is pedantic, it also doesn't say the opposite: the wording is "the values are", not "all legitimate values are".
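
For comparison, as far as I can tell from the C++03/C++11 text ([dcl.enum]), the range for an enumeration without a fixed underlying type is defined via the smallest bit-field that can hold all enumerators, i.e. the upper bound is the smallest value of the form 2^M - 1 that is not below the largest enumerator. Under that reading, enum{a, b=500} would cover 0 to 511, and OR-ing bit-flag enumerators never leaves the range. A sketch of that bound for non-negative enumerators (LargestEnumValue is a hypothetical helper, not a standard function):

```cpp
#include <cassert>

// Smallest value of the form 2^M - 1 that is >= emax.
// Only meaningful for enumerations whose enumerators are all non-negative.
inline unsigned long LargestEnumValue(unsigned long emax)
{
    unsigned long bmax = 0;
    while (bmax < emax)
        bmax = (bmax << 1) | 1;  // 0, 1, 3, 7, 15, ...
    return bmax;
}
```

For example, LargestEnumValue(500) gives 511, and for a flag enum whose largest single flag is 0x0400 it gives 0x07FF, which is exactly the OR of all flags up to that one.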

Still, it explicitly allows almost exactly the case you're after, and if the "almost" bit really bothers you, you can always add one more enumerator to each enumeration that is twice the value of the otherwise biggest one. Then there is no way someone could claim, even theoretically or in a contrived case, that this isn't explicitly allowed.