
Is this a safe way to implement a generic operator== and operator<?

After seeing this question, my first thought was that it'd be trivial to define generic equivalence and relational operators:

#include <cstring>

template<class T>
bool operator==(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

template<class T>
bool operator<(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) < 0;
}

using namespace std::rel_ops would then become even more useful, since it would be made fully generic by the default implementations of operators == and <. Obviously this does not perform a memberwise comparison, but instead a bitwise one, as though the type contains only POD members. This is not entirely consistent with how C++ generates copy constructors, for instance, which do perform memberwise copying.

But I wonder whether the above implementation is indeed safe. The structures would naturally have the same packing, being of the same type, but are the contents of the padding guaranteed to be identical (e.g., filled with zeros)? Are there any reasons why or situations in which this wouldn't work?


No. For example, if T is float, double, or long double, your operator== doesn't work right. Two NaNs should never compare as equal, even if they have the identical bit pattern (in fact, one common method of detecting a NaN is to compare the number to itself: if it's not equal to itself, it's a NaN). Likewise, distinct bit patterns can represent equal values: under IEEE 754, +0.0 and -0.0 differ in the sign bit but must compare equal, and some older non-IEEE formats treat a number whose exponent bits are all 0 as exactly 0.0 regardless of what bits are set in the significand.
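To make the floating-point point concrete, here is a minimal sketch, assuming IEEE 754 doubles and no fast-math style compiler flags. The names bitwise_equal, nan_case_disagrees, and zero_case_disagrees are hypothetical helpers, not anything from the question. Each check returns true when the bitwise comparison and the numeric comparison disagree.

```cpp
#include <cstring>
#include <limits>

// Hypothetical helper: the bitwise "equality" from the question, under a
// different name so it does not hijack the built-in operator==.
template <class T>
bool bitwise_equal(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

// A NaN and its copy share a bit pattern, yet numeric == says "not equal".
inline bool nan_case_disagrees() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double copy = nan;
    return bitwise_equal(nan, copy) && !(nan == copy);
}

// +0.0 and -0.0 differ in the sign bit, yet numeric == says "equal".
inline bool zero_case_disagrees() {
    const double pos = 0.0;
    const double neg = -0.0;
    return (pos == neg) && !bitwise_equal(pos, neg);
}
```

On such a platform both checks should return true, so the bitwise version is wrong in both directions: it reports equality where there is none, and inequality where the values are equal.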

Your operator< has even less chance of working correctly. For example, consider a typical implementation of std::string that looks something like this:

template <class charT>
class string { 
    charT *data;
    size_t length;
    size_t buffer_size;
public:
    // ...
};

With this ordering of the members, your operator< will do its comparison based on the addresses of the buffers where the strings happen to have stored their data. If, for example, it happened to have been written with the length member first, your comparison would use the lengths of the strings as the primary keys. In any case, it won't do a comparison based on the actual string contents, because it will only ever look at the value of the data pointer, not whatever it points at, which is what you really want/need.
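A small sketch of the same problem, using a hypothetical Buf struct as a stand-in for a string class (real std::string implementations are more involved). The bitwise comparison only sees the pointer value and the length, while the content comparisons follow the pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a string class: a pointer plus a length.
struct Buf {
    const char* data;
    std::size_t length;
};

// Bitwise comparison as in the question: compares the pointer value and
// the length, never the characters the pointer refers to.
inline bool bitwise_equal(const Buf& a, const Buf& b) {
    return std::memcmp(&a, &b, sizeof(Buf)) == 0;
}

// Content equality: same length and same characters.
inline bool content_equal(const Buf& a, const Buf& b) {
    return a.length == b.length && std::memcmp(a.data, b.data, a.length) == 0;
}

// Content ordering: lexicographic comparison of the characters.
inline bool content_less(const Buf& a, const Buf& b) {
    return std::lexicographical_compare(a.data, a.data + a.length,
                                        b.data, b.data + b.length);
}
```

Two Buf objects that refer to separate arrays holding "hello" are content_equal but not bitwise_equal, because the arrays live at different addresses.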

Edit: As far as padding goes, there's no requirement that the contents of padding be equal. It's also theoretically possible for padding to hold some sort of trap representation that will cause a signal, throw an exception, or something on that order, if you read it as the wrong type. To avoid such trap representations, you need to look at it as a buffer of unsigned chars. memcmp is specified to interpret each byte as an unsigned char, so it won't trap, but the padding values it compares are still unspecified.

Also note that being the same type of object does not necessarily mean they use the same alignment of members. That's a common method of implementation, but it's also entirely possible for a compiler to do something like using different alignments based on how often it "thinks" a particular object will be used, and include a tag of some sort in the object (e.g., a value written into the first padding byte) that tells the alignment for this particular instance. Likewise, it could segregate objects by (for example) address, so an object located at an even address has 2-byte alignment, one at an address that's a multiple of four has 4-byte alignment, and so on (this can't be used for POD types, but otherwise, all bets are off).

Neither of these is likely or common, but offhand I can't think of anything in the standard that prohibits them either.


Never do this unless you're 100% sure about the memory layout and compiler behavior, you really don't care about portability, and you really need the efficiency.



Even for POD types, operator== can be wrong. This is due to padding inserted for alignment in structures like the following one, which takes 8 bytes on my compiler.

class Foo {
  char foo; // three padding bytes between foo and bar
  int bar;
};
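A rough way to see this, sketched with a struct version of Foo so the members are accessible. The helpers fill_foo, memberwise_equal, and bitwise_equal are hypothetical names. Filling the storage with a byte value first usually leaves that value in the padding, though nothing in the standard promises it.

```cpp
#include <cstring>

// Same layout as the class above, as a struct so members are public.
struct Foo {
    char foo;  // typically followed by padding so that bar is aligned
    int bar;
};

// Fills the whole object with `fill`, then sets the members. The padding
// bytes usually keep the fill value, but that is not guaranteed.
inline void fill_foo(Foo& f, unsigned char fill, char foo, int bar) {
    std::memset(&f, fill, sizeof f);
    f.foo = foo;
    f.bar = bar;
}

// Memberwise equality: ignores padding entirely.
inline bool memberwise_equal(const Foo& a, const Foo& b) {
    return a.foo == b.foo && a.bar == b.bar;
}

// Bitwise equality as in the question: padding bytes take part.
inline bool bitwise_equal(const Foo& a, const Foo& b) {
    return std::memcmp(&a, &b, sizeof(Foo)) == 0;
}
```

If one Foo is filled with 0x00 and another with 0xFF before both get the same member values, memberwise_equal returns true, while bitwise_equal will typically return false because of the leftover padding bytes.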


That's highly dangerous because the compiler will use these definitions not only for plain old structs, but also for any classes, however complex, for which you forgot to define == and < properly.

One day, it will bite you.
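If you want a bitwise comparison anyway, one way to limit the damage is to make it a named function that refuses to compile for types where it cannot be right. The sketch below assumes C++17 and uses std::has_unique_object_representations; the names bitwise_equal, Packed, and Padded are hypothetical. The trait rejects types with padding and, on the implementations I'm aware of, floating-point types. It cannot know that a pointer member ought to be compared by what it points at, so that case is still on you.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical helper: bitwise equality that only compiles when every
// value of T has exactly one byte representation (C++17 trait).
template <class T>
bool bitwise_equal(const T& a, const T& b) {
    static_assert(std::has_unique_object_representations_v<T>,
                  "T may have padding or non-unique representations; "
                  "compare memberwise instead");
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

struct Packed { int x; int y; };   // no padding on typical platforms
struct Padded { char c; int i; };  // usually has padding after c
```

With this, bitwise_equal on two Packed objects compiles and behaves like a memberwise comparison, while calling it on Padded fails at compile time on typical platforms instead of silently comparing padding.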


A lot can depend on your definition of equivalence, e.g. if any of the members that you are comparing within your classes are floating point numbers.

The above implementation may treat two doubles as not equal even though they came from the same mathematical calculation with the same inputs, because the two computations may have produced two very similar numbers rather than exactly the same output.

Typically such numbers should be compared numerically with an appropriate tolerance.
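A minimal sketch of such a comparison, assuming a combined relative and absolute tolerance. The function name nearly_equal and the default tolerances are illustrative choices, not recommended values; the right tolerance depends on the calculation.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b differ by no more than a relative
// tolerance (scaled by the larger magnitude) or an absolute floor.
inline bool nearly_equal(double a, double b,
                         double rel_tol = 1e-9, double abs_tol = 1e-12) {
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```

For instance, nearly_equal(0.1 + 0.2, 0.3) is true, whereas a bitwise comparison of those two doubles would normally report them as different.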


Any struct or class containing even a single pointer will instantly fail any sort of meaningful comparison. Those operators will ONLY work for a class that is Plain Old Data, or POD. Other answers correctly point out that floating-point members and padding bytes are cases where even that won't hold true.

Short answer: If this were a smart idea, the language would provide it by default, as it does copy constructors and assignment operators.
