Why only integral enums?

I've been writing C# for seven years now, and I keep wondering: why do enums have to be of an integral type? Wouldn't it be nice to do something like:

enum ErrorMessage 
{ 
     NotFound: "Could not find",
     BadRequest: "Malformed request"
}

Is this a language design choice, or are there fundamental incompatibilities on a compiler, CLR, or IL level?

Do other languages have enums with string or complex (i.e. object) types? What languages?

(I'm aware of workarounds; my question is, why are they needed?)

EDIT: "workarounds" = attributes or static classes with consts :)


The purpose of an Enum is to give meaningful names to integers. You're looking for something other than an Enum. Enums are compatible with older Windows APIs and COM, and they have a long history on other platforms besides.

Maybe you'd be satisfied with public const members of a struct or a class.

Or maybe you're trying to restrict some specialized type's values to only certain string values? But how a value is stored and how it's displayed can be two different things - why use more space than necessary to store a value?

And if you want to have something like that readable in some persisted format, just make a utility or Extension method to spit it out.

This response is a little messy because there are just so many reasons. Comparing two strings for validity is much more expensive than comparing two integers. Comparing literal strings to known enums for static type-checking would be kinda unreasonable. Localization would be ... weird. Compatibility with those older APIs would be broken. Enums as flags would be meaningless/broken.

It's an Enum. That's what Enums do! They're integral!


Perhaps use the Description attribute from System.ComponentModel and write a helper function to retrieve the associated string from an enum value? (I've seen this in a codebase I work with, and it seemed like a perfectly reasonable alternative.)

enum ErrorMessage 
{ 
     [Description("Could not find")]
     NotFound,
     [Description("Malformed request")]
     BadRequest
}
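
The helper function mentioned above could look roughly like the following. This is only a sketch: the name GetDescription and the fallback to the member name are assumptions of mine, not something from the codebase referred to.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

enum ErrorMessage
{
    [Description("Could not find")]
    NotFound,
    [Description("Malformed request")]
    BadRequest
}

static class EnumDescriptions
{
    // Hypothetical helper: returns the [Description] text of an enum member,
    // or the member's name when no attribute (or no named member) exists.
    public static string GetDescription(this Enum value)
    {
        string name = value.ToString();
        FieldInfo field = value.GetType().GetField(name);
        if (field == null)
            return name; // e.g. an arbitrary integer cast to the enum type

        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute != null ? attribute.Description : name;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(ErrorMessage.NotFound.GetDescription());   // Could not find
        Console.WriteLine(ErrorMessage.BadRequest.GetDescription()); // Malformed request
        Console.WriteLine(ErrorMessage.NotFound);                    // NotFound
    }
}
```

The enum itself stays a plain integral enum; the reflection lookup happens on every call, so caching the result may be worthwhile on a hot path.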


What are the advantages, because I can only see drawbacks:

  • ToString will return a different string to the name of the enumeration. That is, ErrorMessage.NotFound.ToString() will be "Could not find" instead of "NotFound".
  • Conversely, with Enum.Parse, what would it do? Would it still accept the string name of the enumeration as it does for integer enumerations, or does it work with the string value?
  • You would not be able to implement [Flags], because what would ErrorMessage.NotFound | ErrorMessage.BadRequest equal in your example? (I know that it doesn't really make sense in this particular case, and I suppose you could just say that [Flags] is not allowed on string-based enumerations, but that still seems like a drawback to me.)
  • While the comparison errMsg == ErrorMessage.NotFound could be implemented as a simple reference comparison, errMsg == "Could not find" would need to be implemented as a string comparison.

I can't think of any benefits, especially since it's so easy to build up your own dictionary mapping enumeration values to "custom" strings.
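
As a rough sketch of the dictionary approach (the class name ErrorMessageText and the method ToText are made up for illustration), note that ToString and Enum.Parse keep working on the member names:

```csharp
using System;
using System.Collections.Generic;

enum ErrorMessage { NotFound, BadRequest }

static class ErrorMessageText
{
    // Maps each enum value to its "custom" string.
    private static readonly Dictionary<ErrorMessage, string> Texts =
        new Dictionary<ErrorMessage, string>
        {
            { ErrorMessage.NotFound, "Could not find" },
            { ErrorMessage.BadRequest, "Malformed request" }
        };

    // Falls back to the member name when no custom string is registered.
    public static string ToText(ErrorMessage value)
    {
        string text;
        return Texts.TryGetValue(value, out text) ? text : value.ToString();
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(ErrorMessageText.ToText(ErrorMessage.NotFound)); // Could not find
        Console.WriteLine(ErrorMessage.NotFound.ToString());               // NotFound

        // Parsing still uses the member name, not the custom string.
        var parsed = (ErrorMessage)Enum.Parse(typeof(ErrorMessage), "BadRequest");
        Console.WriteLine(parsed == ErrorMessage.BadRequest);              // True
    }
}
```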


The real answer why: There's never been a compelling reason to make enums any more complicated than they are. If you need a simple closed list of values - they're it.

In .NET, enums were given the added benefit of a two-way mapping between the internal representation and the name used to define each value. This one little change adds some versioning downsides, but improves upon enums in C++.

The enum keyword is used to declare an enumeration, a distinct type that consists of a set of named constants called the enumerator list.

Ref: msdn

Your question is about the chosen storage mechanism, an integer. This is just an implementation detail. We only get to peek beneath the covers of this simple type in order to maintain binary compatibility. Enums would otherwise have very limited usefulness.

Q: So why do enums use integer storage? As others have pointed out:

  1. Integers are quick and easy to compare.
  2. Integers are quick and easy to combine (bitwise for [Flags] style enums)
  3. With integers, it's trivially easy to implement enums.

* None of these are specific to .NET, and it appears the CLR designers didn't feel compelled to change anything or add any gold plating to them.
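
To illustrate points 1 and 2, here is a small sketch of a [Flags] enum (the Permissions type is invented for the example). Combining and testing values comes down to plain integer bit operations:

```csharp
using System;

[Flags]
enum Permissions
{
    None = 0,
    Read = 1,
    Write = 2,
    Execute = 4
}

static class Program
{
    static void Main()
    {
        // Combine two values with a bitwise OR on the underlying integers.
        Permissions p = Permissions.Read | Permissions.Write;

        Console.WriteLine(p);                               // Read, Write
        Console.WriteLine((int)p);                          // 3
        Console.WriteLine((p & Permissions.Write) != 0);    // True
        Console.WriteLine((p & Permissions.Execute) != 0);  // False
    }
}
```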

Now, that's not to say your syntax is entirely unappealing. But is the effort to implement this feature in the CLR, and all the compilers, justified? For all the work that goes into this, has it really bought you anything you couldn't already achieve (with classes)? My gut feeling is no, there's no real benefit. (There's a post by Eric Lippert I wanted to link to, but I couldn't find it.)

You can write 10 lines of code to implement in user-space what you're trying to achieve without all the headache of changing a compiler. Your user-space code is easily maintained over time - although perhaps not quite as pretty as if it's built-in, but at the end of the day it's the same thing. You can even get fancy with a T4 code generation template if you need to maintain many of your custom enum-esque values in your project.

So, enums are as complicated as they need to be.



Not really answering your question but presenting alternatives to string enums.

public struct ErrorMessage  
{  
     public const string NotFound="Could not find";
     public const string BadRequest="Malformed request";
} 


Perhaps because then this wouldn't make sense:

enum ErrorMessage: string
{
    NotFound,
    BadRequest
}


It's a language decision - e.g., Java's enum doesn't directly correspond to an int, but is instead an actual class. There are a lot of nice tricks that an int enum gives you - you can combine them bitwise for flags, iterate them (by adding or subtracting 1), etc. But there are some downsides to it as well - the lack of additional metadata, the ability to cast any int to an invalid value, etc.
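
A short sketch of both sides of that trade-off (the ErrorCode enum is invented for the example): stepping through values by adding 1 works, but so does casting an arbitrary int that matches no member.

```csharp
using System;

enum ErrorCode { NotFound, BadRequest }

static class Program
{
    static void Main()
    {
        // The "trick": integer arithmetic moves to the next member.
        ErrorCode next = ErrorCode.NotFound + 1;
        Console.WriteLine(next);                                      // BadRequest

        // The downside: any int can be cast, even one with no member.
        ErrorCode bogus = (ErrorCode)42;
        Console.WriteLine(bogus);                                     // 42
        Console.WriteLine(Enum.IsDefined(typeof(ErrorCode), bogus));  // False
    }
}
```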

I think the decision was probably made, as with most design decisions, because int enums are "good enough". If you need something more complex, a class is cheap and easy enough to build.

Static readonly members give you the effect of complex enums, but don't incur the overhead unless you need it.

 // A sealed class with a private constructor: the only instances that can
 // ever exist are the static readonly members declared below.
 sealed class ErrorMessage {
     public string Description { get; private set; }
     public int Ordinal { get; private set; }
     private ErrorMessage() { }

     public static readonly ErrorMessage NotFound = new ErrorMessage() { 
         Ordinal = 0, Description = "Could not find" 
     };
     public static readonly ErrorMessage BadRequest = new ErrorMessage() { 
         Ordinal = 1, Description = "Malformed Request" 
     };
 }


Strictly speaking, the intrinsic representation of an enum shouldn't matter, because by definition, they are enumerated types. What this means is that

public enum PrimaryColor { Red, Blue, Yellow }

represents a set of values.

Firstly, some sets are smaller, whereas other sets are larger. Therefore, the .NET CLR allows one to base an enum on an integral type of one's choosing, so that the domain size for enumerated values can be increased or decreased, i.e., if an enum were based on a byte, then that enum could not contain more than 256 distinct values, whereas one based on a long can contain 2^64 distinct values. This is enabled by the fact that a long is 8 times larger than a byte.
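
As an illustration (the Small and Large enums are invented for the example), the underlying type is declared after a colon and can be inspected at run time with Enum.GetUnderlyingType:

```csharp
using System;

// One byte of storage: values 0..255.
enum Small : byte { Min = 0, Max = 255 }

// Eight bytes of storage: the full range of a long.
enum Large : long { Min = long.MinValue, Max = long.MaxValue }

static class Program
{
    static void Main()
    {
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Small)));  // System.Byte
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Large)));  // System.Int64
        Console.WriteLine((byte)Small.Max);                        // 255
        Console.WriteLine((long)Large.Max);                        // 9223372036854775807
    }
}
```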

Secondly, an added benefit of restricting the base type of enums to integral values is that one can perform bitwise operations on enum values, as well as create bitmaps of them to represent more than one value.

Finally, integral types are the most efficient data types available inside a computer, so there is a performance advantage when it comes to comparing different enum values.

For the most part, I would say representing enums by integral types seems to be a CLR and/or CLS design choice, though one that is probably not very difficult to arrive at.


The main advantage of integral enums is that they don't take up much space in memory. An instance of a default System.Int32-backed enum takes up just 4 bytes of memory and can be compared quickly to other instances of that enum.

In contrast, string-backed enums would be reference types that require each instance to be allocated on the heap, and comparisons would involve checking each character in a string. You could probably minimize some of the issues with some creativity in the runtime and with compilers, but you'd still run into similar problems when trying to store the enum efficiently in a database or other external store.


While it also counts as an "alternative", you can still do better than just a bunch of consts:

struct ErrorMessage
{
    public static readonly ErrorMessage NotFound =
        new ErrorMessage("Could not find");
    public static readonly ErrorMessage BadRequest =
        new ErrorMessage("Bad request");

    private string s;

    private ErrorMessage(string s)
    {
        this.s = s;
    }

    public static explicit operator ErrorMessage(string s)
    {
        return new ErrorMessage(s);
    }

    public static explicit operator string(ErrorMessage em)
    {
        return em.s;
    }
}

The only catch here is that, like any value type, this one has a default value, which will have s==null. But this isn't really different from Java enums, which themselves can be null (being reference types).
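
A trimmed-down sketch of the struct above (one member only, to keep it short) shows that catch in action:

```csharp
using System;

struct ErrorMessage
{
    public static readonly ErrorMessage NotFound =
        new ErrorMessage("Could not find");

    private readonly string s;

    private ErrorMessage(string s) { this.s = s; }

    public static explicit operator ErrorMessage(string s)
    {
        return new ErrorMessage(s);
    }

    public static explicit operator string(ErrorMessage em)
    {
        return em.s;
    }
}

static class Program
{
    static void Main()
    {
        // The default value bypasses the constructor, so s is null.
        ErrorMessage unset = default(ErrorMessage);
        Console.WriteLine((string)unset == null);          // True

        Console.WriteLine((string)ErrorMessage.NotFound);  // Could not find
    }
}
```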

In general, Java-like advanced enums cross the line between actual enums, and syntactic sugar for a sealed class hierarchy. Whether such sugar is a good idea or not is arguable.
