Why does string not implement IList<char>?
The title says it all.
Why does String implement IEnumerable<char> and not IList<char>?
A string has a length and you can already take elements from a specific index.
And its immutability could be indicated with ICollection<char>.IsReadOnly.
So what could be wrong with it? Am I missing something?
As many answers point out, there is no interface for read-only lists/collections alone. But as ReadOnlyCollection<T> shows, I think it's definitely possible, and the IsReadOnly property is designed for such cases:
    public class String : IList<char>
    {
        int IList<char>.IndexOf(char item)
        {
            // ...
        }
        void IList<char>.Insert(int index, char item)
        {
            throw new NotSupportedException();
        }
        void IList<char>.RemoveAt(int index)
        {
            throw new NotSupportedException();
        }
        char IList<char>.this[int index]
        {
            // IList<char> requires a getter too; it can delegate to the public indexer.
            get { return this[index]; }
            set { throw new NotSupportedException(); }
        }
        void ICollection<char>.Add(char item)
        {
            throw new NotSupportedException();
        }
        void ICollection<char>.Clear()
        {
            throw new NotSupportedException();
        }
        public bool Contains(char item)
        {
            // ...
        }
        public void CopyTo(char[] array, int arrayIndex)
        {
            // ...
        }
        int ICollection<char>.Count
        {
            get { return this.Length; }
        }
        public bool IsReadOnly
        {
            get { return true; }
        }
        bool ICollection<char>.Remove(char item)
        {
            throw new NotSupportedException();
        }
        // ...
    }
Mainly because IList inherits from ICollection, and strings do not support the mutable behaviors ICollection promises (add, remove, and clear). When you add or remove characters from a string, it creates a new string.
Implementing part of an interface is bad design. You might do it sometimes out of necessity, but it's never a good choice. It's better to split the interface. (The .NET collection interfaces should probably have been split to segregate the mutable members from the diagnostic ones.) Meta-functions like IsReadOnly make the interface less coherent and harder to use.
So even though a string is in many respects a list (you can index a string and find the count of characters in it), it's pretty rare that someone is actually going to want to treat a string like an IList<char>, especially in a post-.NET-3.5 world where simple transformations make it easy to get this information from an IEnumerable<char>.
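As a rough sketch of the transformations meant here, assuming they are the LINQ extension methods from System.Linq (the variable names are only illustrative), a string viewed as IEnumerable<char> can be counted, indexed, and copied into a real list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A string can be used wherever an IEnumerable<char> is expected.
IEnumerable<char> chars = "hello world";

int count = chars.Count();         // number of char values in the sequence
char fifth = chars.ElementAt(4);   // element at zero-based index 4
List<char> copy = chars.ToList();  // an independent, mutable IList<char>
copy.Add('!');                     // changes the copy only, never the string

Console.WriteLine(count);                      // 11
Console.WriteLine(fifth);                      // o
Console.WriteLine(new string(copy.ToArray())); // hello world!
```

When the value is known to be a string, Length and the indexer are the direct way to get the same information; the LINQ calls matter when all you hold is the interface.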
The fact that ReadOnlyCollection supports IList is, in my opinion, highly "meh".
IList is a contract that indicates that implementers support collection mutation (e.g. Add, Remove, Insert), which on an immutable object is clearly wrong. Carrying this over to System.String is just plain wrong, as it is also immutable by design.
System.String implementing IList would be a terrible API design, as a bunch of the methods would not work, so strings would not behave in the same way as types that fully implement IList.
Perhaps you are hoping that it supported a more liberal interface. Sure, but IList is not the right choice.
Partial interface implementation like this breaks the Liskov substitution principle, and introduces potential runtime bugs.
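To make the runtime-bug risk concrete, here is a minimal sketch using ReadOnlyCollection<char> (which does implement IList<char>) as a stand-in for what a string would look like behind that interface. The variable names are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// The read-only wrapper satisfies the IList<char> type, so all of this compiles.
IList<char> list = new ReadOnlyCollection<char>("hello".ToCharArray());

bool addThrew = false;
try
{
    list.Add('!'); // accepted by the compiler, rejected only at run time
}
catch (NotSupportedException)
{
    addThrew = true;
}

Console.WriteLine(list.IsReadOnly); // True
Console.WriteLine(addThrew);        // True
Console.WriteLine(list[1]);         // e
```

Any code that accepts an IList<char> and calls Add without first checking IsReadOnly would fail the same way, which is the substitution problem described above.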
Update
Interestingly enough, .NET 4.5 introduces the new IReadOnlyList<T> interface. However, String does not implement it, and it could not be introduced into the IList hierarchy.
Some background: http://www.infoq.com/news/2011/10/ReadOnly-WInRT.
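If you do want a string behind IReadOnlyList<char>, a thin adapter is enough. The StringReadOnlyList class below is hypothetical (it is not part of the BCL) and is only a sketch of the idea; it wraps the string without copying its characters:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

IReadOnlyList<char> view = new StringReadOnlyList("hello");
Console.WriteLine(view.Count); // 5
Console.WriteLine(view[1]);    // e

// Hypothetical adapter, not a framework type: exposes a string as
// IReadOnlyList<char> by forwarding to the string's own members.
sealed class StringReadOnlyList : IReadOnlyList<char>
{
    private readonly string _value;

    public StringReadOnlyList(string value)
    {
        _value = value ?? throw new ArgumentNullException(nameof(value));
    }

    public int Count => _value.Length;

    public char this[int index] => _value[index];

    public IEnumerator<char> GetEnumerator() =>
        ((IEnumerable<char>)_value).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the interface has no Add, Remove, or setter, nothing in it can throw NotSupportedException, which is the difference from the IList<char> approach.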
Why should it? I think there aren't many cases where that would be useful. Most of the time, when you have a string, you know it's a string. And when you don't, IEnumerable<char> is often good enough.
What's more, IList<T> (and ICollection<T>) are interfaces for mutable collections, which string isn't. (Although ReadOnlyCollection<T> does implement them.)
Mainly because it isn't an IList: you can't call "hello world".Add('!'). This is an interface and contract incompatibility. A string is not just a read-only list, because a read-only list would still "know" the Add operation and throw on invocation.
Also, strings have special semantics: storage optimization and identity aliasing come to mind (there can be small-string optimizations, there can be interned strings). These wouldn't stand out once you pass a string as IList<char>, and people might start to expect "normal" List<>-like semantics.
When seeing IEnumerable<char>, however, no such expectations are raised (it is just saying: I am able to give you a number of characters in succession, you don't need to know where they come from).
I am not sure why the alternative below has not been brought up yet...
System.String does support this (albeit indirectly). Yes, it makes another copy in memory, but I believe it is designed that way because strings are immutable.
string myString = "hello world";
IList<char> myIList = myString.ToCharArray();
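A small sketch of what "another copy" means in practice: the array returned by ToCharArray is independent of the string, so writing through the IList<char> changes the copy only (the output comments assume the snippet runs exactly as written):

```csharp
using System;
using System.Collections.Generic;

string myString = "hello world";
IList<char> myIList = myString.ToCharArray();

// The array is a separate, writable copy of the characters.
myIList[0] = 'j';

Console.WriteLine(myString);                    // hello world
Console.WriteLine(new string((char[])myIList)); // jello world
```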
The fact that String does not implement IList<char> doesn't surprise me. In fact, what does surprise me somewhat is that it implements IEnumerable<char>.
Why? Because a string is not really a sequence of chars.
You see, there are 1,114,112 code points in Unicode, but a char is only 16 bits. A string contains a sequence of characters (i.e. Unicode code points) encoded in UTF-16 as a number of char values. As a result, the number of Unicode characters in a string may be less than the number of char values in the string.
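A short illustration of that mismatch, using U+1F600 as an arbitrary example of a code point that does not fit in one 16-bit char. The loop counts code points by treating each surrogate pair as one unit:

```csharp
using System;

// 'a' followed by U+1F600, which UTF-16 encodes as two char values.
string s = "a\U0001F600";

int codePoints = 0;
for (int i = 0; i < s.Length; i++)
{
    // A surrogate pair is two consecutive char values encoding one code point.
    if (char.IsSurrogatePair(s, i))
    {
        i++; // skip the low surrogate of the pair
    }
    codePoints++;
}

Console.WriteLine(s.Length);   // 3
Console.WriteLine(codePoints); // 2
```

Enumerating this string as IEnumerable<char> yields three values, two of which are meaningless on their own.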
Now, I do realize that this sounds very weird to most people, especially those who speak English, because they have always been happy with ASCII and assumed that one char equals one character. In many cases, that assumption is even true. That may be the reason why string implements IEnumerable<char>, as some sort of compromise with a legacy, non-Unicode world.
But the truth is that we cannot possibly answer your question. The only ones who can tell you why they designed strings the way they did are the people who were on the BCL team at the time.