
Indexers in List vs Array

How are the indexers defined in List and arrays?

List<MyStruct> lists = new List<MyStruct>(); where MyStruct is a struct. Now consider MyStruct[] arr = new MyStruct[10];

arr[0] gives a reference to the first struct item, but lists[0] gives me a copy of it. Is there any reason why it is done like that? Also, since Int32 is a struct, with List<Int32> list1 = new List<Int32>(); how is it possible for me to access list1[0] or assign list1[0] = 5, whereas it is not possible to do lists[0]._x = 5?


Although they look the same, the array indexer and list indexer are doing completely separate things.

The List<T> indexer is declared as a property with a parameter:

public T this[int index] { get; set; }

This gets compiled to get_Item and set_Item methods that are called like any other method when the indexer is accessed.

The array indexer has direct support within the CLR; there is a specific IL instruction ldelema (load element address) for getting a managed pointer to the n'th element of an array. This pointer can then be used by any of the other IL instructions that take a pointer to directly alter the thing at that address.

For example, the stfld (store field value) instruction can take a managed pointer specifying the 'this' instance to store the field in, or you can use the pointer to call methods directly on the thing in the array.

In C# parlance, the array indexer returns a variable, but the list indexer returns a value.
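To make the variable-versus-value distinction concrete, here is a minimal sketch. The struct Point and the class IndexerDemo are hypothetical names made up for illustration; only List<T> and arrays come from the question.

```csharp
using System.Collections.Generic;

// Hypothetical mutable struct, used only for illustration.
public struct Point
{
    public int X;
}

public static class IndexerDemo
{
    // Array element access yields a variable, so the field is changed in place.
    public static int MutateArrayElement()
    {
        var arr = new Point[1];
        arr[0].X = 5;
        return arr[0].X;
    }

    // The list indexer yields a value (a copy), so changing the copy
    // leaves the element stored in the list untouched.
    public static int MutateListCopy()
    {
        var list = new List<Point> { new Point() };
        Point copy = list[0];
        copy.X = 5;
        // list[0].X = 5;   // does not compile: error CS1612
        return list[0].X;
    }
}
```

Calling MutateArrayElement() should return 5, while MutateListCopy() should return 0, because only the copy was changed.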


The final point:

lists[0]._x=5

is actually just a restatement of your earlier point:

arr[0] gives a reference to the first Structure item. But lists[0] gives me a copy of it.

If you edited a copy of it, the change would be lost into the ether, i.e.

var throwawayCopy = lists[0];
throwawayCopy._x = 5;
// (and never refer to throwawayCopy again, ever)

Since that is almost certainly not what you intended, the compiler doesn't let you. More fundamentally, mutable structs are evil. A better option here would be to not use mutable structs at all. They bite.
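If you are stuck with a mutable struct in a List<T>, the usual workaround is to read the element, change the copy, and assign the whole value back through the indexer's setter. That is also why list1[0] = 5 works for a List<Int32>: it replaces the entire element via set_Item rather than modifying part of a returned copy. A rough sketch, where Sample and WriteBackDemo are hypothetical names:

```csharp
using System.Collections.Generic;

// Hypothetical mutable struct, used only for illustration.
public struct Sample
{
    public int X;
}

public static class WriteBackDemo
{
    public static int UpdateFirst()
    {
        var list = new List<Sample> { new Sample { X = 1 } };

        Sample item = list[0];  // get_Item returns a copy
        item.X = 5;             // modify the copy
        list[0] = item;         // set_Item stores the whole value back

        return list[0].X;
    }
}
```

UpdateFirst() should return 5, since the modified copy was written back into the list.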


Taking this down a level, to a simple but concrete example:

using System;
struct Evil
{
    public int Yeuch;
}
public class Program
{
    public static void Main()
    {
        var pain = new Evil[] { new Evil { Yeuch = 2 } };
        pain[0].Yeuch = 27;
        Console.WriteLine(pain[0].Yeuch);
    }
}

This compiles (looking at the last 2 statements of Main here) as:

L_0026: ldloc.0 <== pain
L_0027: ldc.i4.0 <== 0
L_0028: ldelema Evil <== get the MANAGED POINTER to the 0th element
                           (not the 0th element as a value)
L_002d: ldc.i4.s 0x1b <== 27
L_002f: stfld int32 Evil::Yeuch <== store field

L_0034: ldloc.0 <== pain
L_0035: ldc.i4.0 <== 0
L_0036: ldelema Evil <== get the MANAGED POINTER to the 0th element
                           (not the 0th element as a value)
L_003b: ldfld int32 Evil::Yeuch <== read field
L_0040: call void [mscorlib]System.Console::WriteLine(int32) <== writeline
L_0045: ret 

It never actually talks to the struct as a value: no copies, etc.


List<T> has a normal indexer which behaves like a property. The access goes through accessor functions, and those are by-value.

// Simplified sketch; arr stands for the list's internal backing array.
public T this[int index]
{
    get { return arr[index]; }
    set { arr[index] = value; }
}

Arrays are special types, and their indexer is field-like. Both the runtime and the C# compiler have special knowledge of arrays, and that enables this behavior. You can't have the array-like behavior on custom types.
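That last statement held when this answer was written. Since C# 7.0 the language has ref returns, and an indexer on a custom type can return a reference to its backing storage. The sketch below assumes C# 7.0 or later; Cell, RefList and RefListDemo are hypothetical names, and List<T> itself still returns a copy from its indexer.

```csharp
// Hypothetical mutable struct, used only for illustration.
public struct Cell
{
    public int Value;
}

// Hypothetical fixed-size collection with a ref-returning indexer.
public sealed class RefList
{
    private readonly Cell[] items;

    public RefList(int capacity)
    {
        items = new Cell[capacity];
    }

    // Returns a reference to the stored element rather than a copy.
    public ref Cell this[int index] => ref items[index];
}

public static class RefListDemo
{
    public static int Run()
    {
        var list = new RefList(4);
        list[0].Value = 27;     // writes through the returned reference
        return list[0].Value;
    }
}
```

RefListDemo.Run() should return 27. The trade-off is that a ref-returning indexer cannot also have a setter, and callers get direct access to the internal storage.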

Fortunately this is rarely a problem in practice, since you only use mutable structs in rare special cases (high performance or native interop), and in those you usually prefer arrays anyway due to their low overhead.


You get the same behavior with properties vs. fields. You get a kind of reference when using a field, but a copy when you use a property. Thus you can write to members of value-type fields, but not members of value-type properties.
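A small sketch of the field-versus-property case, with hypothetical types Size and Holder:

```csharp
// Hypothetical mutable struct, used only for illustration.
public struct Size
{
    public int Width;
}

public class Holder
{
    public Size AsField;                    // field: accessed as a variable
    public Size AsProperty { get; set; }    // property: getter returns a copy
}

public static class FieldVsPropertyDemo
{
    public static int ViaField()
    {
        var h = new Holder();
        h.AsField.Width = 10;   // allowed: modifies the field in place
        return h.AsField.Width;
    }

    public static int ViaProperty()
    {
        var h = new Holder();
        // h.AsProperty.Width = 10;   // does not compile: error CS1612
        Size s = h.AsProperty;        // take a copy
        s.Width = 10;                 // change the copy
        h.AsProperty = s;             // assign the whole value back
        return h.AsProperty.Width;
    }
}
```

Both methods should return 10, but the property version needs the explicit copy-and-assign-back step.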


I ran into this as well, when I was inspecting lambda expression types. When the lambda is compiled into an expression tree you can inspect the expression type for each node. It turns out that there is a special node type ArrayIndex for the Array indexer:

Expression<Func<string>> expression = () => new string[] { "Test" }[0];
Assert.Equal(ExpressionType.ArrayIndex, expression.Body.NodeType);

Whereas the List indexer is of type Call:

Expression<Func<string>> expression = () => new List<string>() { "Test" }[0];
Assert.Equal(ExpressionType.Call, expression.Body.NodeType);

This is just to illustrate that we can use lambda expressions to reason about the underlying architecture.


Your problem isn't with List<>, it's with structs themselves.

Take this for example:

public struct MyStruct
{
    public int x;
    public int y;
}

void Main()
{
    var someStruct = new MyStruct { x = 5, y = 5 };

    someStruct.x = 3;
}

Here someStruct is a local variable, so the assignment changes x in place. The reason you can't do the same through a list is that the list indexer is a function (as opposed to the array indexer): it returns a copy of the struct, and a change made to that copy has no way of being 'set' back into the list.

Change the keyword struct to class and you'll see it works just fine (with classes the list hands back a reference to the same object rather than a fresh copy).


One unfortunate limitation of .NET languages is that they don't have any concept of a property doing anything other than returning a value, which can then be used however the caller sees fit. It would be very helpful (and if I had a means of petitioning for language features, I'd seek this) if there were a standard compiler-supported means of exposing properties as delegate callers, such that a statement like:

  MyListOfPoints[4].X = 5;

could be translated by the compiler into something like:

  MyListOfPoints.ActOnItem(4, (ref Point it) => it.X = 5);

Such code could be relatively efficient, and not create any GC pressure, if ActOnItem took an extra ref parameter of generic type, and passed it to a delegate which also took a parameter of that type. Doing that would allow the called function to be static, eliminating the need to create closures or delegates for each execution of the enclosing function. If there were a way for ActOnItem to accept a variable number of generic 'ref' parameters, it would even be possible to handle constructs like:

  SwapItems(ref MyListOfPoints[4].X, ref MyListOfPoints[4].Y);

with arbitrary combinations of 'ref' parameters, but even just having the ability to handle the cases where the property was "involved in" the left of an assignment, or a function was called with a single property-ish 'ref' parameter, would be helpful.
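The compiler translation described above does not exist, but the ActOnItem pattern itself can be written by hand. The following is only a sketch of one possible shape: the delegate type ActionByRef, the collection CallThroughList and the struct Pt are all hypothetical names invented for this example.

```csharp
// Hypothetical mutable struct, used only for illustration.
public struct Pt
{
    public int X;
    public int Y;
}

// Hypothetical delegate: receives the stored item by reference plus one extra
// by-ref parameter, so the callback does not need to capture any locals.
public delegate void ActionByRef<T, TParam>(ref T item, ref TParam param);

// Hypothetical collection exposing call-through access to its elements.
public sealed class CallThroughList<T>
{
    private readonly T[] items;

    public CallThroughList(int count)
    {
        items = new T[count];
    }

    public void ActOnItem<TParam>(int index, ActionByRef<T, TParam> proc, ref TParam param)
    {
        proc(ref items[index], ref param);
        // A real collection could raise a change notification here,
        // because it knows the caller has finished with the item.
    }

    // Ordinary by-value read access.
    public T this[int index] => items[index];
}

public static class CallThroughDemo
{
    public static int Run()
    {
        var points = new CallThroughList<Pt>(5);
        int newX = 5;
        // The lambda captures nothing; the value travels in through the ref parameter.
        points.ActOnItem(4, (ref Pt it, ref int x) => it.X = x, ref newX);
        return points[4].X;
    }
}
```

CallThroughDemo.Run() should return 5. Because the lambda captures no variables, no closure object is needed for it, which is the efficiency point made above.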

Note that being able to do things this way would offer an extra benefit beyond the ability to access fields of structs. It would also mean that the object exposing the property would receive notification that the consumer was done with it (since the consumer's delegate would return). Imagine, for example, that one has a control that shows a list of items, each with a string and a color, and one wants to be able to do something like:

  MyControl.Items(5).Color = Color.Red;

An easy statement to read, and the most natural-reading way to change the color of the fifth list item, but trying to make such a statement work would require that the object returned by Items(5) have a link to MyControl, and send it some sort of notification when it changed. Rather complicated. By contrast, if the style of call-through indicated above were supported, such a thing would be much simpler. The ActOnItem(index, proc, param) method would know that once proc had returned, it would have to redraw the item specified by index. Of some importance, if Items(5) were a call-through proc and didn't support any direct read method, one could avoid scenarios like:

  var evil = MyControl.Items(5);
  MyControl.Items.DeleteAt(0);
  // Should the following statement affect the item that used to be fifth,
  // or the one that's fifth now, or should it throw an exception?  How
  // should such semantics be ensured?
  evil.Color = Color.Purple;

The value of MyControl.Items(5) would remain bound to MyControl only for the duration of the call-through involving it. After that, it would simply be a detached value.
