How to define IEnumerable behavior by contract?
Consider these two methods that return IEnumerable:
private IEnumerable<MyClass> GetYieldResult(int qtResult)
{
for (int i = 0; i < qtResult; i++)
{
count++;
yield return new MyClass() { Id = i+1 };
}
}
private IEnumerable<MyClass> GetNonYieldResult(int qtResult)
{
var result = new List<MyClass>();
for (int i = 0; i < qtResult; i++)
{
count++;
result.Add(new MyClass() { Id = i + 1 });
}
return result;
}
The following tests show two different behaviors when calling methods on the returned IEnumerable:
[TestMethod]
public void Test1()
{
count = 0;
IEnumerable<MyClass> yieldResult = GetYieldResult(1);
var firstGet = yieldResult.First();
var secondGet = yieldResult.First();
Assert.AreEqual(1, firstGet.Id);
Assert.AreEqual(1, secondGet.Id);
Assert.AreEqual(2, count);//calling "First()" 2 times, yieldResult is created 2 times
Assert.AreNotSame(firstGet, secondGet);//and created different instances of each list item
}
[TestMethod]
public void Test2()
{
count = 0;
IEnumerable<MyClass> yieldResult = GetNonYieldResult(1);
var firstGet = yieldResult.First();
var secondGet = yieldResult.First();
Assert.AreEqual(1, firstGet.Id);
Assert.AreEqual(1, secondGet.Id);
Assert.AreEqual(1, count);//as expected, it creates only 1 result set
Assert.AreSame(firstGet, secondGet);//and calling "First()" several times will always return same instance of MyClass
}
It's simple to choose which behavior I want when my code returns an IEnumerable, but how can I explicitly declare that a method takes an IEnumerable parameter that produces a single result set regardless of how many times the method calls "First()" on it?
Of course, I don't want to force all items to be created unnecessarily, and I want to declare the parameter as IEnumerable to signal that no item will be added to or removed from the collection.
EDIT: Just to be clear, the question is not about how yield works or why IEnumerable can return different instances for each call. The question is how can I specify that a parameter should be a "search only" collection that returns same instances of MyClass when I call methods like "First()" or "Take(1)" several times.
Any ideas?
Thanks in advance!
Of course, I don't want to force all items to be created unnecessarily
In which case you need to allow the method to create them on demand, and if objects are created on demand (and without some form of cache) they will be different objects (at least in the sense of being different references—the default definition of equality for non-value objects).
If your objects are inherently unique (i.e. they don't define some value-based equality) then each call to new will create a different object (whatever the constructor parameters).
So the answer to
but how can I explicitly define that some method gets an IEnumerable as parameter that creates a single result set despite of how many times it calls "First()" method.
is "you can't" except by creating one set of objects and repeatedly returning the same set, or by defining equality to be something different.
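One way to read "defining equality to be something different" is value-based equality on Id. The following is a minimal sketch, assuming the question's MyClass carries nothing but an Id; the Equals and GetHashCode overrides are illustrative and not part of the original code.

```csharp
using System;

// Hypothetical variant of the question's MyClass with value-based equality.
public sealed class MyClass : IEquatable<MyClass>
{
    public int Id { get; set; }

    // Two instances are considered equal when their Ids match.
    public bool Equals(MyClass other)
    {
        return !ReferenceEquals(other, null) && other.Id == Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as MyClass);
    }

    // Must agree with Equals: equal Ids give equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}
```

With this in place, two separately created instances with the same Id compare as equal through Equals, although they remain different references, so a reference check such as Assert.AreSame would still fail for the yield-based method.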
Additional (based on comments): if you really want to be able to replay (for want of a better term) the same set of objects without building the whole collection, you could cache what has already been generated and replay that first. Something like:
private static List<MyData> cache = new List<MyData>();
public IEnumerable<MyData> GetData() {
foreach (var d in cache) {
yield return d;
}
var position = cache.Count;
// maxItens is assumed to be the total number of items, defined elsewhere
while (position < maxItens) {
MyData next = MakeNextItem(position);
cache.Add(next);
position++;
yield return next;
}
}
I expect it would be possible to build such a caching wrapper around an iterator as well (while would become foreach over the underlying iterator, but you would need to cache that iterator or Skip to the required position if the caller iterated beyond the caching List).
NB any caching approach would be hard to make thread safe.
I've been trying to find an elegant solution to the problem for a while now. I wish that the framework designers had added a little "IsImmutable" or similar property getter to IEnumerable so that one could easily add an Evaluate (or similar) extension method that doesn't do anything for an IEnumerable that is already in its "fully evaluated" state.
However, since that doesn't exist, here's the best I've been able to come up with:
- I've created my own interface that exposes the immutability property, and I implement it in all of my custom collection types.
- My implementation of the Evaluate extension method is aware of this new interface as well as the immutability of the subset of relevant BCL types that I consume most frequently.
- I avoid returning "raw" BCL collection types from my APIs in order to increase the efficiency of my Evaluate method (at least when running against my own code).
It's rather kludgy, but it's the least intrusive approach I've been able to find so far to address the problem of allowing an IEnumerable consumer to create a local copy only when this is actually necessary. I very much hope that your question lures some more interesting solutions out of the woodwork...
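The approach in the list above might look roughly like this. It is only a sketch under stated assumptions: the interface name IStableEnumerable, its IsImmutable property, and the choice to treat arrays and List<T> as "already evaluated" are all hypothetical and not taken from the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical interface: a sequence that can report whether enumerating
// it again is guaranteed to yield the same items.
public interface IStableEnumerable<T> : IEnumerable<T>
{
    bool IsImmutable { get; }
}

public static class EvaluateExtensions
{
    // Returns the source unchanged when it is known to be materialized,
    // otherwise copies it into a list exactly once.
    public static IEnumerable<T> Evaluate<T>(this IEnumerable<T> source)
    {
        var stable = source as IStableEnumerable<T>;
        if (stable != null && stable.IsImmutable)
            return source;

        // Assumption: arrays and List<T> count as "fully evaluated" here,
        // even though they are not truly immutable.
        if (source is T[] || source is List<T>)
            return source;

        return source.ToList();
    }
}
```

A consumer would call Evaluate once on its parameter and work with the result, which costs nothing extra for an existing list and copies a lazy sequence a single time. Note that this materializes the whole lazy sequence, so it does not help when the items must also be created on demand.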
Unless I'm misreading you, your question may stem from a misunderstanding. IEnumerable is an interface, so what a method actually returns is always some concrete object that implements it. In the first case that object is a compiler-generated iterator, which lets you get instances of MyClass one at a time through foreach. The return value is typed as IEnumerable to indicate that it supports foreach (and a few other operations).
The second function actually returns a List, which of course also implements IEnumerable (and so supports foreach). But it is an actual concrete collection of MyClass objects, created by the method you called (the second one).
The first method doesn't return any MyClass objects at all. It returns that iterator object, which the C# compiler generates behind the scenes and which instantiates a new MyClass object each time you iterate over it.
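The deferred behavior described above can be observed directly. This is a minimal sketch; the DeferredDemo class and its Created counter are hypothetical stand-ins for the question's count field.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the iterator body has actually run.
    public static int Created;

    public static IEnumerable<object> Lazy()
    {
        // Nothing here executes until the sequence is enumerated.
        Created++;
        yield return new object();
    }
}
```

Calling DeferredDemo.Lazy() on its own leaves Created at 0. Each subsequent First() on the returned sequence runs the body again, so two calls bring Created to 2 and hand back two distinct objects.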
EDIT: More detail. A more important distinction is whether you want the items to be statefully held in place for you within the class while you iterate, or whether you want them created for you when you iterate.
Another consideration: are the items you wish returned to you already in existence somewhere else? That is, is this method going to iterate through a set (or filtered subset) of some existing collection, or is it creating the items on the fly? If the latter, does it matter whether the item is the exact same instance each time you "get" it? For objects defined to represent things that could be called an entity, something with a defined identity, you probably want successive fetches to return the same instance.
But maybe another instance with the same state is totally equivalent? (This would be called a value type object, like a telephone number, an address, or a point on the screen. Such objects have no identity except that implied by their state.) In this latter case, it doesn't matter whether the enumerator returns the same instance or a newly created identical copy each time you "get" it. Such objects are generally immutable: they are the same, they stay the same, and they function identically.
You can mix these suggestions: implement a generic wrapper class that takes the IEnumerable and returns a new one that builds a cache on each step and reuses the partial cache on later enumerations. It is not easy, but it will create objects only once and only as needed (in truth this matters only for iterators that construct objects on the fly). The hardest part is knowing when to switch from the partial cache back to the original enumerator, and how to make that transactional (consistent).
Update with tested code:
public interface ICachedEnumerable<T> : IEnumerable<T>
{
}
internal class CachedEnumerable<T> : ICachedEnumerable<T>
{
private readonly List<T> cache = new List<T>();
private readonly IEnumerator<T> source;
private bool sourceIsExhausted = false;
public CachedEnumerable(IEnumerable<T> source)
{
this.source = source.GetEnumerator();
}
public T Get(int where)
{
if (where < 0)
throw new InvalidOperationException();
SyncUntil(where);
return cache[where];
}
private void SyncUntil(int where)
{
lock (cache)
{
while (where >= cache.Count && !sourceIsExhausted)
{
// MoveNext returns false once the source has no more items
if (source.MoveNext())
cache.Add(source.Current);
else
sourceIsExhausted = true;
}
if (where >= cache.Count)
throw new InvalidOperationException();
}
}
public bool GoesBeyond(int where)
{
try
{
SyncUntil(where);
return true;
}
catch (InvalidOperationException)
{
return false;
}
}
public IEnumerator<T> GetEnumerator()
{
return new CachedEnumerator<T>(this);
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return new CachedEnumerator<T>(this);
}
private class CachedEnumerator<T> : IEnumerator<T>, System.Collections.IEnumerator
{
private readonly CachedEnumerable<T> parent;
private int where;
public CachedEnumerator(CachedEnumerable<T> parent)
{
this.parent = parent;
Reset();
}
public object Current
{
get { return Get(); }
}
public bool MoveNext()
{
// check whether the next position exists, not the current one
if (parent.GoesBeyond(where + 1))
{
where++;
return true;
}
return false;
}
public void Reset()
{
where = -1;
}
T IEnumerator<T>.Current
{
get { return Get(); }
}
private T Get()
{
return parent.Get(where);
}
public void Dispose()
{
}
}
}
public static class CachedEnumerableExtensions
{
public static ICachedEnumerable<T> AsCachedEnumerable<T>(this IEnumerable<T> source)
{
return new CachedEnumerable<T>(source);
}
}
With this you can now add a new Test that shows it works:
[Test]
public void Test3()
{
count = 0;
ICachedEnumerable<MyClass> yieldResult = GetYieldResult(1).AsCachedEnumerable();
var firstGet = yieldResult.First();
var secondGet = yieldResult.First();
Assert.AreEqual(1, firstGet.Id);
Assert.AreEqual(1, secondGet.Id);
Assert.AreEqual(1, count);//calling "First()" 2 times, but the source is enumerated only once
Assert.AreSame(firstGet, secondGet);//and both calls return the same cached instance
}
The code will be incorporated into my project http://github.com/monoman/MSBuild.NUnit and may later appear in the Managed.Commons project too.
Then you need to cache the result. A lazily evaluated IEnumerable is re-executed every time you call something that iterates over it. I tend to use:
private List<MyClass> mEnumerable;
public IEnumerable<MyClass> GenerateEnumerable()
{
mEnumerable = mEnumerable ?? CreateEnumerable();
return mEnumerable;
}
private List<MyClass> CreateEnumerable()
{
//Code to generate List Here
}
Alternatively, on the calling side (as in your example) you can add a ToList call at the end. It iterates once and stores the results in a list, and yieldResult can still be typed as IEnumerable without any issue.
[TestMethod]
public void Test1()
{
count = 0;
IEnumerable<MyClass> yieldResult = GetYieldResult(1).ToList();
var firstGet = yieldResult.First();
var secondGet = yieldResult.First();
Assert.AreEqual(1, firstGet.Id);
Assert.AreEqual(1, secondGet.Id);
Assert.AreEqual(1, count);//calling "First()" 2 times, yieldResult is created 1 time
Assert.AreSame(firstGet, secondGet);
}