LINQ - How do I keep my (complex) results ordered?
I have a LINQ query that is built up piecemeal, like so:
var initialQuery = from item in MyContext where xxx == yyy select item;
var furtherQuery = from item in initialQuery where bla == foo select new { some stuff };
// more code here...
// eventually:
var yetAnotherQuery = (from item in furtherQuery ...)
.OrderBy(my_condition);
// As far as I know, the following query should still maintain the order of the previous one
// see: https://stackoverflow.com/questions/911942/are-subqueries-in-linqtosql-guaranteed-to-be-in-the-same-order-as-their-parent
var stillAnotherQuery = (from item in yetAnotherQuery
select item.data_I_care_about)
.Distinct();
// And finally...
var finalQuery = stillAnotherQuery.Skip(PageIndex).Take(PageSize);
But I am getting an exception when Skip() is called, saying that the query is not ordered!
So apparently what is indicated in my code comment above and the referenced SO question is not entirely true. In fact another SO answer indicates that the preservation of that order is not guaranteed.
Does anyone know a good way to do what I am trying to accomplish?
I considered simply including a ROW_NUMBER in my intermediate results, and ordering by that at the very end, but I cannot find a way to get that ROW_NUMBER in my results via LINQ.
I have seen several other SO questions trying to get the ROW_NUMBER in there, but as far as I have seen, they all do it client-side.
I seem to have painted myself into a corner. Does anyone know a (LINQ-friendly) way out?
UPDATE
Some have suggested that I do the Distinct() before the OrderBy().
I believe that would give me different results.
Imagine I have this table of data
myRank | myData
-------+--------
3 | A
1 | B
2 | A
Suppose I am ordering by myRank, the data I care about is myData, and my original code was like this:
var query = from item in MyTable
select item;
query = query.OrderBy(item => item.myRank);
var derivedQuery = from item in query // Note: we throw away myRank
select item.myData;
derivedQuery = derivedQuery.Distinct();
If I swap the order of the OrderBy() and Distinct(), I will get different results. I do not want myRank to be included in the Distinct().
Sorry, this is part of a much larger process, so it is hard to get all the details into this question.
But hopefully that makes sense?
The problem is not that the elements are no longer ordered.
Rather, the problem is that Distinct() returns an IQueryable<T>, whereas OrderBy() returns an IOrderedQueryable<TSource>, which is (apparently) required for the paging to work with EF.
Swapping the Distinct() and OrderBy() steps should fix things.
Edit
I suggest something roughly like this:
var query = from item in MyTable
select item;
// GroupBy changes the element type, so the result needs its own variable
var grouped = query.GroupBy(item => item.myData, item => item.myRank);
var derivedQuery = grouped.OrderBy(group => group.Min())
.Select(group => group.Key);
Clarifying:
- group by will result in distinct keys (myData) anyway
- sorting by the Min() of myRank per group will produce the same order as sorting (ascending) by myRank globally first, then doing a Distinct()
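To check the idea against the sample table from the question, here is a minimal in-memory sketch. It runs on LINQ to Objects rather than EF, so it only illustrates the resulting order, not the SQL a provider would generate. The RankedDistinct class and its Page helper are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RankedDistinct
{
    // Hypothetical helper: distinct myData values, ordered by the lowest
    // myRank seen for each value, then paged with Skip/Take.
    public static List<string> Page(
        IEnumerable<(int myRank, string myData)> rows, int skip, int take)
    {
        return rows
            .GroupBy(row => row.myData, row => row.myRank) // one group per distinct myData
            .OrderBy(g => g.Min())                         // lowest rank in each group
            .Select(g => g.Key)
            .Skip(skip)
            .Take(take)
            .ToList();
    }

    public static void Main()
    {
        // The sample table from the question.
        var rows = new (int myRank, string myData)[] { (3, "A"), (1, "B"), (2, "A") };

        Console.WriteLine(string.Join(", ", Page(rows, 0, 2))); // prints "B, A"
    }
}
```

Because the OrderBy is the last operator before Skip, the query being paged is an ordered one, which is what the paging step asks for.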
Because of its implementation, Distinct doesn't guarantee that the order is preserved. You can't run Skip on its output because of this, and even if you could, you wouldn't be able to rely on Skip's behavior.
You're going to need to run OrderBy after you run Distinct. I don't see a way to do this with a single call to OrderBy, given what you're trying to do. yetAnotherQuery and stillAnotherQuery have different elements in them, so they're going to need to be sorted separately.
The way I see it, you have two choices. You can either make one trip to the database for yetAnotherQuery, then run Distinct and OrderBy locally for stillAnotherQuery:
var yetAnotherQuery = (from item in furtherQuery ...)
.OrderBy(my_condition);
var stillAnotherQuery = (from item in yetAnotherQuery ...)
.Distinct()
.OrderBy(my_condition);
var finalQuery = stillAnotherQuery.Skip(PageIndex).Take(PageSize);
...or you can make two trips to the database, but do all the sorting and filtering on your database server:
var yaq_UnSorted = (from item in furtherQuery ...);
var yetAnotherQuery = yaq_UnSorted
.OrderBy(my_condition);
var stillAnotherQuery = (from item in yaq_UnSorted ...)
.Distinct()
.OrderBy(my_condition);
var finalQuery = stillAnotherQuery.Skip(PageIndex).Take(PageSize);
Which solution is best for you will depend on your priorities. The first one will reduce the number of roundtrips to your DB and the amount of data transferred, but it will also use more resources on your app server.
The Distinct / OrderBy problem is a pretty common snag to run into. Here's a previous answer of mine on the subject.
Does it work to put the OrderBy method after the Distinct method?
The problem is that Distinct doesn't preserve order.
Reference: http://programminglinq.com/blogs/marcorusso/archive/2008/07/20/use-of-distinct-and-orderby-in-linq.aspx
Update
Now I see the problem. I think you need to try using GroupBy instead of Distinct.
Something like this:
var query = MyTable
.OrderBy(item => item.myRank)
.GroupBy(item => item.myData)
.Select(grouping => grouping.First().myData);
I don't think the problem is that Distinct() changes the order. Rather, it returns a "normal" IQueryable<T> instead of an IOrderedQueryable<T>, which is apparently required by the LINQ to Entities Skip implementation.
One solution could be reversing the order of the OrderBy() and Distinct() methods, so that you do the Distinct() first, then apply the ordering, and then use Skip().
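The type difference can be seen from the declared return types alone. The sketch below uses an in-memory array wrapped with AsQueryable(), so it will not reproduce the EF exception (the in-memory provider accepts Skip either way); it only shows which operator hands back which static type. OrderedTypeDemo and DistinctThenOrdered are hypothetical names for this example.

```csharp
using System;
using System.Linq;

public static class OrderedTypeDemo
{
    public static string[] DistinctThenOrdered(string[] values, int skip, int take)
    {
        IQueryable<string> source = values.AsQueryable();

        // OrderBy is declared to return IOrderedQueryable<T>.
        IOrderedQueryable<string> ordered = source.OrderBy(v => v);

        // Distinct is declared to return plain IQueryable<T>,
        // so the "ordered" part of the static type is gone here.
        IQueryable<string> distinct = ordered.Distinct();

        // Ordering again after Distinct gives back an IOrderedQueryable<T>,
        // and Skip/Take are applied to that.
        IOrderedQueryable<string> reordered = distinct.OrderBy(v => v);
        return reordered.Skip(skip).Take(take).ToArray();
    }

    public static void Main()
    {
        var page = DistinctThenOrdered(new[] { "B", "A", "B", "C" }, 1, 2);
        Console.WriteLine(string.Join(", ", page)); // prints "B, C"
    }
}
```

Note that this orders by the distinct value itself. Ordering by a column that has been projected away, like myRank in the question, still needs something like the grouping approach.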