What is the difference between where and join?
What is the difference between
var q_nojoin = from o in one
from t in two
where o.SomeProperty == t.SomeProperty
select new { o, t };
and
var q_join = from o in one
join t in two on o.SomeProperty equals t.SomeProperty
select new { o, t };
They seem to give me the same results.
They give the same result, but the join is very much faster, unless you use LINQ to SQL so that the database can optimise the queries.
I made a test with two arrays containing 5000 items each, and the query with a join was about 450 times faster (!) than the query without a join.
If you use LINQ to SQL, the database will optimise both queries to do the same job, so there is no performance difference in that case. However, an explicit join is considered more readable.
If you are using LINQ against a different data source, there is no optimising layer, so there is a significant difference in how the queries work. The join uses a hash table or similar to quickly look up matching values, while the query without a join compares every item in one collection with every item in the other. The complexity of the join is roughly O(n+m), while the complexity of the query without the join is O(n*m). This means not only that the query without the join is slower, but also that it scales badly: the work grows with the product of the two sizes, so doubling both collections roughly quadruples the running time.
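To make the O(n*m) versus O(n+m) difference concrete, here is a minimal sketch of the two strategies written out by hand. It is not the actual LINQ to Objects implementation; the class and method names (JoinSketch, NestedLoop, HashLookup) are hypothetical, and plain int arrays stand in for objects with a SomeProperty key.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class JoinSketch
{
    // Roughly what the from/from/where query does:
    // every pair is compared, so the work is O(n*m).
    public static List<(int O, int T)> NestedLoop(int[] one, int[] two)
    {
        var result = new List<(int O, int T)>();
        foreach (var o in one)
            foreach (var t in two)
                if (o == t)
                    result.Add((o, t));
        return result;
    }

    // Roughly what a hash-based join does: index one side by key in a
    // single pass, then probe that index once per item on the other side.
    // The work is about O(n+m), plus the number of matches produced.
    public static List<(int O, int T)> HashLookup(int[] one, int[] two)
    {
        var lookup = two.ToLookup(t => t);   // build phase: one pass over 'two'
        var result = new List<(int O, int T)>();
        foreach (var o in one)
            foreach (var t in lookup[o])     // probe phase: only matching items
                result.Add((o, t));
        return result;
    }
}
```

Both methods return the same pairs; only the amount of work needed to find them differs. Indexing a lookup with a key that is not present yields an empty sequence, so no explicit "key exists" check is needed.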
A JOIN is a means for combining fields from two (or more) tables by using values common to each.
A WHERE clause specifies that a SQL (data manipulation language) statement should only affect rows that meet specified criteria (think of a WHERE clause as a FILTER).
In practice, depending on lots of other factors, you can get performance gains by using one over the other. I would imagine (though I have no basis in fact for this) that joins are more sargable than WHERE clauses.
Edit: it turns out I'm totally wrong. There should be no difference in performance between the two types. However, the newer style (using JOIN) is a lot clearer to read (IMO), and Microsoft have said that they won't support the older style (outer join using WHERE) indefinitely.
Actually, in SQL, join-on statements can be written as from-where statements (if you really want to). But SQL also has left join, left outer join and so on, which make it easier to express what we want (of course you can use from-where for those too, but it will make your code look crazy). So we use where if we want to filter the result, and join if there is a relationship between tables.
The first query is saying, in effect, "Do a cross join on these collections (creating essentially an NxM matrix), then take only those that are along the diagonal, and give them to me".
The second query is, in effect, "Create a list of just the combined items where the properties match".
The results are the same, but the process of getting there is a bit different.
SQL databases are generally highly optimized, so when you ask for the first, the server just says "Idiot user....", and substitutes the second.
In non-SQL environments (like Linq-to-Objects), if you ask for the first, that's what it will do, and you'll see a significant performance hit.
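The "cross join, then filter" versus "match on keys" distinction is easier to see in method syntax. The sketch below shows roughly what the two query expressions correspond to; the Item class, its Id property, and the MethodSyntaxSketch wrapper are hypothetical stand-ins, and tuples are used instead of the anonymous type from the question so that the methods can return the result.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for the items in 'one' and 'two'.
public sealed class Item
{
    public int Id { get; set; }
    public int SomeProperty { get; set; }
}

public static class MethodSyntaxSketch
{
    // from o in one from t in two where ... :
    // SelectMany pairs every o with every t, then Where filters the pairs.
    public static IEnumerable<(Item O, Item T)> NoJoin(
        IEnumerable<Item> one, IEnumerable<Item> two) =>
        one.SelectMany(o => two, (o, t) => (O: o, T: t))
           .Where(p => p.O.SomeProperty == p.T.SomeProperty);

    // from o in one join t in two on ... equals ... :
    // Join is given both key selectors, so it can match items by key
    // instead of testing every pair.
    public static IEnumerable<(Item O, Item T)> WithJoin(
        IEnumerable<Item> one, IEnumerable<Item> two) =>
        one.Join(two,
                 o => o.SomeProperty,
                 t => t.SomeProperty,
                 (o, t) => (O: o, T: t));
}
```

In the first form the equality test is an opaque predicate applied after the pairs already exist, so LINQ to Objects has nothing to index on. In the second, the keys are passed to Join separately, which is what allows the lookup-based matching.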