开发者

SQL Join vs Separate Query in Code without Join - Performance

I would like to know if there's a really performance gain between those two options :

Option 1 :

  • I do a SQL Query with a join to select all User and their Ranks.

Option 2 :

  • I do one SQL Query to select all User
  • I fetch all user and do another SQL Query to get the Ranks of this User.

In code, option two is easier to realize for me. That's only because the way I design my Persistence layer.

So, I would like to know what's the impact on performance. After what limit I should consider to take Option 1 i开发者_运维百科nstead of Option 2 ?


Generally speaking, the DB server is always faster at joining than application code. Remember you will have to do an extra query with a network round trip for each join. However, if your first result set is small and your indexes are well tuned, this model can work fine.

If you are only doing this to re-use your ORM solution, then you may be fighting a losing battle. I have invariably found that I need read-only datasets that can only be produced with SQL, so I now use ORM for per-object CRUD operations and regular SQL for searches, reports, aggregates etc.


If ranks are static values, consider caching them in your application.

If you need users frequently and ranks only rarely, consider lazy-loading of ranks. (e.g., separate queries, but the second query gets used only occasionally).

Use the join if you're always going to need both sets of data, and they have to be current copies of the database.

Prototype any likely choices, and run performance tests.

EDIT: Further thoughts on your persistence layer, because I'm facing this one myself. Consider adding "persistence-like" classes that handle joins as their basic query, and are read-only. Whether this fits your particular scenario is for you to decide, but a lot of database access for many apps is based on joins, which can be rather large and complex. If you can handle these in a consistent manner with your persistent, updatable objects, it can be a big win for your overall architecture. Conceptually, it's a lot like having a view in the database, and querying the view instead of writing a join, but you're doing it all in code.


It depends upon how many users you anticipate. Option one will definitely be faster, but with a reasonable amount of data, the difference will be negligible.


In 99% situations join will be faster.

However there is one rare situations when it can be slower. If your are doing one to many join on table with large row size and you are hitting network bandwidth limit.

For example there is a blob column in T1 of 1MB size, you are joining T2 which consist 100 rows for each T1 row. The result set would be T1 row count multiple 100.

So if you are querying one T1 row with join it would be 100MB result set, if you fetch T1 row (1MB) and then do separate select to fetch 100 T2 for this T1 the result set will be 1MB.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜