开发者

Complexity Difference Between Where Clause Inside Query and Where Clause Inside Conditional Split In SSIS

I've been wondering, what's the complexity difference between these two cases:

Case 1: An OLE DB Source query, written like this:

开发者_C百科
 select *
 from A
 where A.value > 1

Case 2: An OLE DB Source query, and the where clause is put in Conditional Split

 select * from A

with the Conditional Split after the OLE DB Source query, containing: value > 1.

Performance-wise, does it have any difference? And for more complex query, does it have any significant impact also?


Yes there is a performance difference.

Case 2 will return all data records from table A and store them in the SSIS memory buffer before passing them on to the Conditional Split Component for filtering.

Case 1 immediately returns a smaller data set to the SSIS memory buffer.

For further reading take a look at:

  • Understanding SSIS Data Flow Buffers video
  • Improving Performance of the Data Flow
  • Monitoring Performance of the Data Flow Engine


Main purpose of SSIS is Import/Export with/from heterogeneous source (such as text files, Excel spreadsheets etc). You can make some kind of validation before you loading data to server and it is possible to log "wrong" records somewhere. But if you plan to use SSIS with SQL query only and without any validation or error logging in text files, you should look for another solution (Linked server,OPENROWSET and some more).

But if you are still look at SSIS, you should include all possible logic inside SQL query or data source.

There is article about how to prepare fast extraction using SSIS: http://www.sqllion.com/2009/04/faster-extraction-loading-by-ssis/

Try to avoid type casting or manipulating data types inside the SSIS package as it is an additional overhead for SSIS. Do it prior to the package or do typecasting in the source query of OLE DB Source component.
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜