Data Streaming with Entity Framework

I'm designing an ELT system for a data warehouse and am wondering what the most efficient, yet still (in some sense) safe, way of extracting the data from the source database is.

I need to read a couple of tables from the source database and organize them into POCOs that I can work with effectively. These roughly correspond to the dimensions of my cube. To get the facts into my cube, I need to bulk load huge amounts of data from other tables, apply some (non-trivial) transformations to them, and write the results into a table in the target database.

Although in principle I would only benefit from a small subset of O/RM features, I'm still wondering whether Entity Framework could be an option. My question is therefore whether EF (in its newest version) can handle streaming data. By that I mean that I keep some kind of DataReader open, load a couple of POCOs, transform them, write the results into the second database, dispose of them as soon as I can (I cannot keep them all in memory because it would blow up), and continue reading until I'm done.

I obviously don't need any change tracking for these objects. I want to keep them (at least the second category, the facts) alive only for a short period of time and dispose of them while still in the same transaction. For me, disposing means not only that I get rid of the POCOs, but also that EF keeps no infrastructure for them and does not spend even a single byte of memory on any of those objects anymore.

The advantage I see in using an O/RM is that it could simplify querying and transformation to some extent, but I'm not willing to sacrifice too much performance, and I'm limited in the total amount of memory I can consume. Does it make sense to go for EF, or should I stick with a plain old ADO.NET DataReader?


Use BLToolkit. We do that, and it works very nicely. It has ONLY the small subset of features that is good for ETL. For example, it does not remember which objects it loaded in a transaction.

If you use EF, you are dead. ORMs are NOT for data loads, they are for business objects. A lot of the higher-level features (uniquing, etc.) come with a HUGE price the moment you move 10 million objects ;)
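
Whatever data access library is chosen, the loop described in the question (read a bounded batch, transform it, write it, let it go) looks roughly the same. Here is a minimal sketch of that pattern, written in Python with the standard-library sqlite3 module standing in for both databases rather than in .NET. The table names `facts` and `fact_table`, the column names, the batch size, and the `transform` function are all hypothetical placeholders, not anything from the question.

```python
import sqlite3

BATCH_SIZE = 1000  # hypothetical value; tune it to the memory budget


def transform(row):
    # Stand-in for the non-trivial transformation:
    # (id, amount) -> (id, amount converted to integer cents)
    fact_id, amount = row
    return fact_id, round(amount * 100)


def stream_facts(source, target, batch_size=BATCH_SIZE):
    """Copy rows from source.facts to target.fact_table in bounded batches."""
    cursor = source.execute("SELECT id, amount FROM facts ORDER BY id")
    total = 0
    while True:
        # fetchmany returns at most batch_size rows per call
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(
            "INSERT INTO fact_table (id, amount_cents) VALUES (?, ?)",
            [transform(r) for r in rows],
        )
        total += len(rows)  # the batch goes out of scope on the next iteration
    target.commit()  # one commit, so the whole load is a single transaction
    return total


def demo():
    # Two in-memory databases play the roles of source and target.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, amount REAL)")
    source.executemany(
        "INSERT INTO facts VALUES (?, ?)",
        [(i, i * 0.5) for i in range(1, 2501)],
    )
    source.commit()
    target.execute(
        "CREATE TABLE fact_table (id INTEGER PRIMARY KEY, amount_cents INTEGER)"
    )

    copied = stream_facts(source, target, batch_size=1000)
    stored = target.execute("SELECT COUNT(*) FROM fact_table").fetchone()[0]
    last = target.execute(
        "SELECT id, amount_cents FROM fact_table ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return copied, stored, last


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that nothing tracks the rows once a batch has been written, so the code never holds more than about one batch at a time. That is the property the question asks for, and the one that an ORM's identity map and change tracking work against.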
