Hierarchical data in ETL

I'm new to ETL tools, but what I found while evaluating them is that they all work with a flat row model. That is, if your data requires some object-graph transformation (e.g. checking parent fields or other dependencies), it is very inconvenient (it is solvable with denormalization etc., by mapping to a simpler RDB model). I want to ask whether I have understood this correctly, and why ETL tools avoid working with object-oriented models, which are more understandable to the business. Are there ETL tools that support document-oriented or OOP-style transformations?


I'm not sure if I fully understand the question, but some thoughts to consider:

  • Most of the ETL paradigms come from the data integration and decision support world, namely from data warehouse design and implementation. This world is traditionally relational database-oriented, with most of the data sources existing as database tables or CSV files. This might be a reason for the "flat row model".
  • The simple data model is useful for high-throughput performance and is not overly restrictive in most cases: ETL tools are used for heavily data-intensive tasks.
  • Most of the tools I know assume that source records are processed independently of each other; they don't affect each other. However, this is not always the case, as some tools allow, for example, aggregating the data (e.g. the Informatica Aggregator transformation), in which case the data model is not so flat any more.
  • Other examples that widen the flat model include checking foreign key dependencies ("parent fields"), use of dictionary tables (or even web services), definition of external classes doing arbitrary operations ("OOP"), etc. However, the ETL data model always stays at a lower level of abstraction.
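
To make the points above concrete, here is a minimal Python sketch of the denormalization approach mentioned in the question: nested records are flattened into one row per child, with the parent fields copied onto each row so that a row-level rule can "check parent fields". The `orders` data, the field names, and the `is_bulk_export` rule are all invented for illustration; this is not how any particular ETL tool implements it.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical nested input; structure and field names are assumptions.
orders: List[Dict[str, Any]] = [
    {
        "order_id": 1,
        "customer": {"id": 10, "country": "DE"},
        "lines": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}],
    },
    {
        "order_id": 2,
        "customer": {"id": 11, "country": "US"},
        "lines": [{"sku": "A", "qty": 5}],
    },
]


def flatten_orders(docs: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per order line, copying parent fields onto each row."""
    for doc in docs:
        for line in doc["lines"]:
            yield {
                "order_id": doc["order_id"],
                "customer_id": doc["customer"]["id"],
                "customer_country": doc["customer"]["country"],
                "sku": line["sku"],
                "qty": line["qty"],
            }


def is_bulk_export(row: Dict[str, Any]) -> bool:
    """Row-level rule using a parent field (country) and a child field (qty)."""
    return row["customer_country"] != "DE" and row["qty"] >= 5


rows = list(flatten_orders(orders))

# Each row is now independent, so the rule needs no access to the object graph.
flagged = [r for r in rows if is_bulk_export(r)]

# An aggregation step (similar in spirit to an aggregator element) groups the
# flat rows back by their parent key.
total_qty_per_order: Dict[int, int] = defaultdict(int)
for r in rows:
    total_qty_per_order[r["order_id"]] += r["qty"]
```

The trade-off is the one described in the list: after flattening, every row can be processed on its own, which suits high-throughput pipelines, but the parent data is duplicated on each child row and the hierarchy has to be rebuilt by grouping when it is needed again.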


Altova MapForce can work with hierarchical data.
