Using REST webservices for ETL / Datawarehousing
Has anyone used a REST-based approach for ETL / Datawarehousing operations? In other words, invoking ETL and OLAP / database refresh jobs through REST web service calls:
e.g. PUT http://company.com/cube/123523 (to refresh a specific OLAP cube with new data) or POST http://company.com/view/patients/123123 (to create a new database view for patients)
Seems to me like REST is a very suitable and clean architectural style for modeling such monthly tasks.
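To make the two example calls concrete, here is a minimal sketch that only builds the requests and does not send them. The URLs come from the examples in the question; the helper names `refresh_cube_request` and `create_view_request` are made up for illustration, and the empty request body is an assumption, since the question does not say what the server would expect.

```python
from urllib.request import Request

BASE_URL = "http://company.com"  # host from the question's example URLs


def refresh_cube_request(cube_id: int) -> Request:
    """Build (but do not send) a PUT asking for one OLAP cube to be refreshed."""
    # Empty body is an assumption; a real service may want parameters here.
    return Request(f"{BASE_URL}/cube/{cube_id}", data=b"", method="PUT")


def create_view_request(view_name: str, view_id: int) -> Request:
    """Build (but do not send) a POST asking for a new database view."""
    return Request(f"{BASE_URL}/view/{view_name}/{view_id}", data=b"", method="POST")


if __name__ == "__main__":
    for req in (refresh_cube_request(123523), create_view_request("patients", 123123)):
        print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would actually issue the call; a monthly scheduler could do exactly that, one request per job.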
ETL is all about inserting rows into a database very, very fast (or sometimes, very, very flexibly when the data is a bit dicey and requires automated cleanup).
REST means using all of HTTP: all the verbs, and generally a Unicode way of life.
HTTP as a protocol isn't very fast. It isn't binary (although I suppose you can have a binary payload).
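A rough way to see the text-versus-binary difference is to encode the same rows both ways and compare sizes. This is only a sketch: the sample rows and the `(id, value)` layout are invented, JSON stands in for a typical text payload, and payload size is just one ingredient of speed, not a benchmark.

```python
import json
import struct

# Made-up sample rows: (id, value)
rows = [(i, i * 1.5) for i in range(1000)]

# Text encoding, roughly what a JSON body sent over HTTP would carry.
text_payload = json.dumps([{"id": i, "value": v} for i, v in rows]).encode("utf-8")

# Fixed-width binary encoding: 4-byte int + 8-byte double per row, little-endian.
binary_payload = b"".join(struct.pack("<id", i, v) for i, v in rows)

print(len(text_payload), len(binary_payload))
```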
ETL problems are really looking for solutions that depend on the data source. Does your datasource have a native, binary protocol? Use that, it usually is the fastest.
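As a sketch of the "use the native interface" route, the load step can go straight through the database driver in one batch instead of one HTTP call per row. Here `sqlite3` is only a stand-in for whatever native driver your data source offers, and the `patients` table and the `bulk_load` helper are hypothetical.

```python
import sqlite3


def bulk_load(conn: sqlite3.Connection, rows) -> int:
    """Insert all rows in a single transaction and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patients (id INTEGER PRIMARY KEY, name TEXT)"
    )
    with conn:  # commits once at the end, rolls back on error
        conn.executemany("INSERT INTO patients (id, name) VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(bulk_load(conn, [(i, f"patient-{i}") for i in range(10_000)]))
```

Batching everything into one transaction is the main design choice here; most native drivers have an equivalent bulk path.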
All that said, there are data sources that are locked behind port 80. Things like Microsoft's ADO.NET Data Services (Astoria) are already working out the details of a REST-based data access API. I'd be surprised if it is high performance, but it certainly seems like it would be highly flexible.