Services for sharing data?
Picture an organization with many departments and systems, many of which must share data. For example, one department/system is responsible for updating customer data, but many departments/systems must be able to read that customer data and relate it to their own. In such a situation, what is your view on using web services to share this core business data across multiple systems?
It seems there are several usage models to consider:
- Application retrieving a single record – No real issues here. There may be a small performance penalty when switching from other mechanisms (e.g. duplicating data across multiple databases and then querying locally) to web services, but that may be acceptable in many cases.
- Application showing a table of data where the data sources for the columns span multiple databases – For example, showing a list of customer names alongside their orders. If both live in one data source then this is easy to develop and performance is quick. If customer names are pulled from the CRM's database and order info is pulled from the Order System's web service, then development is harder because the data has to be merged, and execution gets slow because the middle tier has two (or more) data retrieval operations and must then traverse more than one data set (see the sketch after this list).
- Application searching for data where the search criteria span multiple systems – If the system can't join data elements and specify all of the predicates and sort order in one query, then this becomes very difficult to develop and very slow to execute.
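To make the middle-tier merge concrete, here is a minimal sketch in Python. The endpoints, field names, and response shapes are hypothetical; the point is that the join, the predicates, and the sort order all have to be applied in application code rather than in a single database query:

```python
import requests  # assumed HTTP client; all endpoints below are hypothetical

def fetch_customers():
    """Pull customer records from the CRM system's (hypothetical) web service."""
    resp = requests.get("https://crm.example.internal/api/customers")
    resp.raise_for_status()
    return resp.json()  # assumed shape: [{"customer_id": 1, "name": "Acme Ltd"}, ...]

def fetch_orders():
    """Pull order records from the Order System's (hypothetical) web service."""
    resp = requests.get("https://orders.example.internal/api/orders")
    resp.raise_for_status()
    return resp.json()  # assumed shape: [{"order_id": 7, "customer_id": 1, "total": 99.0}, ...]

def customers_with_orders():
    """Merge the two result sets in the middle tier.

    This is the in-memory join the question describes: two round trips,
    then a traversal of both data sets, doing work that a single SQL JOIN
    would have done inside one database.
    """
    customers = {c["customer_id"]: c for c in fetch_customers()}
    merged = []
    for order in fetch_orders():
        customer = customers.get(order["customer_id"])
        if customer is not None:
            merged.append({
                "name": customer["name"],
                "order_id": order["order_id"],
                "total": order["total"],
            })
    # Predicates and sort order must also be applied here, after the fact.
    merged.sort(key=lambda row: row["name"])
    return merged
```

Every extra system in the join multiplies the round trips and the in-memory work, which is why the second and third usage models degrade so quickly.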
Note that each of these usage models attempts to use web services to retrieve data within one organization/network. Can web services be made to suit each of these usage models?
Sharing data in the three ways you've outlined (via web services) is indeed possible, but there are many other factors that will drive the right approach.
Architectural aspects to consider:
- Data Architecture (business ownership, security, management and so on).
- Data quality - is it at an appropriate level for the consuming applications?
- How will you keep related data (in related systems) in sync?
- The use case: transactional (OLTP) vs. analytical (OLAP).
Things to consider when integrating with another system:
- What's the use case: transactional (OLTP) or analytical (OLAP)?
- How many requests are we talking about, and when do they occur?
- How complex are the queries likely to be that service these requests?
- Exactly how up-to-date does the data need to be?
If your data lives in a single database serving a transactional system, then any queries made against it can affect that system's performance; if you have many systems hitting that database, then you've got a crucial piece of data infrastructure to protect: if it goes down, so does everything else.
Databases built to support transactional systems often aren't suited to analytical workloads. A common approach where I work is to ETL copies of the data out into a repository designed for that purpose. A further advantage is that, because it's a separate system, you won't get performance contention on either side.
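As a rough illustration of that approach (table names and schemas here are hypothetical, and sqlite3 stands in for what would normally be two separate database servers), a minimal ETL step might look like this:

```python
import sqlite3

# Hypothetical databases: an OLTP source (assumed to contain an 'orders'
# table) and an analytical copy. sqlite3 keeps the sketch self-contained.
source = sqlite3.connect("oltp_orders.db")   # transactional system
warehouse = sqlite3.connect("reporting.db")  # analytical repository

warehouse.execute("""
    CREATE TABLE IF NOT EXISTS order_facts (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total       REAL,
        order_date  TEXT
    )
""")

# Extract: read from the transactional schema.
rows = source.execute(
    "SELECT order_id, customer_id, total, order_date FROM orders"
).fetchall()

# Transform: a no-op here; real ETL would cleanse, conform, and enrich.
# Load: rebuild the analytical copy so reporting queries never touch
# the OLTP system.
warehouse.execute("DELETE FROM order_facts")
warehouse.executemany("INSERT INTO order_facts VALUES (?, ?, ?, ?)", rows)
warehouse.commit()
```

Run on a schedule, a job like this trades freshness for isolation, which circles back to the question above of exactly how up-to-date the data needs to be.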
Assuming you do ETL the data out to a new repository, which source is the authoritative one? It may be that several applications ETL data into a central Data Mart or Data Warehouse where that data is integrated, and that becomes the "source of truth".
In closing - working with data (both logically and physically) is a lot like object-oriented programming and design: many of the underlying principles (such as SOLID) readily apply.