Where to store events in a distributed system which uses event sourcing?
Suppose you have multiple systems that are integrated by events, and all of them use event sourcing. Where do you store the events?
In my case I have three systems:
- A website, which is a shop
- A backend for the website to manage customers, products, etc.
- An accounting system
Whenever a domain event happens in one of those systems, the event is published and can be processed by the other systems. All systems use event sourcing.
I am wondering where you would save the events. Of course each system has to store all events that it processed because it is using event sourcing and therefore depends on the events it once processed.
But what about the other events that were not needed and that the system therefore did not subscribe to? I am struggling with the fact that requirements can change, such that a system would have to process events from the past that it did not persist. Where would you get these events from if the system needed to process events that it had not subscribed to when they occurred?
I think this is a big difference from systems that do not use event sourcing. If you have to implement a feature in system A that depends on data which is not available in A but in another system B, and you persist the current state via an ORM tool like NHibernate, you can simply import that data from B into A. Since a system that uses event sourcing depends on events to get to its current state, you have to import all the events that you missed in the past but need now.
For me there are a few different approaches to this problem.
- Each system saves all events that it publishes. This gives you the ability to republish the events if needed or to import them into another system.
- Each system saves all events that happen, even those which do not need to be processed (yet).
- All events from all systems are stored in a central event log. If you need to process an event that happened in the past but that you did not subscribe to, you can import it from there.
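To make the third option concrete, here is a minimal sketch of a central event log that any system can query later for event types it never subscribed to. The `CentralEventLog` class, its `append`/`read` methods, and the `sequence` field are all hypothetical names made up for illustration; a real log would be a durable store, not an in-memory list.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Event:
    sequence: int   # position in the central log, assigned on append
    source: str     # system that published the event, e.g. "website"
    type: str       # e.g. "CustomerMarkedProductAsFavorite"
    payload: dict


class CentralEventLog:
    """Hypothetical append-only log shared by all systems (in-memory here)."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, source: str, type: str, payload: dict) -> Event:
        event = Event(len(self._events) + 1, source, type, payload)
        self._events.append(event)
        return event

    def read(self, types: set[str], after: int = 0) -> Iterator[Event]:
        """Yield stored events of the given types with sequence > after."""
        for event in self._events:
            if event.sequence > after and event.type in types:
                yield event


log = CentralEventLog()
log.append("website", "CustomerRegistered", {"customer": "c1"})
log.append("website", "CustomerMarkedProductAsFavorite", {"customer": "c1", "product": "p7"})
log.append("website", "CustomerPurchasedProduct", {"customer": "c1", "product": "p7"})

# A system that never subscribed to favorites can still import them later.
missed = list(log.read({"CustomerMarkedProductAsFavorite"}))
```

The `after` parameter is an assumption as well: it lets a consumer remember the last sequence number it imported and continue from there.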
How do you handle such a situation? Where do you save your events?
Edit
Thanks Roy Dictus for your answer. I'm still not sure how to handle the following situation:
The website publishes the events CustomerRegistered, CustomerPurchasedProduct and CustomerMarkedProductAsFavorite. In the current version of the backend, customers and their purchases have to be displayed. What a customer marked as a favorite is not of interest in that version of the system. Thus the backend only subscribed to CustomerRegistered and CustomerPurchasedProduct.
Now the marketing department also wants the information about the favorite products to be shown on the customer details page. Since the backend didn't subscribe to CustomerMarkedProductAsFavorite this information is not available in the backend. Where do I get that information from?
- Each system stores its own events. Each system is its own CQRS system, or at least its own self-contained service, and therefore is responsible for its own data.
- Each system also publishes its events to a service bus. This service bus determines where it saves these events. Usually it is in a transactional queuing system.
- Each system subscribes to the outside events it consumes. It does not store these incoming events, only its own events that result from them. When it consumes an incoming event, the service bus knows it can delete the event from that service's incoming queue.
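The three points above can be sketched roughly as follows. This is an in-memory illustration only: `ServiceBus`, `Service`, and the event name `CustomerAddedToBackend` are hypothetical, and a real bus would use a transactional queue rather than a `deque`.

```python
from collections import defaultdict, deque


class ServiceBus:
    """Hypothetical in-memory bus with one incoming queue per subscriber."""

    def __init__(self):
        self._subscriptions = defaultdict(set)  # event type -> subscriber names
        self._queues = defaultdict(deque)       # subscriber name -> pending events

    def subscribe(self, subscriber, event_type):
        self._subscriptions[event_type].add(subscriber)

    def publish(self, event_type, payload):
        # Only subscribers of this event type get a copy in their queue.
        for subscriber in self._subscriptions[event_type]:
            self._queues[subscriber].append((event_type, payload))

    def consume(self, subscriber):
        """Hand over the next pending event and drop it from the queue."""
        queue = self._queues[subscriber]
        return queue.popleft() if queue else None


class Service:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.own_events = []  # the service's own event store

    def record(self, event_type, payload):
        # Store the event locally, then publish it. A real system has to
        # make sure both steps succeed together.
        self.own_events.append((event_type, payload))
        self.bus.publish(event_type, payload)


bus = ServiceBus()
website = Service("website", bus)
backend = Service("backend", bus)
bus.subscribe("backend", "CustomerRegistered")

website.record("CustomerRegistered", {"customer": "c1"})
website.record("CustomerMarkedProductAsFavorite", {"customer": "c1", "product": "p7"})

# The backend does not keep the incoming event, only its own resulting event.
incoming = bus.consume("backend")
backend.record("CustomerAddedToBackend", incoming[1])
```

Note that in this sketch the favorite event still exists in the website's own store even though the backend never received it.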
EDIT to accommodate your extra question:
If another application suddenly becomes interested in extra information, it has to add listeners to the events it is now interested in.
Furthermore, all sources of these events can then replay those events. Replay is a powerful feature of event-driven systems that allows for such scenarios. So, the event sources replay only the selected events (say, all CustomerMarkedProductAsFavorite events of the last 6 months). Systems that have already consumed these events should recognize that the replayed events are "old" ones (i.e., ones that they have already processed) and ignore them.
This way, any subsystem that is updated to use extra information from the other subsystems can get that information and get all up-to-date in a single batch operation.
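A minimal sketch of such a replay, assuming every event carries a unique id that consumers remember in order to skip events they have already processed. The id-based deduplication and the `EventSource` and `Consumer` classes are assumptions for illustration, not something a particular service bus prescribes.

```python
class EventSource:
    """Hypothetical publisher that keeps every event it has published."""

    def __init__(self):
        self.stored = []  # list of (event_id, event_type, payload)

    def publish(self, event_type, payload):
        event = (len(self.stored) + 1, event_type, payload)
        self.stored.append(event)
        return event

    def replay(self, event_types):
        """Return the stored events of the selected types, oldest first."""
        return [e for e in self.stored if e[1] in event_types]


class Consumer:
    def __init__(self, handled_types):
        self.handled_types = set(handled_types)
        self.processed_ids = set()  # ids of events already applied
        self.applied = []

    def handle(self, event):
        """Apply the event once; return False if it was ignored."""
        event_id, event_type, _ = event
        if event_type not in self.handled_types or event_id in self.processed_ids:
            return False
        self.processed_ids.add(event_id)
        self.applied.append(event)
        return True


website = EventSource()
backend = Consumer({"CustomerRegistered", "CustomerPurchasedProduct"})

for event_type, payload in [
    ("CustomerRegistered", {"customer": "c1"}),
    ("CustomerPurchasedProduct", {"customer": "c1", "product": "p7"}),
    ("CustomerMarkedProductAsFavorite", {"customer": "c1", "product": "p9"}),
]:
    backend.handle(website.publish(event_type, payload))

# New requirement: the backend adds a listener, and the website replays.
backend.handled_types.add("CustomerMarkedProductAsFavorite")
replayed = website.replay({"CustomerPurchasedProduct", "CustomerMarkedProductAsFavorite"})
results = [backend.handle(event) for event in replayed]
```

Here the replayed purchase is ignored because the backend has already processed it, while the favorite event is applied for the first time.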
WRT your edit: Is it really a requirement to also access the historical CustomerMarkedProductAsFavorite data? Change the backend to subscribe to the new data and then you have it going forward. You can work out how to backfill the missing data as a separate issue if you really need it.
Roy has already outlined one possible architecture that would ensure you have the CustomerMarkedProductAsFavorite data to backfill in the future.