Converging processes

TL;DR

Completing an order is not an easy process. A payment gateway may not accept payments for some time. Coffee can be spilled all over the book you ordered and a new one needs to be taken from storage. Failures may occur at different moments of the pipeline, and yet your order should still be fulfilled. Is it one process or multiple? If one, does anyone take every single piece into consideration? How do you ensure that eventually a client will get their product shipped?

Process managers and sagas

There are two terms used for these processors/processes: process managers and sagas. I don’t want to go through the differences and the marketing behind using one name or the other. I want you to focus on a process that handles an order and reacts to three events (sketched in code right after the list):

  • PaymentTimedOut – raised when the response for the payment was not delivered before the specified timeout
  • PaymentReceived – the payment was received in time
  • OrderCancelled – the order for which we requested this payment was cancelled
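
To make this concrete, here is a minimal sketch of such a process in C#. The event types are named after the list above and the reactions in the comments are assumptions made for illustration; the point is that every handler has to work without assuming which of the other events has already arrived.

    // Hypothetical event types named after the list above.
    public class PaymentTimedOut { }
    public class PaymentReceived { }
    public class OrderCancelled { }

    // The order payment process keeps independent flags rather than one linear
    // state, because the three events may arrive in any order.
    public class OrderPaymentProcess
    {
        private bool paymentReceived;
        private bool orderCancelled;

        public void Handle(PaymentReceived e)
        {
            paymentReceived = true;
            if (orderCancelled)
            {
                // the money arrived for an already cancelled order: refund it
            }
        }

        public void Handle(PaymentTimedOut e)
        {
            if (paymentReceived)
                return; // a timeout delivered after the payment is ignored
            // otherwise: retry the payment or cancel the order
        }

        public void Handle(OrderCancelled e)
        {
            orderCancelled = true;
            if (paymentReceived)
            {
                // payment already taken for a cancelled order: refund it
            }
        }
    }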

What messages will this process receive and in which order? Before answering, take into consideration:

  • the scheduling system that is used to dispatch timeouts,
  • the external gateway system (that has its own SLA),
  • the messaging infrastructure.

What’s the order then? Is there a predefined one? Certainly not.

Convergence

Your aim is to provide a convergent process. You should review all the permutations of the process inputs: given 3 types of input, you should consider 3! = 6 possible orderings. That’s why building a few smaller processes instead of one big one is easier – each smaller process reacts to fewer inputs, so it has far fewer orderings to reason about. You can think of them as sub-state machines that can later be composed into a bigger whole. This isn’t only about code complexity, but a real mental overhead that you need to design against.
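
A sketch of that decomposition, reusing the hypothetical event types from the previous sketch: each sub-state machine reacts to one or two events only, and the composed process reasons about their states instead of raw event permutations.

    public enum PaymentState { Pending, Received, TimedOut }
    public enum OrderState { Active, Cancelled }

    // A small state machine concerned only with the payment.
    public class PaymentProcess
    {
        public PaymentState State { get; private set; } = PaymentState.Pending;

        public void Handle(PaymentReceived e) => State = PaymentState.Received;

        public void Handle(PaymentTimedOut e)
        {
            if (State == PaymentState.Pending)
                State = PaymentState.TimedOut;
        }
    }

    // A small state machine concerned only with the order itself.
    public class OrderProcess
    {
        public OrderState State { get; private set; } = OrderState.Active;

        public void Handle(OrderCancelled e) => State = OrderState.Cancelled;
    }

    // The composed whole routes events to the sub-machines and looks at
    // two small states instead of 3! event orderings.
    public class OrderPaymentSaga
    {
        private readonly PaymentProcess payment = new PaymentProcess();
        private readonly OrderProcess order = new OrderProcess();

        public void Handle(PaymentReceived e) => payment.Handle(e);
        public void Handle(PaymentTimedOut e) => payment.Handle(e);
        public void Handle(OrderCancelled e) => order.Handle(e);

        public bool ShouldRefund =>
            payment.State == PaymentState.Received && order.State == OrderState.Cancelled;
    }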

Summary

Designing multiple small processes as sub-state machines is easier. When dealing with many events to react to, try to extract smaller state machines and compose them at a higher level.


Events visibility vs streams visibility

In my recent implementation of a simple event sourcing library I had to make a small design choice. There are streams which should be considered private. For instance, there’s a process manager position stream, which records the positions of events already processed by process managers. Its events should not be published to other modules, hence it’d be nice to be able to hide them from others. The choice was between introducing internal streams versus internal events. What would you choose and why?

My choice was to introduce internal events (a simple InternalEventAttribute over an event type). This lets me not only hide the system’s own events, but also enables people using the library to hide some potentially internal data within a given system/module. The reader can see the gaps in the sequence numbers of events in a module’s stream, but nobody besides the original module can see what was in the event.
As with every tool, it should be used wisely.
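
To illustrate, here is a minimal sketch of how the marker attribute and a publishing filter might look. Only InternalEventAttribute comes from the post itself; the event and filter names are assumptions, not the library’s actual API.

    using System;

    // The attribute itself: a plain marker over an event type.
    [AttributeUsage(AttributeTargets.Class)]
    public sealed class InternalEventAttribute : Attribute { }

    // Example: the process manager position events stay within the module.
    [InternalEvent]
    public class ProcessManagerPositionChanged
    {
        public long Position { get; set; }
    }

    // When publishing to other modules, internal events are simply skipped,
    // which is what produces the visible gaps in the sequence numbers.
    public static class EventVisibility
    {
        public static bool IsPubliclyVisible(object @event) =>
            !Attribute.IsDefined(@event.GetType(), typeof(InternalEventAttribute));
    }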

Process manager in event sourcing

There is a pattern which can be used to orchestrate the collaboration of different aggregates, even if they are located in different contexts/domains. This pattern is called a process manager. It handles events, which may result in actions on different aggregates. How can one create a process manager handling events from different sources? How to design storage for a process manager?

In my latest take on event sourcing I followed the same direction taken by EventStore. My first condition was to have a single natural number describing the sequence number of each event committed in a given context/domain (represented as a module). Because I use an auto-incrementing identity in a relational database table, which never reuses values even when an event is rolled back by a transaction, this results in a monotonically increasing position number for any event appended in the given context/domain. This lets you use the position of the last read event as a cursor for reading events forward. Now you can imagine that having multiple services results in multiple logs, each with this property of monotonically increasing positions, for example:

  • Orders: e1, e2, e3, e6
  • Notifications: e1, e2, e3, e4

If a process manager reads from multiple contexts/domains, you can easily come to the conclusion that all you need to store is the last cursor value for each domain. Once an event is dispatched, meaning the process manager has finished handling it, the cursor value for the context the event was created in is updated. This creates a simple but powerful tool for building process managers with an at-least-once processing guarantee for all the events they have to process.
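
A minimal sketch of that bookkeeping, with assumed interface and type names (the real library’s API may differ): one cursor per module, moved forward only after the process manager has finished handling the event.

    using System.Collections.Generic;

    public interface IProcessManager
    {
        void Handle(object @event);
    }

    public interface IEventReader
    {
        // yields (position, event) pairs with positions greater than fromPosition
        IEnumerable<(long Position, object Event)> ReadForward(string module, long fromPosition);
    }

    // One cursor per context/domain the process manager reads from.
    public class ProcessManagerPosition
    {
        private readonly Dictionary<string, long> cursors = new Dictionary<string, long>();

        public long GetCursor(string module) =>
            cursors.TryGetValue(module, out var position) ? position : 0;

        public void MarkDispatched(string module, long position) =>
            cursors[module] = position;
    }

    public static class ProcessManagerRunner
    {
        public static void Pump(string module, IEventReader reader,
            ProcessManagerPosition position, IProcessManager processManager)
        {
            foreach (var (eventPosition, @event) in reader.ReadForward(module, position.GetCursor(module)))
            {
                processManager.Handle(@event);
                // the cursor moves only after handling; a crash in between means
                // the event gets handled again on restart - hence at-least-once
                position.MarkDispatched(module, eventPosition);
            }
        }
    }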

If a process guarantees processing events at least once and can perform actions on aggregates, it may perform a given action more than once, since acting on an aggregate and saving the state of the PM isn’t transactional. It’s easy to see: just imagine the machine crashing after the action but before marking the event as dispatched. How can we ensure that no event will result in a doubled action? This will be the topic of the next post.