Out of order commands

In the previous posts a simple mechanism of storing information needed for operation idempotence was introduced. A simple hash table, which state is transactionally saved with the state of object onto which the send operation was applied. How about receiving operations out of order? What if infrastructure (for instance, messaging system) will pass one operation earlier than the second, which in reality occurred earlier?

It’s time to make it explicit and start calling elements in the DDD manner. So for sake of reference, the object considered as the subject of an operation is an aggregate root. The operation is of course a message. The modeling assumes using the event sourcing as a storage for aggregates’ states.

Assume, that the aggregate, which the command is sent to, has a property called Version, incremented with each event applied on. Assume then, each command contains a version number, which is supposed to be equal to the aggregate’s version. If, during dispatch, these two values are different, an exception is thrown and command do not change the state of the aggregate. It’s a simple optimistic concurrency implementation, allowing discarding out of order commands sent to an object.

To make it more interesting, consider a sharded system, where specific aggregates are stored by different nodes (but for each aggregate there is one node where it is stored). An aggregate’s events (state changes) have to be propagated across all the nodes/shards in the same idempotent manner as commands are sent to aggregates. It’s easy to apply hashtable for each node and with using the very same key: aggregateId with version but it would mean storing all the pairs of aggregate identifiers with their versions, which could possibly bring down each of your nodes (or make you use GBs of memory). Can the trivial fact, that version is increased with every event on the aggregate, could be used for some optimization? You’ll see in the next entry.

What am I missing here?

If you’re in a startup and have a full-time job a the same moment as I do, that’s a post for you.

The initial startup pressure and tempo is huge. Focused on the features you can bring to life more and more of them. How often do you load your project, collapse it’s whole structure and ask questions:

  • What am I doing now?
  • How does it influence the rest of the system?
  • Is everything I need expressible in the current infrastructure and/or design?
  • Is it something, which I know from other projects missing?

It might seem that those opened questions are unneeded, to silly to ask, but from last time I asked them, they became a weekly routine. To show you, I’ll give you an example.

I write tests. As you already know, not always unit tests, but… During one of my write test/run/fix error cycles I noticed that it was quite hard to get all the information I needed. There was an assert failing and without debugging, only by viewing logs I had no idea what might have gone wrong. I reopened the project and did ‘what am I missing here?’ After global review of the whole solution I did found a thing. During all the feature based design I did a silly mistake not providing any logging in the application. You know, these _if_log_isDebugEnabled_ stuff (take a look in the NHibernate code). It took me no more than 10 minutes to spike it with some console appender and I rerun my tests. Ha! Some components did not log one or two operations and that was it.

It’s worth not to loose the (overused phrase) big picture and from time to time, stop providing features and ask these silly, ordinary questions.

Features and unit tests

Currently I’m involved in a greenfield project, which was happy enough to start as a event-sourced DDD. The paradigm fits well the domain, but there is some cost of zero-iteration which is being paid now. The cost is a small infrastructure which has to be provided to be wrap the domain. It’s really small (maybe not that small), but has to be tested. To make it simple/complex I’d add that it consists of a few functionalities/modules, each providing and consuming some contracts. So what about testing?

Recently I re-read a Kozmic’s blog entry, which covers a very interesting problem of unit tests vs ‘real life scenarios’ as well as ‘Concepts and features’ written by Ayende. It made me think a lot about the new application and the way I tried to test it. I took the first module, which had a small test base written in a not-good-as-it-should-be manner, deleted the tests and started to implemented sth new.

I know that some people have opinion ‘no container in your tests’, but in tests of the infrastructure? Come on, this is all about matching all the pieces of the code you wrote! Nevertheless, having the WindsorInstaller of a specific part of the application I initialized the container, added one stub for the external dependencies and fully configured this part of the application. All the tests used the enpoints provided by the whole functionality and it worked like a charm. It seems that the number of test fixtures will be equal to the number of functionalities/modules the infrastructure provide. I find it a quick, simple and powerful solution for having it testes. What about the unit tests? If I have some infrastructure class longer than 100 lines and plenty of methods in there, I will go for unit tests, but for now, this paradigm works like a charm.

Idempotence, pt. 2

In the previous post a few operations were taken into consideration, whether there are (not) idempotent. For the sake of reference, here there are:

  • Marked as default
  • Money transfer ’500$’ ordered to ‘x’ account
  • Label ‘leave sth for the future month’ added

If we consider ‘idempotent’ as an operation which can be applied multiple times in a row, then all the operations overriding previous values of some properties are idempotent. Having some entity marked as default 5 times does not change the fact that it is default. That’s for sure. What about provisioning ‘x’ account with 500$? Can this type of operation can be reapplied multiple times? Of course not, because it does not override any property, it changes the state, by interacting with a previous one. The same goes for ‘labeling’, of course if there is no compensation introduced (select only unique labels before saving, which would allow reapplying).

What if you want your system to be resistant to operations resend multiple times? The simplest solution is to add unique identifier for each operation and storing them is a lookup (hashtable). Each time the operation arrives, the lookup is checked whether there this operation was already processed. If so, skip it.

There is one additional condition is to have the lookup transactional with a storage you save the states. This condition is a simple ‘all-or-none’ for storing the result of operation with the fact, that this specific operation was already applied. Otherwise, if lookup would be updated in the first place and storing the state after the operation failed, there would be no change saved. The same applies to a situation, where the lookup is updated at the very end. The operation result is saved, adding info about operation to lookup fails and the next time the same operation arrives it is applied one more time. Having that said, lookup must be transactional with the medium where state is saved.

Idempotence

Recently, I’ve been reading a lot of dddcrqs group. The reason was a project I’m writing, which will be implemented in a DDD paradigm using CQRS and the event sourcing. As I read about about them, I found over one year old Greg’s post considering pros and cons of idempotence. The definition of idempotence is pretty straight forward. We call an operation idempotent iff:
f(f(x)) = f(x)
which simply means, that applying the same function multiple times brings the same result. In mathematics, you can find a constant function, giving the same result for each argument, hence, during multiple application. That’s the mathematician point of view. What about computer science?
It can be easily brought to the DDD field, especially, the event-sourced implementation of it. For those who don’t know what the event sourcing is, take a look here. Consider the following chain of events for a client’s bank account:

  • Marked as default
  • Money transfer ’500$’ ordered to ‘x’ account
  • Label ‘leave sth for the future month’ added

Which of them would you consider idempotent and under which conditions?