Event stores and event sourcing: some not so practical disadvantages and problems

TL;DR

This post is a response to the article mentioned in a tweet by Greg Young. The author's blog has no comment section, and this response contains too much information to send as an email or DM, so I'm posting it here instead.

Commits

Typically, an event store models commits rather than the underlying event data.

I don’t know what a typical event store is. I know though that:

  1. EventStore, a standalone event database built by Greg Young’s company, fully separates these two, providing granularity at the event level
  2. StreamStone, which provides event-sourcing support on top of Azure Table Storage, works at the event level as well
  3. Marten, a PostgreSQL-based document and event database, also works at the level of individual events

For my statistical sample, the quoted statement does not hold true.

Scaling with snapshots

One problem with event sourcing is handling entities with long and complex lifespans.

and later

Event store implementations typically address this by creating snapshots that summarize state up to a particular point in time.

and later

The question here is when and how should snapshots be created? This is not straightforward as it typically requires an asynchronous process to create snapshots in advance of any expected query load. In the real world this can be difficult to predict.

First and foremost: if you have aggregates with long and complex lifespans, that’s your responsibility, because you chose a model with aggregates like that. Remember that there are no right or wrong models, only useful or crappy ones.

Second, let me provide an algorithm for snapshotting: if you retrieved 1000 events to build up an aggregate, you should snapshot it (serialize + put into an in-memory cache + possibly store in a db). Easy and simple, I see no need for fancy algorithms.
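To make this concrete, here is a minimal sketch of that rule. Every type here (IEventStore, Aggregate, Snapshot, the repository) is a hypothetical placeholder, not the API of any particular event store; the only point is the threshold check after the replay:

using System;
using System.Collections.Generic;

// A sketch of threshold-based snapshotting. All types are hypothetical
// placeholders, not the API of any particular event store.
public interface IEventStore
{
    IReadOnlyList<object> ReadEvents(Guid id, int fromVersion);
}

public class Aggregate
{
    public Aggregate(Guid id) { Id = id; }
    public Guid Id { get; private set; }
    public int Version { get; private set; }
    public void Apply(object @event) { Version++; /* mutate the state here */ }
}

public class Snapshot
{
    public Snapshot(Aggregate aggregate, int version) { Aggregate = aggregate; Version = version; }
    public Aggregate Aggregate { get; private set; }
    public int Version { get; private set; }
}

public class AggregateRepository
{
    private const int SnapshotThreshold = 1000;
    private readonly IEventStore _store;
    private readonly Dictionary<Guid, Snapshot> _cache = new Dictionary<Guid, Snapshot>();

    public AggregateRepository(IEventStore store)
    {
        _store = store;
    }

    public Aggregate Get(Guid id)
    {
        // start from the latest snapshot, if any
        Snapshot snapshot;
        _cache.TryGetValue(id, out snapshot);

        var aggregate = snapshot != null ? snapshot.Aggregate : new Aggregate(id);
        var fromVersion = snapshot != null ? snapshot.Version : 0;

        var events = _store.ReadEvents(id, fromVersion);
        foreach (var @event in events)
        {
            aggregate.Apply(@event);
        }

        // replayed 1000+ events? snapshot it (here: cache in memory;
        // a real implementation would serialize and possibly persist it)
        if (events.Count >= SnapshotThreshold)
        {
            _cache[id] = new Snapshot(aggregate, aggregate.Version);
        }

        return aggregate;
    }
}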

Visibility of data

In a generic event store payloads tend to be stored as agnostic payloads in JSON or some other agnostic format. This can obscure data and make it difficult to diagnose data-related issues.

If you, as an architect or developer, know your domain and know that you need a strong schema, because you want to use it as a published interface, but you still persist data in JSON instead of some schema-aware serialization like protobuf (a binary, schema-aware serialization from Google), it’s not the event store’s fault. Additionally,

  1. EventStore
  2. StreamStone

both handle binary payloads just fine (yes, you can’t write JS projections over binary events in EventStore, but you can still subscribe).

Handling schema change

If you want to preserve the immutability of events, you will be forced to maintain processing logic that can handle every version of the event schema. Over time this can give rise to some extremely complicated programming logic.

It has been shown that, instead of cluttering your model with different versions (though sometimes keeping them is the easier route), one can provide a mapping that is applied to the event stream before events are returned to the model. This way you handle the versioning in one place and can move forward with schema changes (again, as long as it’s not your published interface). This doesn’t cover every case, but the pattern can be used to reduce the clutter.
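A minimal sketch of such a mapping, with made-up event names: old versions are upconverted as the stream is read, so the model only ever sees the latest shape.

using System.Collections.Generic;

// Made-up event versions, purely to illustrate the mapping.
public class UserRenamedV1
{
    public string FullName { get; set; }
}

public class UserRenamedV2
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class EventUpconverter
{
    // Applied to the event stream before the events reach the model,
    // so versioning is handled in this one place.
    public static IEnumerable<object> Upconvert(IEnumerable<object> stream)
    {
        foreach (var @event in stream)
        {
            var v1 = @event as UserRenamedV1;
            if (v1 != null)
            {
                var parts = v1.FullName.Split(' ');
                yield return new UserRenamedV2
                {
                    FirstName = parts[0],
                    LastName = parts.Length > 1 ? parts[1] : string.Empty
                };
            }
            else
            {
                yield return @event; // already the latest version
            }
        }
    }
}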

Dealing with complex, real world domains

Given how quickly processing complexity can escalate once you are handling millions of streams, it’s easy to wonder whether any domain is really suitable for an event store.

EventStore and StreamStone are designed to handle exactly these millions.

The problem of explanation fatigue

Event stores are an abstract idea that some people really struggle with. They come with a high level of “explanation tax” that has to be paid every time somebody new joins a project.

You could say the same about messaging and delivery guarantees, fast serializers like protobuf, or dependency injection. Is there a project where a newcomer joins and just knows what to do and how? Nope.

Summary

It’s your decision whether to use event sourcing or not; it’s not a silver bullet. Nothing is. I wanted to clarify some of the misunderstandings that I found in the article. Hopefully, this will help my readers choose their tooling (and opinions) wisely.

Dependency rejection

TL;DR

“These values must be updated synchronously”, “we need referential integrity”, “we need to make all these service calls together”: such sentences can unfortunately still be heard in discussions about systems. This post provides a pattern for addressing these remarks.

The reality

As Jonas Bonér says in his presentation “Knock, knock. Who’s there? Reality.”, we really can’t afford to have one database and one model to fit it all. It’s impossible. We process too much, too fast, with too many components to do it all in one fast call, not to mention transactions. This is reality, and if you want to deny reality, good luck with that.

The pattern

The pattern to address these remarks is simple: you need to say no to this absurdity. No, no, no. This no can be supported with many arguments:

  • independent components should be independent, right? Like in-de-pen-dent. They can’t stop working just because others stopped. This dependency just can’t be taken.
  • does your business truly ensure that everything is prepared up front? What if somebody buys a computer from you? Would you rather say “I’ll have it delivered within one week”, or first double-check with every supplier whether all the components are in stock, or whether maybe somebody returned one, or maybe call the producer with some speech synthesizer? For sure that’s not an option. This dependency just can’t be taken.

You could come up with many more arguments, which can be summarized simply as a new pattern in town: The Dependency Rejection.

Next time your DBA/architect/dev friend/tester dreams about the shiny world of total consistency and always-available services, remind them of this and simply reject the dependency on that unrealistic idea.


Producer – consumer relationship

In the last post about the RampUp library I covered one of its foundations: IRingBuffer. Now I’d like to describe the contract it fulfills.

Producer consumer

If you take a look at the IRingBuffer, you’ll see Write/Read methods. These two are responsible for producing/consuming, or writing/reading, messages to and from the buffer in a FIFO manner. What guarantees stand behind such an interface? What about concurrent access to this structure?

Multimulti

The easiest way to distinguish access patterns is to consider whether the structure may be accessed by one or more threads. Considering producer/consumer, you’ll see that there are four options:

  1. SPSC – Single Producer Single Consumer – only one thread produces items, and another consumes them
  2. MPSC – Multi Producer Single Consumer – multiple threads may produce items in a safe manner, again there’s a single consuming thread
  3. SPMC – Single Producer Multi Consumer – this could be treated as a distributor of work
  4. MPMC – Multi Producer Multi Consumer – multi/multi, ConcurrentQueue is a good example of it

Unfortunately, nothing comes for free. If you want multi, you’ll pay the price of handling contention on that end. On the other hand, if you want to design a system where queues provide transport between its different parts, you’ll certainly need to enable multiple producers, as there’s going to be more than one system element. If you want to process items in the order they appeared and leave the locking issues behind, just write fast single-threaded code: a single consumer with a single worker thread is the way to go. Of course, this worker thread may access other queues and produce items for them (hence multi-producer is needed).

The ring buffer implementation in RampUp provides exactly the MPSC behavior, as it’s prepared to handle items in order, by a single thread.
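To illustrate the access pattern (this is not RampUp’s IRingBuffer; a ConcurrentQueue stands in for the buffer here, just to show who is allowed to touch which end):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// MPSC access pattern: many writers, exactly one reader.
public class MpscSketch
{
    public static void Main()
    {
        var buffer = new ConcurrentQueue<int>();

        // Multi Producer: several threads write concurrently and safely.
        var producers = new Task[4];
        for (var p = 0; p < producers.Length; p++)
        {
            var producerId = p;
            producers[p] = Task.Run(() =>
            {
                for (var i = 0; i < 1000; i++)
                {
                    buffer.Enqueue(producerId * 1000 + i);
                }
            });
        }
        Task.WaitAll(producers);

        // Single Consumer: one thread drains the buffer and handles the
        // items in the order they appeared, with no locking on this side.
        int item;
        var consumed = 0;
        while (buffer.TryDequeue(out item))
        {
            consumed++;
        }

        Console.WriteLine("Consumed {0} items on a single thread.", consumed);
    }
}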

Composition vs derivation

Assume you’re writing some reports in your application. You’ve just created the third controller covering some kind of reporting, and it seems that all three controllers contain very similar code, modulo the type passed as the entity type to your NHibernate ISession.QueryOver() or another data source. If the method creating the report base were generic, it could be used in all the cases. You want to extract it, make it clearer and stop repeating yourself. What would you do?

Derivation
The very first thing is to extract a method in each of the controllers. Now they seem almost the same. What’s needed is a place the method can easily be moved to. How about a supertype? Let’s create a controller, call it in a fashionable way, for instance ReportControllerBase, and move the method there. Now you can easily remove the methods from the deriving controllers. Yeah! It’s reusable: everyone writing his/her report can derive from ReportControllerBase and use its methods to speed up his/her task.

Composition
The very first step is exactly the same: the method must be extracted to reveal the common code. Once it’s done, you notice that the whole method has only a few dependencies, which can easily be pushed to parameters, for instance: the ISession, the entity type passed to the query, the value used in some complex where clause, etc. You change all the field and property usages into parameters, which allows the method to be static. You make it static and turn it into an extension method on the session. The refactoring is done; you can call this method in all the controllers simply by invoking the extension on a session, as in the sketch below.
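A sketch of where this refactoring ends up; the names, the IReportRow contract and the where clause are mine, purely for illustration:

using System;
using System.Collections.Generic;
using NHibernate;

// The common contract shared by the reported entities (illustrative).
public interface IReportRow
{
    DateTime CreatedAt { get; }
}

// The extracted report query as an extension method on ISession.
public static class ReportSessionExtensions
{
    public static IList<TEntity> QueryReport<TEntity>(this ISession session,
        DateTime from, DateTime to)
        where TEntity : class, IReportRow
    {
        return session.QueryOver<TEntity>()
            .Where(row => row.CreatedAt >= from && row.CreatedAt < to)
            .List();
    }
}

Every reporting controller now calls the extension with its own entity type, and the duplication is gone with no base class spent on it.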

What’s the difference and why composition should be preferred
If you use C# or Java, you should always be aware of one limitation: you can derive from only one type. Spending this ‘once in a lifetime’ inheritance on a simple functionality extraction is, for me, the wrong way. Use composition, and derive only when you truly see that one type is another type; that’s the right way to go. In the next post I’ll write about ASP MVC Detached Actions, a simple mechanism you can use to derive your controllers only when it is needed, delegating common actions without derivation.

ASP MVC Model binders

Recently I had to create a model for ASP MVC which would be bound by the model binder and have its properties set according to the request. It’s the most common case you can imagine, but there was one ‘but’. The model, because of its nature, had to have a non-default constructor. It took only a few parameters, but that broke the DefaultModelBinder, which simply couldn’t find all the needed values. My response was quite fast, especially since I had the controller factory implemented in the following way:

/// <summary>
/// The controller factory building up the controllers with the unity container.
/// </summary>
public class ControllerFactory : DefaultControllerFactory
{
    private readonly IUnityContainer _container;

    public ControllerFactory(IUnityContainer container)
    {
        _container = container;
    }

    protected override IController GetControllerInstance(RequestContext requestContext, 
        Type controllerType)
    {
        return (IController)_container.Resolve(controllerType, null);
    }
}

The very next step was to replace the default binder stored in ModelBinders.Binders.DefaultBinder with another, container-based implementation:

/// <summary>
/// The model binder building up the model objects with the unity container.
/// </summary>
public class ModelBinder : DefaultModelBinder
{
    private readonly IUnityContainer _container;

    public ModelBinder(IUnityContainer container)
    {
        _container = container;
    }

    protected override object CreateModel(ControllerContext controllerContext, 
                                          ModelBindingContext bindingContext,
                                          Type modelType)
    {
        var type = modelType;
        if (modelType.IsGenericType)
        {
            var genericTypeDefinition = modelType.GetGenericTypeDefinition();
            if (genericTypeDefinition == typeof (IDictionary<,>))
            {
                type = typeof (Dictionary<,>).MakeGenericType(modelType.GetGenericArguments());
            }
            else if (((genericTypeDefinition == typeof (IEnumerable<>)) ||
                      (genericTypeDefinition == typeof (ICollection<>))) ||
                     (genericTypeDefinition == typeof (IList<>)))
            {
                type = typeof (List<>).MakeGenericType(modelType.GetGenericArguments());
            }
        }

        // in the base method: return Activator.CreateInstance(type);
        return _container.Resolve(type, null); 
    }
}

As you can see, the method simply delegates the model creation to the container. Setting this binder allowed passing models with non-default constructors, so the problem was solved :]

After a while, I considered the following case: if the type is constructed with a container, the controller can also be passed an object implementing an interface registered in the container. Hence, if a method needs an NHibernate ISession, this can be expressed not only as a parameter of the controller constructor, but also as a method parameter. This paradigm yields controllers which need no constructor parameters (or only common ones, used in all actions), with all the others passed as action parameters. The result is a method which can be perfectly tested in an environment where only the smallest needed set of dependencies is passed (no more constructors with parameters not needed by the current method). What do you think about it?
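For completeness, plugging both pieces in at application start could look like this; a sketch, assuming the usual Global.asax entry point (the container registrations are illustrative):

// In Global.asax, inside the MvcApplication class.
protected void Application_Start()
{
    var container = new UnityContainer();
    // ... registrations of ISession, IUserService, etc. go here ...

    // use the container for building controllers and for model binding alike
    ControllerBuilder.Current.SetControllerFactory(new ControllerFactory(container));
    ModelBinders.Binders.DefaultBinder = new ModelBinder(container);
}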

An example of a controller:

public class CarController : Controller
{
    private readonly ISession _session;
    private const int PageSize = 10;

    public CarController(ISession session)
    {
        _session = session;
    }

    public ActionResult Rent(IUserService userService, Guid carId)
    {
        using (var tx = _session.BeginTransaction())
        {
            var car = _session.Load<Car>(carId);
            var user = userService.GetCurrentUser();

            car.RentBy(user);
            
            tx.Commit();
        }

        return View();
    }

    public ActionResult Index(int? pageNo)
    {
        var page = pageNo ?? 0;

        var cars = _session.CreateQuery("from Car")
            .SetFirstResult(page*PageSize)
            .SetMaxResults(PageSize)
            .Future<Car>();

        return View(cars);
    }
}

Themis wants you to hibernate her!

Recently, I’ve been working on a project which in the near future will have quite complex authorization rules. Additionally, these rules will affect the display, simply filtering the data sets which one can view. I instantly thought about Themis and its ‘future feature’ of integration with NH. What I’d like to have is a simple Themis role definition:

public class AnalystRoleDefinition : BaseRoleDefinition
{
    public AnalystRoleDefinition()
    {
        // CanView - helper method introduced in the application, wrapping Themis': Add blah blah
        CanView<IProtectedDocument>((document, analyst)=> document.ProtectionLevel <= analyst.AllowedProtectionLevel);
    }
}

Ok… As you can see, each analyst is granted permission to view a protected document ONLY when his/her AllowedProtectionLevel is greater than or equal to the document’s ProtectionLevel. With Themis’ functionality it’s simple to create a service configured this way and check whether, in the context of a specific document, this permission is granted. But what about filtering? If an analyst is disallowed to view such a document, shouldn’t it be hidden?

NHibernate filters to the rescue

NHibernate filters can be used to add specific conditions to queries over classes and their collections. Given the role definition mentioned earlier, I could add a filter with an easy condition (the analyst property would be a parameter taken from the context; the real condition is based on a document’s ProtectionLevel column) to each class implementing IProtectedDocument and activate the filters with one call based on the context. My very first proposal for the API would be:

var roles = _roleService.GetCurrentUserRoles();
using (var filter = _nhAuthorizationService.ApplySessionFilters(_session, roles))
{
    var q = _session.CreateQuery("from IProtectedDocument");
    // other query stuff
}

The filter wrapped in the using statement is an applier of filters which calls ISession.EnableFilter on creation and ISession.DisableFilter on dispose to undo the filtering. It leaves the session’s filtering state untouched.
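Under the hood, such an applier can be very small. A sketch with a single filter (the filter name and parameter are illustrative, and the filter itself would still have to be defined in the mappings; EnableFilter, SetParameter and DisableFilter are NHibernate’s actual ISession/IFilter calls):

using System;
using NHibernate;

// Enables the filters on creation, disables them on dispose,
// leaving the session's filtering state as it was.
public class SessionFilterScope : IDisposable
{
    private const string FilterName = "protectionLevelFilter"; // illustrative
    private readonly ISession _session;

    public SessionFilterScope(ISession session, int allowedProtectionLevel)
    {
        _session = session;
        _session.EnableFilter(FilterName)
            .SetParameter("allowedProtectionLevel", allowedProtectionLevel);
    }

    public void Dispose()
    {
        _session.DisableFilter(FilterName);
    }
}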

If you’re bothered by the explicitness and want something implicit, you can easily add this behavior to your DI container and never think about it again.

Any thoughts about it?

Domain events with Unity Container extension

I do like domain events, perfectly described by Udi. I’d like to share my implementation of this pattern using Unity (my preferred container), which seems to be simple, short and still powerful. I assume that you’ve read Udi’s article carefully. First and foremost: the domain interfaces.

    /// <summary>
    /// The markup interface for any system event.
    /// </summary>
    public interface IEvent
    {
    }

    /// <summary>
    /// The interface of a handler for events of type <typeparamref name="T"/>.
    /// </summary>
    /// <typeparam name="T">The type of the event to handle.</typeparam>
    public interface IEventHandler<T>
        where T : IEvent
    {
        /// <summary>
        /// Handles the specified @event.
        /// </summary>
        /// <param name="event">The @event.</param>
        void Handle(T @event);
    }

    /// <summary>
    /// The interface of event manager, allowing raising events based on the <see cref="IEvent"/>.
    /// </summary>
    public interface IEventManager
    {
        /// <summary>
        /// Raises the specified event, not requiring, 
        /// that it is handled by any handler.
        /// </summary>
        /// <typeparam name="T">The type of the event.</typeparam>
        /// <param name="event">The @event.</param>
        /// <remarks>
        /// The method iterates through all of the registered handlers for the event of type <typeparamref name="T"/>.
        /// </remarks>
        void Raise<T>(T @event)
            where T : IEvent;
    }

I assume the usage of dependency injection and having the IEventManager implementation injected. As for the IEventManager implementation itself, it’s fairly simple and lives in a project that knows the Unity assembly (Infrastructure in my case). It’s worth mentioning that I consider handlers to be transient objects, constructed and discarded immediately after handling an event.

    /// <summary>
    /// The implementation of event manager using the container to resolve handlers.
    /// </summary>
    public class EventManager : IEventManager
    {
        private readonly IUnityContainer _container;

        [DebuggerStepThrough]
        public EventManager(IUnityContainer container)
        {
            _container = container;
        }

        [DebuggerStepThrough]
        public void Raise<T>(T @event)
            where T : IEvent
        {
            var handlers = _container.ResolveAll<IEventHandler<T>>();

            foreach (var handler in handlers)
            {
                try
                {
                    handler.Handle(@event);
                }
                finally
                {
                    _container.Teardown(handler);
                }
            }
        }
    }
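
A minimal usage sketch, with a made-up event and handler; once the extension below has scanned the handler’s assembly, raising the event resolves and runs every matching handler:

    // An illustrative event and its handler.
    public class OrderPlaced : IEvent
    {
        public Guid OrderId { get; set; }
    }

    public class SendConfirmationEmail : IEventHandler<OrderPlaced>
    {
        public void Handle(OrderPlaced @event)
        {
            // send the confirmation e-mail here
        }
    }

    // somewhere in the domain code, with IEventManager injected:
    // _eventManager.Raise(new OrderPlaced { OrderId = id });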

Last but not least: how to register all the handlers in the container? I wrote a small Unity extension providing assembly-scanning functionality, which can be found below. There are two mouthful method names in the code:

  • GetFirstClosedGenericInterfaceBasedOnOpenGenericInterface, which tries to do exactly what its name says 😛
  • RegisterInstanceWithSingletonLifetimeManager, an extension method using an I-have-nothing-to-do-with-monitors LifetimeManager for setting up the singletons

The code itself:

    /// <summary>
    /// The <see cref="UnityContainer"/> extension registering all the event handlers
    /// and unity-based implementation of <see cref="IEventManager"/>.
    /// </summary>
    /// <remarks>
    /// The event handlers are registered as classes with transient lifetime.
    /// Their instances are torn down immediately after usage.
    /// </remarks>
    public class EventUnityContainerExtension : UnityContainerExtension
    {
        protected override void Initialize()
        {
            var eventManager = new EventManager(Container);
            Container.RegisterInstanceWithSingletonLifetimeManager(eventManager);
        }

        /// <summary>
        /// Registers all the event handlers from assembly of the passed type <typeparamref name="T"/>.
        /// </summary>
        /// <typeparam name="T">The type which assembly should be scanned.</typeparam>
        /// <returns>This instance.</returns>
        public EventUnityContainerExtension RegisterHandlersFromAssemblyOf<T>()
        {
            return RegisterHandlersFromAssembly(typeof(T).Assembly);
        }

        /// <summary>
        /// Registers all the event handlers from all <paramref name="eventHandlersAssemblies"/>.
        /// </summary>
        /// <param name="eventHandlersAssemblies">List of assemblies to scan.</param>
        /// <returns>This instance.</returns>
        public EventUnityContainerExtension RegisterHandlersFromAssemblies(params Assembly[] eventHandlersAssemblies)
        {
            foreach (var eventHandlersAssembly in eventHandlersAssemblies)
            {
                RegisterHandlersFromAssembly(eventHandlersAssembly);
            }

            return this;
        }

        private EventUnityContainerExtension RegisterHandlersFromAssembly(Assembly assembly)
        {
            foreach (var type in assembly.GetTypes())
            {
                if (type.IsInterface || type.IsAbstract)
                {
                    continue;
                }

                var handlerInterface = type.GetFirstClosedGenericInterfaceBasedOnOpenGenericInterface(typeof(IEventHandler<>));
                if (handlerInterface == null)
                {
                    continue;
                }

                // the type is registered with a name, because only named registrations can be resolved by unity.ResolveAll method
                Container.RegisterType(handlerInterface, type, type.FullName, new TransientLifetimeManager());
            }

            return this;
        }
    }

Simple and powerful, isn’t it? 🙂