Events on the Outside versus Events on the Inside

Recently I’ve been revisiting some of my Domain Driven Design, CQRS & Event Sourcing knowledge and techniques. I’ve supported creation of systems with these approaches, hence, I could revisit the experiences I had as well. If you are not familiar with these topics, a good started could be my Feed Your Head list.


So you model you domain with aggregates in minds, distilling contexts and domains. The separation between services may be clear or a bit blurry, but it looks ok and, more important, maps the business well. Inside a single context bubble, you can use your aggregates’ events to create views and use the views when in need of data for a command execution. It doesn’t matter which database you use for storing events. It’s simple. Restore the state of an aggregate, gather some data from views, execute a command. If any events are emitted, just store them. A background worker will pick them up to dispatch to a Process Manager.


What about exposing you events to other modules? If and how can another module react to an event? Should it be able to build it’s own view from the data held in the event? All of these could be sum up in one question: do external events match the internal of a specific module? My answer would be: it’s not easy to tell.

In some systems, these may be good. By the system I mean not only a product, but also a team. Sometimes having a feed of events can be liberating and enabling faster grow, by speeding up initial shaping. You could agree to actually separate services from the very start and verify during a design, if the logical complexity is still low. I.e., if there is not that much events shared between services and what they contain.

This approach brings some problems as well. All the events are becoming your API. They are public, so now they should be taken into consideration when versioning your schemas. Probably some migration guide will be needed as well. The bigger public API the bigger friction with maintaining it for its consumers.

Having this said, you could consider having a smaller and totally separate set of events you want to share with external systems. This draws a visible line between the Inside & the Outside of your service, enabling you to evolve rapidly in the Inside. Maintaining a stable API is much easier then and the system itself has a separation. This addresses questions about views as well. Where should they be stored originally. The answer would be to store properly versioned, immutable views Inside the service, using identifiers to pass the reference to another service. When needed, the consumer can copy & transform the data locally. A separate set of events provides ability to do not use Event Sourcing where not needed. That kind of options (you may, but don’t have to use it) are always good.

For some time I was an advocate of sharing events between services easily, but now, I’d say: apply a proper choice for your scenario. Consider pros and cons, especially in terms of the schema maintainer tax & an option for not sticking to Event Sourcing.


The process of revisiting my assumptions has been started by a few materials. One of them is a presentation by Andreas Ohlund, ‘Putting your events on a diet’, sharing a story about deconstructing an online shop into services. The second are some bits from A Decade of DDD, CQRS, Event Sourcing by Greg Young. The last but not least, Pat Helland’s Data on the Outside versus Data on the Inside.

Single producer single consumer optimizations

The producer-consumer relationship is one of the most fundamental cooperation patterns. Some components produce values, issues requests and some consume/handle them. Depending on the number of components at the end of this dependency it’s called ‘single/multi producer single/multi consumer’ relationship. It’s important to make this choice explicit, because as with every explicit choice, it enables some optimizations. I’d like to share some thoughts o the optimizations taken in the single consumer single producer scenario in the RampUp library provided by OneToOneRingBuffer.

The behavior of ring buffers in RampUp is ported from Java’s Agrona. They provide a queue that enables reading sequentially on the consumer side. The reasoning behind it is that sequential reads are CPU friendly, so that consumer can process messages much quicker. For ManyToOneRingBuffer the production part is quite complex. It proceeds as follows:

  1. check against the consumer position, is there enough of space
  2. allocate a slot in the ring (this is done with Interlocked operations, in a loop, may take a while)
  3. write a header in an ordered way (using volatile)
  4. put data
  5. write the header again marking the message as published

This brings a lot of unneeded work for a single producer. When considering a single producer, there’s nothing to compete with. The only check that needs to be made is that the producer does not overlap with the consumer. So the algorithm looks as follows:


  1. check against the consumer position, is there enough of space
  2. put data
  3. write the header again marking the message as published
  4. write the tail value for future writes

Removal of Interlocked and lowering the number of Volatile operations can improve the producer performance greatly (less synchronization).


If you wanted to compare these two on your own, here you are: ManyToOne and OneToOne.

Happy producing (and consuming).

Data has no format

I need to be able to store 1GB of JSON

I’d like to push XML 100 MB/s to this Azure blob

I need to log this data as CSV

Statements like this are sometimes true, but in the majority of cases the format is not given and is a part of designing your architecture/application. Or redesigning if needed. Selecting a proper format can lower the size of your data, increasing the throughput of your system, if a medium like a disk or a network is saturated. That’s why systems like Apache Arrow or Google’s Dremel use their own formats. That’s why you may consider using the protobuf-net serialization for EventStore, disabling it build in v8 projections and lowering size of events at the same time. For low latency systems you can choose the new library Simple Binary Encoding. That’s why sometimes storing data in another format is simply better. I’ve written a blog post Do we really need all these data tranformations and it doesn’t state something opposite. It’s all about making a rational and proper choices of the storage format and taking into consideration different aspects of it and its influence on your system. With this one decision you might improve your system performance greatly.

Shared Resources in TeamCity

It’s a common requirement that a set of your tests depends on some resources. It might be a database or an Azure Storage account. It’s possible that instead of providing TeamCity with an administrator account (giving a subscription access for Azure) you’d prefer to have a limited preexisting set or resources like databases or Azure Storage accounts that are leased for a build time by a particular agent. As soon as build is finished the resource would go back to the pool to be leased for another build.

Fortunately TeamCity has a built in ability for this purpose called Shared Resources. This can be defined on any project level and used as a parameter of any build configuration below. Shared Resources feature provides you with all the capabilities mentioned before, removing all the burden of managing a resource pool. In the same way a build leases an agent, an agent leases a shared resource. Nice, simple, easy.