Snowy identifiers

TL;DR

When using the snowflake entities pattern, it’s quite easy to forget about the external identifiers we need to communicate with external systems. This post provides an easy way to address this concern.

Identity revisited

The identifier of a snowflake entity was presented as a GUID. We use an artificial, non-colliding, client-generated identifier to ensure that any part of the system can generate one without validating that a specific value hasn’t been used before. This enables storing different pieces of data, belonging to different contexts, in different services of our system. No system lives in a vacuum though, and sometimes it needs to communicate with the rest of the world.

Gate away!

A common aspect handled by an external system is payments. When you consider credit cards, native bank applications, PayPal, Bitcoin and all the rest, providing that kind of service on your own is not a reasonable option. That’s why external services are used – using one is much cheaper than delivering one yourself. Let’s stick to the payments example. How would you approach this? Would you call the external payment service from each of your services? I hope not. A better approach is to create a gateway that acts as a translator between your system and the external one.

How many ids do I need?

Using a gateway provides a really interesting property. As the payment gateway is a part of your system, it can use the snowflake identifier. In other words, if there’s an order, it’s OK (under given circumstances) to use its identifier as the identifier of the payment as well – of course, only if you want to model these two as parts of a snowflake entity spanning services. It would be the payment gateway’s responsibility to correlate the system’s snowflake identifier with the external system’s id (an integer, some string, whatever). This creates a coherent view of an entity within your system boundaries, closing the mapping in a small, dedicated area of the payment gateway.
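
To make the idea concrete, here is a minimal, hypothetical sketch of such a correlation kept inside the gateway; the IExternalPaymentProvider interface and the in-memory dictionary stand in for whatever SDK and storage a real gateway would use.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: the gateway owns the correlation between the snowflake id
// used inside the system and the identifier issued by the external provider.
public interface IExternalPaymentProvider
{
    Task<string> CreatePayment(decimal amount);   // returns the provider's own id
}

public sealed class PaymentGateway
{
    private readonly IExternalPaymentProvider provider;
    // snowflake id -> external payment id; a dedicated table in a real system
    private readonly IDictionary<Guid, string> externalIds = new Dictionary<Guid, string>();

    public PaymentGateway(IExternalPaymentProvider provider) => this.provider = provider;

    public async Task RequestPayment(Guid snowflakeId, decimal amount)
    {
        // the external system assigns its own identifier (an integer, some string, whatever)
        var externalId = await provider.CreatePayment(amount);
        externalIds[snowflakeId] = externalId;   // the mapping lives only here
    }

    public string GetExternalId(Guid snowflakeId) => externalIds[snowflakeId];
}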

An integration with an external system closed in a small component, leaving the rest of your system agnostic to it? Do we need more?

Summary

As you can see, closing the external dependency behind a gateway provides value not only by separating the external provider’s interface from your system components, but also by preserving a coherent (though distributed) view of your entities.

Snowflake entities

TL;DR

Much too often we fall for the fallacy of grasping it all the next time. We say that now, after absorbing some knowledge, we will KNOW it all and design the system THE RIGHT way. Unfortunately, even using good methods like the strangler pattern, it’s simply impossible to produce a design that covers the WHOLE domain of a company the right way. Are there any design or architectural patterns that can be helpful? Is there a way to make the uncertainty play by our rules?

Identity

The most important part of any design is identity. I don’t mean identity from the identity management point of view. I mean the identity that makes entities and aggregate roots distinct, the good old-fashioned Id. It’s quite common to find entities’ properties pointing to another context/domain. A User will have an employee’s identifier; a car in an insurance company will be referenced in accounting and other contexts. Basically, it will spread its carId across the whole company. Have you ever encountered ThisExternalId in your system? I bet you have. It’s time to end this.

One thing, multiple views

It’s not a car that is referenced in accounting. It’s not a user that references an employee. It’s the same entity spread across different contexts. Let me give you an example.

The same entity, depending on the context, can have a different meaning. A car in an insurance company will be seen in different dimensions. In some contexts it won’t occur at all (mortgage insurance), in others, it will be present. What’s the MAIN context? Can anyone tell which one it is? Again, I bet no one can.

It’s time to end these referential wars. A car is not referenced by this or that context. None of them requires a car identifier as a foreign key. Each of them requires an id, which can have the SAME VALUE in different contexts. Why? Because the same thing will be understood differently in different contexts! When signing an insurance policy it will be one thing; when paying for an accident it will be another; but after all, it’s the same car.

The same thing can be seen in many contexts. In each of them it will have its own, separate set of properties, but it will share one and only one property – the identity. The identity will probably be artificial and uniquely generated (you don’t want to have duplicates, do you?). A perfect match for that is a GUID/UUID.

Snowflakes

When asking what a car is, you could imagine querying all the contexts for the same identifier. Getting the insurance info

GET /insurances/38e5c55b-1b44-4bdc-bd9e-632580736f22

Getting the mailing info

GET /mailing/38e5c55b-1b44-4bdc-bd9e-632580736f22

And so on and so forth. Getting an empty response from a context means nothing more than that the entity does not exist in that specific context. You can see this as a snowflake: the center is the identifier, and the arms are the responses from different contexts. None of the contexts creates the entity on its own, but every single one contains meaningful information about the snowflake.
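
As an illustration, here is a small sketch of a client assembling such a snowflake by asking two contexts for the same identifier, mirroring the GET examples above; the endpoints and the SnowflakeReader name are only assumptions.

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Illustrative only: every context is queried with the same identifier (the snowflake's center).
// An empty answer from a context simply means the entity has no arm in that context.
public static class SnowflakeReader
{
    public static async Task<(string? insurance, string? mailing)> Read(HttpClient client, Guid id)
    {
        var insurance = await GetOrNull(client, $"/insurances/{id}");
        var mailing   = await GetOrNull(client, $"/mailing/{id}");
        return (insurance, mailing);
    }

    private static async Task<string?> GetOrNull(HttpClient client, string path)
    {
        var response = await client.GetAsync(path);
        return response.StatusCode == HttpStatusCode.NotFound
            ? null                                   // no arm in this context
            : await response.Content.ReadAsStringAsync();
    }
}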

Further modelling

It’s quite easy to follow this rule and imagine adding another context to a system like this. You don’t have to add foreign keys or make changes to other components. You just map the new context with the same blunt, simple rule: share only the identity. After all, if another context wants to handle this entity in some way, it will just use this single meaningful property of the entity, the identity of the snowflake.

I hope this article shed some light on this modelling technique and showed it to be a solid and extensible approach to modelling your domain.

Snapshot time!

It’s snapshot time! There’s been a lot of event sourcing content so far. Let’s do a recap!

Below you will find a short summary of the event sourcing related articles that I have published here so far. Treat it as a table of contents, a lookup, or a pattern collection. It’s ordered by date, from the most recent to the oldest. Enjoy!

  1. Why did it happen – how to make your event sourced system even easier to reason about
  2. Event sourcing and interim stream – how to embrace new modelling techniques with short living streams
  3. Multitenant Event Sourcing with Azure – how to design a multitenant event sourced system using Azure Storage Services
  4. Rediscover your domain with Event Sourcing – how to use your events and astonish your business with meaningful insights
  5. Event Sourcing for DBAs – a short introduction for any relational person into the amazing world of event sourcing. Can be used as an argument during a conversation.
  6. Enriching your events – what is event metadata and why should we care? How to select the most important pieces
  7. Aggregate, an idempotent receiver – how to receive a command or dispatch an event exactly once?
  8. Process managers – what is a process manager, how can you simplify it?
  9. Optimizing queries – how to make queries efficient, especially when dealing with multiple versions of the same application running in parallel
  10. Event sourcing and failure handling – an exception is thrown. Is it an event or not? How to deal with it and model it?
  11. Embracing domain leads towards event oriented design – how event oriented design emerges from an understanding of the domain

Event Sourcing and interim streams

TL;DR

When modelling with event sourcing, people often tend to create long-living streams/aggregates. I encourage you to improve your modelling with interim streams.

Long live the king

A user, an account, a company. Frequently these kinds of aggregates are distinguished during the first modelling attempts. They are long-lasting, never-ending streams of events. Sometimes they combine events from different contexts, which is a smell on its own. The extended longevity can be a problem though, and even when the events are correlated and the aggregate is dense in the business sense, rethinking its lifetime can help you design a better system.

Interim streams

One of the frequent questions raised when applying event sourcing is: how do I ensure that a user is registered with a unique email? In the majority of cases having a person handle duplicates would be just enough (how many people would actually try to register twice…), but let’s treat this as a real business requirement.

Because we are unable to span a transaction across different streams, we can model this requirement in a different way. Let’s start with the creation of a stream whose name is derived from the email, like:

useraggregate_email@domain.com

This ensures that for a specific email only one aggregate can be created. The next step is to put into it an event holding all the data needed to add a user, like:

UserRegistered:

{ "name": "Test", "surname": "Test", "phone": "111111111111", "userid": "some_guid" }

The final step is to have a projection that consumes this event and appends it to a new stream named:

useraggregate_some_guid

This ensures that we don’t use a natural key as the aggregate identifier and still preserve the uniqueness requirement. In the end we have an aggregate that can be identified by its GUID.
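
A sketch of that flow against an assumed IEventStore abstraction (not a specific client API): the append to the email-derived stream must fail if the stream already exists, and a projection copies the event into the GUID-based stream.

using System;
using System.Threading.Tasks;

// Assumed abstraction for illustration; AppendExpectingNoStream is supposed to fail
// if the stream already exists, which is what gives us the uniqueness guarantee.
public interface IEventStore
{
    Task AppendExpectingNoStream(string stream, object @event);
    Task Append(string stream, object @event);
}

public sealed record UserRegistered(string Name, string Surname, string Phone, Guid UserId);

public sealed class RegisterUserHandler
{
    private readonly IEventStore store;
    public RegisterUserHandler(IEventStore store) => this.store = store;

    // only one aggregate can ever be created for a given email
    public Task Handle(string email, UserRegistered @event) =>
        store.AppendExpectingNoStream($"useraggregate_{email.ToLowerInvariant()}", @event);
}

public sealed class UserRegisteredProjection
{
    private readonly IEventStore store;
    public UserRegisteredProjection(IEventStore store) => this.store = store;

    // consumes the event from the interim, email-based stream and appends it
    // to the long-lived stream identified by the user's GUID
    public Task When(UserRegistered @event) =>
        store.Append($"useraggregate_{@event.UserId}", @event);
}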

Time to die

The interim stream example with the unique email wasn’t the best one, but it was good enough to show that a stream can spawn interim streams or be based on one. It’s vital to free yourself from thinking that every stream needs to live forever. I’d say that in the majority of cases there’s a moment when a stream needs to end its life.

Longevity isn’t the most important virtue of a stream. A stream is created either to model a behavior in the right way or to be just a passage to other streams. Next time you model a stream, ask yourself: how long will it live?

Rediscover your domain with Event Sourcing

TL;DR

Besides the common advantages of event sourcing, like auditing, projections and sticking closely to the domain, you can use events to rediscover the domain and provide meaningful insights to your business.

Include.Metadata

I’ve already described the idea of enriching your events. This is the main enabler for analyzing your events in various ways. The basic metadata one could include are:

  • date
  • action performer
  • on whose behalf the action is taken

You could add the screen of your app from which the action was issued, the IP address, and many more.
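
As a simple illustration, here is an envelope carrying an event together with the metadata listed above; the EventEnvelope name and the exact property set are assumptions.

using System;

// Illustrative envelope: the event payload plus the metadata discussed above.
public sealed record EventEnvelope<TEvent>(
    TEvent Event,
    DateTimeOffset OccurredAt,     // date
    Guid PerformedBy,              // action performer
    Guid? OnBehalfOf,              // on whose behalf the action is taken
    string? Screen = null,         // optional: the screen the action came from
    string? IpAddress = null);     // optional: the caller's IP address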

Reason

Having this additional data, it’s quite easy to aggregate all the events of a specific user. This, with the time attached, could provide various information (a small sketch of one such aggregation follows these lists):

  • how the work is distributed during a work day
  • how big is the area of business handled by a single user
  • is the user’s behavior pattern the same all the time, or has somebody maybe taken over this account?

The same goes for a projection by event type:

  • is it a frequent business event
  • are these events clustered in time – maybe two events are really only one event

Or look at a mixed projection, finding sequences of events for a user that might indicate:

  • an opportunity for remodeling your implementation
  • hot spots in the application.
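
Purely as an illustration, here is the small aggregation referred to above: it answers “how is the work distributed during a work day” for a single user, assuming the EventEnvelope shape sketched earlier. A real projection would of course run over the event store rather than an in-memory sequence.

using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: a simple aggregation over enriched events answering
// "how is the work distributed during a work day" for a single user.
public static class WorkDistribution
{
    public static IReadOnlyDictionary<int, int> ByHour<TEvent>(
        IEnumerable<EventEnvelope<TEvent>> events, Guid userId) =>
        events.Where(e => e.PerformedBy == userId)
              .GroupBy(e => e.OccurredAt.Hour)
              .ToDictionary(g => g.Key, g => g.Count());
}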

Rediscover

All of the above may be treated as simple aggregations/projections. On the other hand, they may reveal important trends in a system and can be used to get event-based insight into the business domain. Can you imagine the business being informed about a high probability of successfully cross-selling two or three products? That’s where a competitive advantage can be born.

Relaxed Optimistic Concurrency

TL;DR

When using the optimistic concurrency approach for entities that are updated frequently, some of the actions may fail because of conflicting version numbers. A proper modelling technique, distilling whether the business requirements can be loosened, may greatly increase the chances that commands issued against these entities succeed, improving the overall performance of an application and lowering the probability of errors.

Optimistic Concurrency

Optimistic concurrency is an approach for ensuring non-overlapping updates of a given entity. It’s supported by the majority of heavyweight ORMs and applied simply by adding a conditional WHERE clause at the end of the update. For example:

UPDATE Orders
SET version = @version + 1
    -- , more updated columns
WHERE id = @id AND version = @version

This approach ensures that if any other operation has updated the entity in the meantime, this update will fail. Additionally, if an ORM is capable of counting the rows that should have been updated, like NHibernate does, it can abort the transaction and throw an exception informing you that some of the planned operations failed.
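
Here is a hedged, ADO.NET-flavoured sketch of the same check an ORM performs: if the conditional UPDATE above touched zero rows, another writer won the race and the operation is aborted.

using System.Data;
using System.Data.Common;
using System.Threading.Tasks;

public static class OptimisticUpdate
{
    // cmd is assumed to carry the conditional UPDATE shown above, with @id and @version already bound
    public static async Task Execute(DbCommand cmd)
    {
        var affected = await cmd.ExecuteNonQueryAsync();
        if (affected == 0)
        {
            // nothing matched id + version: the entity was changed in the meantime
            throw new DBConcurrencyException("The entity was modified by another operation.");
        }
    }
}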

The optimistic concurrency approach is not unique to SQL. It’s popular in many NoSQL databases, Azure Table Storage for example. When updating an entity, its ETag is added as the If-Match header, ensuring that if the entity was modified after retrieval, the operation, again, will fail. See the Update Entity operation documentation.
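
For illustration, a conditional update sketched with the Azure.Data.Tables client (the table and property names are placeholders); under the hood the ETag captured at read time is sent as the If-Match header.

using System.Threading.Tasks;
using Azure.Data.Tables;

public static class ConditionalTableUpdate
{
    public static async Task MarkAsPaid(TableClient orders, string partitionKey, string rowKey)
    {
        // read the entity together with its current ETag
        var response = await orders.GetEntityAsync<TableEntity>(partitionKey, rowKey);
        var order = response.Value;
        order["Status"] = "Paid";

        // passing the ETag captured at read time makes the update conditional:
        // if anyone modified the entity in the meantime, the call fails (412 Precondition Failed)
        await orders.UpdateEntityAsync(order, order.ETag, TableUpdateMode.Replace);
    }
}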

Finally, when applying Domain Driven Design and operating on an Aggregate Root, this technique is the easiest way to ensure that the aggregate root is truly a transaction boundary. If the root has its version updated with every change to the aggregate, then two concurrent operations cannot both be executed: one will fail, preserving the root as a transaction boundary. This applies to aggregate roots whether you immerse them in Event Sourcing or in a regular ORM-mapped graph of entities. Just update the root with every operation and your aggregate will be fine.

As shown above, optimistic concurrency is a simple and powerful tool that, in a world of NoSQL and transactional-boundaries-got-right, may be the only one able to ensure the atomicity of operations.

Limitations

When using optimistic concurrency, the flow of applying a change is a bit different. Instead of just updating a property or a value, the following approach is taken:

  1. An aggregate is retrieved with its version
  2. If the state allows it, a command is executed
  3. The aggregate’s state is updated conditionally (if the version is unchanged)

Again, this ensures that the update is applied to the version that the business logic operated on, but it limits concurrent access.

For services using Event Sourcing, instead of retrieving the entity, all of its events are retrieved and the state of the aggregate is rebuilt. If snapshots are used, only the events with versions greater than the snapshot’s must be retrieved. If the snapshot is kept in an in-memory cache, then possibly no events will be retrieved at all, provided the snapshot’s version equals the number of the aggregate’s events so far. Events that result from a command are appended to the store conditionally. Depending on the storage, the condition can be the stream version when using EventStore’s AppendToStreamAsync, or an update of a root marker entity when using a custom relational store.
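
A minimal sketch of such a conditional append using the EventStore client’s AppendToStreamAsync mentioned above; the stream name, event type and JSON payload are placeholders.

using System;
using System.Text;
using System.Threading.Tasks;
using EventStore.ClientAPI;   // the (now legacy) TCP client exposing AppendToStreamAsync

public static class ConditionalAppend
{
    // expectedVersion is the version the aggregate was rebuilt from; if someone
    // else appended to the stream in the meantime, the call fails with a
    // WrongExpectedVersionException instead of silently overwriting their change.
    public static Task Append(IEventStoreConnection connection, string stream,
        long expectedVersion, string eventType, string jsonPayload) =>
        connection.AppendToStreamAsync(
            stream,
            expectedVersion,
            new EventData(Guid.NewGuid(), eventType, true,
                Encoding.UTF8.GetBytes(jsonPayload), Array.Empty<byte>()));
}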

An example

Let’s consider the example of a GitHub-like issue. Every issue has an option to lock it. This can be used, for instance, to lock an issue created by a troll (don’t feed the troll) and disallow adding more comments. For the sake of argument:

  1. let’s model all comments as a part of the issue aggregate (as always, there are many models that can be applied)
  2. optimistic concurrency is used for all commands.

A business requirement for locking an issue could look like:

when an issue is locked no user should be able to add more comments

It’s quite common that when seeing a requirement like this, developers don’t ask questions. It’s even more unfortunate that some companies require developers to just follow the analysis. Let’s try to relax this requirement a little bit by asking some questions:

  1. Is it required to lock the issue immediately?
  2. Could an issue be considered locked a short period of time (less than 1s) after locking it?
  3. Could we allow adding some comments during this period?

If the answers point towards there being no need for an immediate lock, there’s space to handle locking in a relaxed manner.

Relaxed Optimistic Concurrency

If an operation can have its preconditions relaxed and can be performed once some state has been reached, it can be executed with much less friction. In the previous example, the state in which a user can add a comment is a created issue. The precondition is a non-locked issue, but it’s OK to add a comment to a locked issue within some time boundaries. Consider the following flow:

  1. An aggregate is retrieved with its version
  2. If the state allows it, a command is executed
  3. The change is appended unconditionally (instead of updating the aggregate’s state conditionally on the version)

Depending on the storage and the applied design, it can be done in many ways.

When using Event Sourcing with EventStore, a special version representing any version can be passed to the appending method. This appends the events unconditionally. It means that the locking operation and adding a comment can be done in parallel without conflicts!
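
A sketch of that relaxed append: ExpectedVersion.Any plays the role of the “special version”, while the issue-{id} stream name and the CommentAdded event type are assumptions made for the example.

using System;
using System.Text;
using System.Threading.Tasks;
using EventStore.ClientAPI;

public static class RelaxedAppend
{
    // ExpectedVersion.Any is the "special version" mentioned above: the append
    // succeeds regardless of what has been written to the stream in the meantime,
    // so adding a comment no longer conflicts with the locking operation.
    public static Task AppendCommentAdded(IEventStoreConnection connection, Guid issueId, string json) =>
        connection.AppendToStreamAsync(
            $"issue-{issueId}",
            ExpectedVersion.Any,
            new EventData(Guid.NewGuid(), "CommentAdded", true,
                Encoding.UTF8.GetBytes(json), Array.Empty<byte>()));
}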

When using a relational database, the issue entity can be retrieved to check its state. Next, a comment entity can be added separately, without updating the version of the issue itself. Again, because adding a comment does not change the version, the friction on the aggregate is lowered.

Summing up

Don’t take requirements for granted; ask for the reasoning behind them. Try to relax requirements in areas which may suffer from high contention. The model is just a model. There are no true or false models, only ones which help you or ones which make your work harder. Choose wisely 🙂

Events on the Outside versus Events on the Inside

Recently I’ve been revisiting some of my Domain Driven Design, CQRS & Event Sourcing knowledge and techniques. I’ve supported the creation of systems with these approaches, hence I could revisit the experiences I had as well. If you are not familiar with these topics, a good starter could be my Feed Your Head list.

Inside

So you model your domain with aggregates in mind, distilling contexts and domains. The separation between services may be clear or a bit blurry, but it looks OK and, more importantly, maps the business well. Inside a single context bubble, you can use your aggregates’ events to create views and use those views when you need data for a command execution. It doesn’t matter which database you use for storing events. It’s simple: restore the state of an aggregate, gather some data from views, execute a command. If any events are emitted, just store them. A background worker will pick them up and dispatch them to a Process Manager.
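
A rough sketch of that in-context flow, written against assumed abstractions rather than any particular framework; the Order, IEventStream and IStockView names are purely illustrative.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumed abstractions, purely for illustration.
public interface IEventStream
{
    Task<IReadOnlyList<object>> Read(Guid id);
    Task Append(Guid id, IReadOnlyList<object> events);
}

public interface IStockView
{
    Task<bool> IsInStock(Guid orderId);   // a view built from this context's own events
}

public sealed class ShipOrderHandler
{
    private readonly IEventStream stream;
    private readonly IStockView stockView;

    public ShipOrderHandler(IEventStream stream, IStockView stockView)
    {
        this.stream = stream;
        this.stockView = stockView;
    }

    public async Task Handle(Guid orderId)
    {
        var order = Order.Rebuild(await stream.Read(orderId));   // restore the aggregate's state
        var inStock = await stockView.IsInStock(orderId);        // gather some data from a view
        var emitted = order.Ship(inStock);                       // execute the command
        await stream.Append(orderId, emitted);                   // just store the emitted events;
                                                                 // a background worker dispatches them
    }
}

public sealed class Order
{
    public static Order Rebuild(IReadOnlyList<object> events) => new Order();   // fold events into state

    public IReadOnlyList<object> Ship(bool inStock) =>
        inStock ? new object[] { "OrderShipped" } : Array.Empty<object>();       // emitted events
}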

Outside

What about exposing your events to other modules? Can another module react to an event, and if so, how? Should it be able to build its own view from the data held in the event? All of these could be summed up in one question: do the external events match the internals of a specific module? My answer would be: it’s not easy to tell.

In some systems this may be good. By the system I mean not only a product, but also a team. Sometimes having a feed of events can be liberating and enable faster growth by speeding up the initial shaping. You could agree to actually separate services from the very start and verify during design whether the logical complexity is still low, i.e. whether there aren’t too many events shared between services and what they contain.

This approach brings some problems as well. All the events become your API. They are public, so now they have to be taken into consideration when versioning your schemas. Probably some migration guide will be needed as well. The bigger the public API, the bigger the friction of maintaining it for its consumers.

That said, you could consider having a smaller, totally separate set of events you want to share with external systems. This draws a visible line between the Inside & the Outside of your service, enabling you to evolve rapidly on the Inside. Maintaining a stable API is much easier then, and the system itself gains a separation. This addresses questions about views as well: where should they originally be stored? The answer would be to store properly versioned, immutable views Inside the service, using identifiers to pass references to another service. When needed, the consumer can copy & transform the data locally. A separate set of events also gives you the option of not using Event Sourcing where it isn’t needed. That kind of option (you may, but don’t have to, use it) is always good.
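
To illustrate the split, here is a small sketch of an internal event next to a slimmed-down, versioned public contract; all the names are assumptions, not an established schema.

using System;

// Illustration of keeping a small, separate set of public events: the internal
// event carries everything the Inside needs, while the published contract is
// slimmed down to identifiers that consumers can follow.
public sealed record OrderPlacedInternal(
    Guid OrderId, Guid CustomerId, decimal Total, string[] Lines, string PromoCode);

public sealed record OrderPlacedPublicV1(Guid OrderId, Guid CustomerId);   // versioned public contract

public static class PublicEventMapper
{
    public static OrderPlacedPublicV1 ToPublic(OrderPlacedInternal e) =>
        new(e.OrderId, e.CustomerId);   // consumers follow the ids and copy/transform data locally
}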

For some time I was an advocate of sharing events between services freely, but now I’d say: make the right choice for your scenario. Consider the pros and cons, especially in terms of the schema maintenance tax & the option of not sticking to Event Sourcing.

Inspirations

The process of revisiting my assumptions was started by a few materials. One of them is a presentation by Andreas Ohlund, ‘Putting your events on a diet’, sharing a story about deconstructing an online shop into services. The second is some bits from ‘A Decade of DDD, CQRS, Event Sourcing’ by Greg Young. Last but not least, Pat Helland’s ‘Data on the Outside versus Data on the Inside’.