Why persistent memory will change your world

TL;DR

If you haven’t heard, non-volatile RAM is coming to town and will surely change the persistence patterns of databases, queues and loggers. Want to know more about this new wave of hardware? Read on.

API

The first and most important aspect is that persistent memory on Windows reuses already existing APIs. If you want to use the drive just as block storage, you can: you’ll be able to create files, write to them and so on. There’s another, much faster way of using it, called DAX.

Direct access enables using the non-volatile memory directly. What do I mean by directly? I mean accessing the memory with a raw pointer. How do you obtain a pointer? The old-fashioned memory-mapped file API is used. First, create a file, then map it, and here it is! No FlushFileBuffers, no fsync. Just a raw pointer to the memory. Can you imagine writing to a mapped file and just having it persisted?
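
Below is a minimal sketch of that idea in C#, using the standard MemoryMappedFile API (which wraps the classic CreateFileMapping/MapViewOfFile calls). The file path is hypothetical and the code assumes it lives on a DAX-mounted volume; the point is that, on such a volume, the pointer write itself is the persistence.

using System.IO;
using System.IO.MemoryMappedFiles;

class DaxSketch
{
    // requires compiling with /unsafe
    static unsafe void Main()
    {
        // create (or open) a file on a DAX-mounted volume and map it into memory
        using (var file = MemoryMappedFile.CreateFromFile(
            @"D:\pmem\data.bin", FileMode.OpenOrCreate, null, 1024 * 1024))
        using (var accessor = file.CreateViewAccessor())
        {
            byte* ptr = null;
            accessor.SafeMemoryMappedViewHandle.AcquirePointer(ref ptr);
            try
            {
                // on a DAX volume this store goes straight to the persistent memory:
                // no FlushFileBuffers, no fsync
                ptr[0] = 42;
            }
            finally
            {
                accessor.SafeMemoryMappedViewHandle.ReleasePointer();
            }
        }
    }
}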

Speed

The non-volatile memory is really fast. You can write 4GB per second. Yes, it’s 4GB per second of persistent memory. The latency is extremely low. It’s so low that using any form of asynchronous programming (raw completion ports, async-await) brings more havoc than simply waiting for the memory to be written. Yes, this means that your methods hitting files memory-mapped with DAX will not need async signatures. Of course, you’ll be able to preserve them just for compatibility.

Ordering matters

It looks like tech heaven. No more data loss during power outages, right? It’s not entirely true. The persistent memory acts as memory. There is an order in which data is transported there. Imagine now writing the following string:

BLAH

If power went down after copying the first three letters, you’d be left with

BLA

which, although it shows the same attitude, is not what we wish for when thinking about persistence. This example shows that good old-fashioned IO access patterns will still be important, like write-ahead logging or copy-on-write. But let me remind you again: they will be free, with no synchronization and no flushing required.
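
Here’s a minimal sketch of that ordering discipline, assuming a raw pointer obtained from a DAX-mapped file as above: the payload is written first and the length marker is published last, so a torn write leaves the record invisible. It’s deliberately simplified; real code would also have to care about cache-line flushes and CPU store ordering.

static unsafe void AppendRecord(byte* log, ref int tail, byte[] payload)
{
    // 1. write the data first, after the slot reserved for the length
    for (var i = 0; i < payload.Length; i++)
        log[tail + sizeof(int) + i] = payload[i];

    // 2. make sure the data stores are not reordered past the marker
    System.Threading.Thread.MemoryBarrier();

    // 3. publish the length marker last; until it is written, the record does not exist
    *(int*)(log + tail) = payload.Length;
    tail += sizeof(int) + payload.Length;
}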

Adopters

Persistent memory will change the world. SQL Server 2016 has already adopted it, as you can see here. Some databases are already there, like LMDB, which uses memory-mapped files (same API for the win!) with an option to run as non-durable. Guess what. Now it’s durable. More databases will follow.

Summary

Persistent memory is here. You probably won’t rewrite or rethink your application, as the majority of apps do not deal with IO directly, but applying it to your database or other IO-bound systems will be a real game changer.

Snowy identifiers

TL;DR

When using the snowflake entities pattern, it’s quite easy to forget about the external identifiers we need to communicate with external systems. This post provides an easy way to address this concern.

Identity revisited

The identifier of a snowflake entity was presented as a GUID. We use an artificial, non-colliding, client-generated identifier to ensure that any part of the system can generate one without checking whether a specific value has been used before. This enables storing different pieces of data, belonging to different contexts, in different services of our system. No system lives in a vacuum though, and sometimes it requires communication with the rest of the world.

Gate away!

A common aspect handled by an external system is payments. When you consider credit cards, native bank applications, PayPal, Bitcoin and all the rest, providing that kind of service on your own is not a reasonable option. That’s why external services are used – the cost of using one is much lower than the cost of delivering one. Let’s stick to the payments example. How would you approach this? Would you call the external payment service from each of your services? I hope you wouldn’t. A better approach is to create a gateway that will act as a translator between your system and the external one.

How many ids do I need?

Using a gateway provides a really interesting property. As the payment gateway is a part of your system, it can use the snowflake identifier. In other words, if there’s an order, it’s OK (under given circumstances) to use its identifier as the identifier of the payment as well, provided, of course, that you want to model these two as parts of a snowflake entity spanning services. It would be the payment gateway’s responsibility to correlate the system’s snowflake identifier with the external system’s id (an integer, some string, whatever). This creates a coherent view of an entity within your system boundaries, closing the mapping in a small, dedicated area of the payment gateway.
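
A small sketch of that responsibility with hypothetical names (IExternalPaymentProvider and PaymentGateway are illustrative, not a prescribed API): the order’s snowflake id doubles as the payment id inside the system, and only the gateway ever sees the external identifier.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public interface IExternalPaymentProvider
{
    // returns the identifier assigned by the external payment service
    Task<string> ChargeAsync(Guid orderId, decimal amount);
}

public class PaymentGateway
{
    readonly IExternalPaymentProvider provider;
    readonly Dictionary<Guid, string> externalIds = new Dictionary<Guid, string>();

    public PaymentGateway(IExternalPaymentProvider provider)
    {
        this.provider = provider;
    }

    public async Task PayAsync(Guid orderId, decimal amount)
    {
        // the snowflake id of the order is also the payment id within the system
        var externalId = await provider.ChargeAsync(orderId, amount);

        // the mapping to the external id never leaves the gateway
        externalIds[orderId] = externalId;
    }
}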

An integration with an external system, closed in a small component, leaving the rest of your system agnostic of it? Do we need more?

Summary

As you can see, closing the external dependency behind a gateway provides value not only by separating the interface of the external provider from your system components, but also by preserving a coherent (though distributed) view of your entities.

Snowflake entities

TL;DR

Much too often we fall for the fallacy of grasping it all the next time. We say that now, after absorbing some knowledge, we will KNOW it all and design the system THE RIGHT way. Unfortunately, even using good methods like the strangler pattern, it’s simply impossible to come up with a design that covers the WHOLE domain of a company. Are there any design or architectural patterns that can be helpful? Is there a way to make the uncertainty play by our rules?

Identity

The most important part of any design is identity. I don’t mean identity from the identity management point of view. I mean the identity that makes entities and aggregate roots distinct, the good old-fashioned Id. It’s quite common to find entities’ properties pointing to another context/domain. A User will have an employee’s identifier, a car in an insurance company will be referenced in accounting and other contexts. Basically, it will spread its carId across the whole company. Have you ever encountered ThisExternalId in your system? I bet you have. It’s time to end this.

One thing, multiple views

It’s not a car that is referenced in the accounting. It’s not a user that is referencing an employee. It’s the same entity spread across different contexts. Let me give you an example.

The same entity, depending on the context, can have a different meaning. A car in an insurance company will be seen in different dimensions. In some contexts it won’t occur at all (mortgage insurance), in others it will be present. What’s the MAIN context? Can one tell what it is? Again, I bet no one can.

It’s time to end these referential wars. A car is not referenced by this or that context. None of them requires a car identifier as a foreign key. Each of them requires an id, which can have the SAME VALUE in different contexts. Why? Because the same thing will be understood differently in different contexts! When signing an insurance policy, it will be one thing. When paying for an accident, it will be another, but after all, it’s the same car.

The same thing can be seen in many contexts. In each of them it will have a separate, unique set of properties, but it will share one and only one property – the identity. The identity will probably be artificial and uniquely generated (you don’t want to have duplicates, do you?). A perfect match for that is a GUID or UUID.

Snowflakes

When asking what a car is, you could imagine asking all the contexts for the same identifier. Getting the insurance info

GET /insurances/38e5c55b-1b44-4bdc-bd9e-632580736f22

Getting the mailing info

GET /mailing/38e5c55b-1b44-4bdc-bd9e-632580736f22

And so on and so forth. Getting an empty response from a context means nothing more than that the entity does not exist in that specific context. You can see this as a snowflake, which consists of the center, which is the identifier, and arms, which are the responses from different contexts. None of the contexts creates the entity on its own, but every single one contains meaningful information about the snowflake.
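
A sketch of what asking all the contexts could look like, assuming hypothetical endpoints following the convention above (the HttpClient is expected to have its BaseAddress configured): each context is queried with the same identifier, and a 404 simply means the snowflake has no arm in that context.

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class Snowflake
{
    static readonly string[] Contexts = { "insurances", "mailing", "accounting" };

    public static async Task<IReadOnlyList<string>> GetAsync(HttpClient client, Guid id)
    {
        var arms = new List<string>();
        foreach (var context in Contexts)
        {
            var response = await client.GetAsync($"{context}/{id}");
            if (response.StatusCode == HttpStatusCode.NotFound)
                continue; // no arm in this context

            arms.Add(await response.Content.ReadAsStringAsync());
        }
        return arms;
    }
}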

Further modelling

It’s quite easy to follow this rule and imagine adding another context to a system like this. You don’t have to add foreign keys or make changes to other components. You just map the new context with the same bluntly simple rule: share just the identity. After all, if another context wants to handle this entity in some way, it will just use the single meaningful property of the entity, the identity of the snowflake.

I hope that this article shed some light on this modelling technique and showed it to be a solid and extensible approach towards modelling your domain.

Snapshot time!

It’s snapshot time! There’s been a lot of event sourcing content so far. Let’s do a recap!

Below you will find a short summary of the event sourcing related articles that I have published here so far. Treat it as a table of contents, a lookup, or a pattern collection. It’s ordered by date, from the newest to the oldest. Enjoy!

  1. Why did it happen – how to make your event sourced system even easier to reason about
  2. Event sourcing and interim stream – how to embrace new modelling techniques with short living streams
  3. Multitenant Event Sourcing with Azure – how to design a multitenant event sourced system using Azure Storage Services
  4. Rediscover your domain with Event Sourcing – how to use your events and astonish your business with meaningful insights
  5. Event Sourcing for DBAs – a short introduction for any relational person into the amazing world of event sourcing. Can be used as an argument during a conversation.
  6. Enriching your events – what is event metadata and why should we care? How to select the most important pieces
  7. Aggregate, an idempotent receiver – how to receive a command or dispatch an event exactly once?
  8. Process managers – what is a process manager, how can you simplify it?
  9. Optimizing queries – how to make queries efficient, especially when dealing with multiple versions of the same application running in parallel
  10. Event sourcing and failure handling – an exception is thrown. Is it an event or not? How to deal with it and how to model it?
  11. Embracing domain leads towards event oriented design – how event oriented design emerges from understanding of a domain

Why did it happen?

TL;DR

Do you know that feeling of being powerless? Of not being able to tell why your system acted in a specific way? Of not being able to recognize whether it’s a hacker or your system malfunctioning? Event sourcing, by storing all the events that happened in your system, helps a lot. Still, you can improve it and provide much better answers to ‘why did it happen?’.

Why?

The question ‘why did it happen?’ reminds me of one case from my career. It was a few years back, when I was a great fan of AutoMapper. It’s a good tool, but as with every tool, one should use it wisely (if you have only a hammer …). I think I stretched it a little too much and landed at a point where nobody was able to tell where a mapping came from. It took only 3 days to provide an extension method Why that showed which mapping would be applied for a specific object.

I’d say that being able to answer these ‘whys’ within a reasonable time frame is vital for any project. And I don’t mean only failures. When introducing a junior to your team, being able to show how and why things work is important as well.

Event sourcing

Event sourcing helps a lot. It simply stores every business delta, every single change of your domain objects. You can query these changes at any time. You can improve your queries even more by enriching events with some metadata (which I covered here recently). There’s a case, though, that you should consider before telling whether the reasoning is always that easy.

Dispatching an event

Some of the actions on your aggregates are results of dispatching an event. Something happened and another part of your system turns that into an action. For instance, consider the following

When OrderFinished then AddBonusPoints

Some bonus points are added to an account whenever an order is finished. When looking at an account’s history, you’ll see a lot of BonusPointsAdded events. Yes, you could introduce a lot of events like BonusPointsAddedBecauseOfOrderFinished, but this just leaks the process into your account aggregate. If you don’t do it, can you answer the following question?

Why was BonusPointsAdded appended?

Because somebody added points? Yes, but WHY? It looks like the reason is disconnected.

Point back

What if some metadata were added to every event that is a result of another event? In this case, what if the following metadata were added to this specific BonusPointsAdded

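A sketch of what this metadata could look like (the key names are my assumption, not a prescribed format): the event simply carries a pointer back to the event that caused it.

using System.Collections.Generic;

// attached to the BonusPointsAdded event when it is appended
var metadata = new Dictionary<string, string>
{
    // points back to the OrderFinished event that triggered adding the points
    ["causationStream"] = "order-42",
    ["causationEventNumber"] = "17"
};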

Now, when somebody asks why did it happen, you can easily point to the original event. If that event was itself a result of some process, of dispatching another event, you can follow the link again and again until you find the original event that was created because of a user action.

Summary

Links are a powerful tool. If you use them with event sourcing, you get a history of your system that’s easy to navigate, follow and reason about.

Dependency rejection

TL;DR

These values must be updated synchronously, we need referential integrity, we need to make all these service calls together: sentences like these can unfortunately be heard in discussions about systems. This post provides a pattern for addressing these remarks.

The reality

As Jonas Boner says in his presentation Knock, knock. Who’s there? Reality. We really can’t afford to have one database, one model to fit it all. It’s impossible. We process too much, too fast, with too many components to do it all in one fast call. Not to mention transactions. This is reality, and if you want to deny reality, good luck with that.

The pattern

The pattern for addressing these remarks is simple. You need to say no to this absurdity. NoNoNo. This no can be supported with many arguments:

  • independent components should be independent, right? Like in-de-pen-dent. They can’t stop working when others stop. This dependency just can’t be taken.
  • does your business truly ensure that everything is prepared up front? What if somebody buys a computer from you? Would you rather say I’ll get it delivered in one week, or first double-check with all the components whether it’s in stock, or maybe somebody returned it, or maybe call the producer with some speech synthesizer? For sure it’s not an option. This dependency just can’t be taken.

You could come up with many more arguments, which could be summarized simply as a new pattern in town, The Dependency Rejection.

Next time your DBA/architect/dev friend/tester dreams about this shiny world of total consistency and always-available services, remind them of this and simply reject the dependency on this unrealistic idea.


Relaxed Optimistic Concurrency

TL;DR

When using the optimistic concurrency approach for entities that are updated frequently, some of the actions may fail because of conflicting version numbers. A proper modelling technique, distilling whether business requirements can be loosened, may greatly increase the chances of commands issued against these entities succeeding, improving the overall performance of an application and lowering the probability of errors.

Optimistic Concurrency

Optimistic concurrency is an approach for ensuring non-overlapping updates over a given entity. It’s supported by the majority of heavyweight ORMs and applied simply by adding a conditional WHERE at the end of the update. For example:

UPDATE Orders
SET version = @version + 1
    -- , more updated columns
WHERE id = @id AND version = @version

This approach ensures that if any other operation updated the entity in the meantime, this update will fail. Additionally, if an ORM is capable of counting the rows that should have been updated, as NHibernate does, it can abort the transaction and throw an exception informing you that some of the planned operations failed.
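
A sketch of what happens under the hood, in plain ADO.NET (the surrounding SqlConnection and the id/version variables are assumed to exist): if the conditional UPDATE touched no rows, somebody else got there first.

using (var command = connection.CreateCommand())
{
    command.CommandText = @"UPDATE Orders
                            SET version = @version + 1
                            WHERE id = @id AND version = @version";
    command.Parameters.AddWithValue("@id", id);
    command.Parameters.AddWithValue("@version", version);

    if (command.ExecuteNonQuery() == 0)
    {
        // zero rows affected: the version changed in the meantime
        throw new System.Data.DBConcurrencyException("Orders row was updated concurrently.");
    }
}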

The optimistic concurrency approach is not a SQL-only technique. It’s popular in many NoSQL databases, Azure Table Storage for example. When updating an entity, its ETag is passed in the If-Match header, ensuring that if the entity was modified after it was retrieved, the operation, again, will fail. See the Update operation documentation here.
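
The same idea expressed as a sketch over raw HTTP against a hypothetical REST resource (the issues endpoint is illustrative): the ETag captured on read is sent back as If-Match, and a 412 Precondition Failed means the entity changed in between.

public static async Task<bool> TryLockAsync(HttpClient client, Guid id)
{
    var get = await client.GetAsync($"issues/{id}");
    var etag = get.Headers.ETag;

    var put = new HttpRequestMessage(HttpMethod.Put, $"issues/{id}")
    {
        Content = new StringContent("{ \"locked\": true }")
    };
    put.Headers.IfMatch.Add(etag);

    var response = await client.SendAsync(put);

    // 412 Precondition Failed: somebody modified the issue after we read it
    return response.StatusCode != HttpStatusCode.PreconditionFailed;
}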

Finally, when applying Domain Driven Design and operating on an aggregate root, this technique is the easiest way to ensure that the aggregate root is truly a transaction boundary. If the root has its version updated with every change of the aggregate, then two concurrent operations cannot both succeed and one will fail, still preserving the root as a transaction boundary. This applies to aggregate roots no matter whether you immerse them in Event Sourcing or in a regular ORM-mapped graph of entities. Just update the root with every operation and your aggregate will be just fine.

As shown above, optimistic concurrency is a simple and powerful tool that, in a world of NoSQL and transactional-boundaries-got-right, may be the only one to ensure atomicity of operations.

Limitations

When using optimistic concurrency, the flow of applying a change is a bit different. Instead of just updating a property or a value, the following approach is taken:

  1. An aggregate is retrieved with its version
  2. If the state allows it, a command is executed
  3. The aggregate’s state is updated conditionally (if the version is unchanged)

Again, this ensures that the update is applied to the version that the business logic operated on, but it limits concurrent access.
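
A minimal sketch of this flow with a hypothetical repository (IIssueRepository and the Issue aggregate are illustrative, not a specific library): the version read in step 1 is passed back in step 3 as the expected one.

public interface IIssueRepository
{
    Task<(Issue issue, int version)> GetAsync(Guid id);

    // returns false when the stored version no longer matches expectedVersion
    Task<bool> TrySaveAsync(Issue issue, int expectedVersion);
}

public static class IssueHandler
{
    public static async Task LockIssueAsync(IIssueRepository repository, Guid id)
    {
        var (issue, version) = await repository.GetAsync(id);   // 1. retrieve with the version

        issue.Lock();                                            // 2. execute the command

        if (!await repository.TrySaveAsync(issue, version))      // 3. conditional update
            throw new InvalidOperationException("The issue was modified concurrently.");
    }
}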

For services using Event Sourcing, instead of retrieving the entity, all of its events are retrieved and the state of the aggregate is rebuilt. If snapshots are used, only events with versions greater than the snapshot’s must be retrieved. If the snapshot is kept in an in-memory cache, then possibly no events will be retrieved at all, provided the snapshot’s version is equal to the number of the aggregate’s events so far. Events that are a result of a command are appended to the store conditionally. Depending on the storage, the condition can be the stream version when using EventStore’s AppendToStreamAsync, or an update of a root markup entity when using a custom relational store.

An example

Let’s consider the example of a GitHub-like issue. Every issue has an option of locking it. It can be used, for instance, to lock an issue created by a troll (you don’t feed the troll) and disallow adding more comments. For the sake of argument:

  1. let’s model all comments as a part of the issue aggregate (as always, there are many models that can be applied)
  2. optimistic concurrency is used for all commands.

A business requirement for locking an issue could look like:

when an issue is locked no user should be able to add more comments

It’s quite common that when seeing a requirement like this, developers don’t ask questions. It’s even more unfortunate that some companies require them to just follow the analysis. Let’s try to relax this requirement a little bit by asking some questions:

  1. Is it required to lock the issue immediately?
  2. Could an issue be considered locked a short period of time (less than 1s) after locking it?
  3. Could we allow adding some comments during this period?

If the answers point towards no need for an immediate lock, there’s space to handle locking in a relaxed manner.

Relaxed Optimistic Concurrency

If an operation can have its preconditions relaxed and can be performed after some state has been reached, it can be executed with much less friction. In the previous example, the state in which a user can add a comment is a created issue. The precondition is a non-locked issue, but it’s OK to add a comment to a locked issue within some time boundaries. Consider the following flow:

  1. An aggregate is retrieved with its version
  2. If the state allows it, a command is executed
  3. The aggregate’s change is appended unconditionally (instead of a conditional update on the version)

Depending on the storage and the applied design, it can be done in many ways.

When using Event Sourcing with EventStore, a special version representing any version can be passed to the appending method. This appends the events unconditionally. It means that a locking operation and adding a comment can be done in parallel, without conflicts!
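
A sketch of the difference, assuming the classic EventStore .NET client (EventStore.ClientAPI); the connection, the stream name and the EventData instances are illustrative:

// regular optimistic append: fails if the stream has moved past expectedVersion
await connection.AppendToStreamAsync("issue-42", expectedVersion, issueLocked);

// relaxed append for low-risk events: ExpectedVersion.Any skips the version check,
// so adding a comment does not conflict with locking the issue
await connection.AppendToStreamAsync("issue-42", ExpectedVersion.Any, commentAdded);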

When using a relational database, the issue entity can be retrieved to check its state. Next, a comment entity can be added separately, without updating the version of the issue itself. Again, because adding a comment does not bump the version, the friction on the aggregate is lowered.
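
The relational variant, sketched as a method on a hypothetical service (the issues and comments repositories and the one second grace period are illustrative): the issue is only read to check its state, and the comment row is inserted on its own, leaving the issue’s version column untouched.

public async Task AddCommentAsync(Guid issueId, string author, string text)
{
    // read-only check of the issue's state; its version is not bumped
    var issue = await issues.GetAsync(issueId);

    // the lock is allowed to take effect within a short, agreed time window
    if (issue.IsLocked && issue.LockedAt < DateTime.UtcNow.AddSeconds(-1))
        throw new InvalidOperationException("The issue has already been locked.");

    // the comment lives in its own row, so there is no conflict with locking the issue
    await comments.InsertAsync(new Comment { IssueId = issueId, Author = author, Text = text });
}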

Summing up

Don’t take requirements for granted, but rather ask for the reasoning behind them. Try to relax requirements in areas which may suffer from high contention. The model is just a model. There are no true or false models, only those which help you or make your work harder. Choose wisely 🙂