Concurrent conditional deletes

TL;DR

Sometimes the types in System.Collections.Concurrent don't provide enough methods to squeeze top performance out of them. Below you'll find a small extension method that will make your life easier in some cases.

Concurrent dictionary

The concurrent dictionary provides a lot of useful methods. You can TryAdd or AddOrUpdate. Additionally, there is TryUpdate, a very nice method for ensuring optimistic concurrency. Let's take a look at its signature:

public bool TryUpdate(TKey key, TValue newValue, TValue comparisonValue)

It replaces the value under a specific key with newValue, but only if the previous value is equal to comparisonValue. It's an extremely powerful API. If you create a new object for a specific key in order to update it, this lets you swap that object in without locks, with a single method call. Of course, if the method fails, it returns false and it's up to you to retry.
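As a quick illustration (the "visits" counter is made up for the example), an optimistic update loop built on TryUpdate might look like this:

```csharp
using System.Collections.Concurrent;

var counters = new ConcurrentDictionary<string, int>();
counters.TryAdd("visits", 0);

// Optimistic update: read the current value, compute a new one and
// try to swap it in; if another thread won the race, read again and retry.
while (true)
{
    var current = counters["visits"];
    if (counters.TryUpdate("visits", current + 1, current))
        break; // nobody modified the entry between our read and the update
}
```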

What about deletes? What if I wanted to remove an entry only if nobody changed its value in the meantime? What if I wanted optimistic concurrency for deletes as well? Is there a method for it? Unfortunately, no. The only method for removal is

public bool TryRemove(TKey key, out TValue value)

which removes the value unconditionally, returning it. This breaks optimistic concurrency, as we can't ensure that the removed entry wasn't modified in the meantime. What can be done to make removal conditional?

Explicit interfaces

The ConcurrentDictionary class implements a lot of interfaces; one of them is

ICollection<KeyValuePair<TKey, TValue>>

This interface has one particular method that removes a key/value pair:

ICollection<KeyValuePair<TKey, TValue>>.Remove(KeyValuePair<TKey, TValue> kvp)

If you take a look at the implementation, it uses a private method of the dictionary that removes the key only if the current value is equal to the value of the passed pair. Now we can write a simple extension method providing a conditional, optimistically concurrent removal of a key:

using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ConcurrentDictionaryExtensions
{
    public static bool TryRemoveConditionally<TKey, TValue>(
        this ConcurrentDictionary<TKey, TValue> dictionary, TKey key, TValue previousValueToCompare)
    {
        // ICollection.Remove removes the key only if the current value
        // is equal to the value of the passed pair.
        var collection = (ICollection<KeyValuePair<TKey, TValue>>)dictionary;
        var toRemove = new KeyValuePair<TKey, TValue>(key, previousValueToCompare);
        return collection.Remove(toRemove);
    }
}

which closes the gap and makes the API of the concurrent dictionary support all the operations under optimistic concurrency.
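A usage sketch (the cache and the Session type are made up for illustration). Note that the pair-based Remove compares values with EqualityComparer&lt;TValue&gt;.Default, so for reference types this means reference equality unless Equals is overridden:

```csharp
using System;
using System.Collections.Concurrent;

public sealed class Session
{
    public DateTime ExpiresAt { get; init; }
    public bool Expired => DateTime.UtcNow >= ExpiresAt;
}

public static class Cleaner
{
    public static void Sweep(ConcurrentDictionary<string, Session> cache)
    {
        foreach (var pair in cache)
        {
            if (pair.Value.Expired)
            {
                // Removes the entry only if it still holds the very session
                // we inspected; a concurrent refresh makes this a no-op.
                cache.TryRemoveConditionally(pair.Key, pair.Value);
            }
        }
    }
}
```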

Summary

With this simple tweak you can use a concurrent dictionary as a collection that fully supports optimistic concurrency.

Rediscover your domain with Event Sourcing

TL;DR

Besides the common advantages of event sourcing, like auditing, projections and sticking closely to the domain, you can use events to rediscover your domain and provide meaningful insights to your business.

Include.Metadata

I’ve already described the idea of enriching your events. This is the main enabler for analyzing your events in various ways. The basic metadata one could add are:

  • date
  • action performer
  • on whose behalf the action is taken

You could also add the screen of your app, the IP address and many, many more.
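As a sketch, such metadata could travel in an envelope wrapped around every event (the type and property names below are hypothetical, not a prescribed schema):

```csharp
using System;

// A hypothetical envelope carrying the metadata listed above
// alongside the business event itself.
public sealed class EventEnvelope
{
    public DateTimeOffset OccurredAt { get; init; }   // date
    public string PerformedBy { get; init; }          // action performer
    public string OnBehalfOf { get; init; }           // on whose behalf
    public string SourceScreen { get; init; }         // optional: app screen
    public string IpAddress { get; init; }            // optional: client IP
    public object Payload { get; init; }              // the business event
}
```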

Reason

Having this additional data, it’s quite easy to aggregate all the events of a specific user. This, with the time attached, could provide various insights:

  • how the work is distributed during a work day
  • how big is the area of business handled by a single user
  • is the user’s behavior pattern the same all the time, or maybe somebody has taken over this account?

The same with projection by event type:

  • is it a frequent business event
  • are these events clustered in time – maybe two events are really one event

Or looking at a mixed projection finding sequences of events for a user that might indicate:

  • an opportunity for remodeling your implementation
  • hot spots in the application.
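Aggregations like the ones above can be simple LINQ queries over the stored events. A sketch, assuming a hypothetical StoredEvent shape with the metadata fields discussed earlier:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// hypothetical shape of a stored event together with its metadata
public sealed record StoredEvent(string Type, string PerformedBy, DateTimeOffset OccurredAt);

public static class Insights
{
    // How is a single user's work distributed over the day?
    public static IEnumerable<(int Hour, int Count)> WorkDistribution(
        IEnumerable<StoredEvent> events, string user) =>
        events.Where(e => e.PerformedBy == user)
              .GroupBy(e => e.OccurredAt.Hour)
              .OrderBy(g => g.Key)
              .Select(g => (g.Key, g.Count()));

    // How frequent is a given business event?
    public static int Frequency(IEnumerable<StoredEvent> events, string type) =>
        events.Count(e => e.Type == type);
}
```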

Rediscover


All of the above may be treated as simple aggregations/projections. On the other hand, they may reveal important trends in a system and might be used to get an event-based insight into the business domain. Can you imagine the business being informed about a high probability of successfully cross-selling two or three products? That’s where a competitive advantage can be born.


Event sourcing for DBAs

TL;DR

Are you in a team trying to convince a database administrator to use event sourcing? Are you a DBA that is being convinced? Or maybe it’s you against a person that does not want to change their relational point of view. In any case, read on.

Drop. Every. Single. Table

That’s the argument you should start with. You can drop every single table and your database will be just fine. Why is that? Because there’s a recovery option. If you don’t run your database with some silly settings, you can just recover all the tables with all the data. Alright, but where do I recover from? What’s the source of truth if I don’t have tables?

Log

The tables are just a cache. All the data is stored in the transaction log file. Yes, the file that is always too big. If all the data is in there, can it be queried somehow?

SELECT * FROM fn_dblog(null, null)

That’s the query using an undocumented SQL Server function. It simply reads a database log file and displays it as a nicely formatted table like this:

[screenshot: fn_dblog output table]

Take a look at the operation column now. What can you see?

  • LOP_INSERT_ROWS
  • LOP_MODIFY_ROW
  • LOP_MODIFY_COLUMNS
  • LOP_COMMIT_XACT
  • LOP_BEGIN_XACT
  • LOP_COMMIT_XACT

What are these? That’s right: every single change that has been applied to this database so far. If you spent enough time with this data, you’d be able to decode the payloads of the changes applied to the database in every single entry. Having said that, it looks like the database itself is built on an append-only store that saves all the changes done to the db.

Event Sourcing

Event sourcing does exactly the same. Instead of using database terms and names of operations, it incorporates the business language, so you won’t find LOP_MODIFY_COLUMNS with a payload that needs to be parsed, but rather an event with an explicit, business-related name appended to a store. Or to a log, if you want to call it that. Of course there’s a cost of making tables out of it once again, but it pays back by pushing the business understanding and meaning much deeper into the system and bringing the business closer to the system.
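To make the analogy concrete, here’s a sketch of appending a business-named event instead of an opaque LOP_MODIFY_ROW payload. The store interface and the event name are hypothetical, not a specific product’s API:

```csharp
using System;

// an explicit, business-related name instead of LOP_MODIFY_COLUMNS
public sealed record InvoicePaid(Guid InvoiceId, decimal Amount, DateTimeOffset PaidAt);

// a minimal, hypothetical event store contract
public interface IEventStore
{
    // append-only, just like the transaction log
    void Append(string stream, object @event);
}

public static class Example
{
    public static void Pay(IEventStore store, Guid invoiceId, decimal amount) =>
        store.Append($"invoice-{invoiceId}",
            new InvoicePaid(invoiceId, amount, DateTimeOffset.UtcNow));
}
```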

In the end these tables will be needed anyway, to query something. They just won’t be treated as the source of truth; they are only a cache, right? The truth is in the log.

Dependency rejection

TL;DR

“These values must be updated synchronously”, “we need referential integrity”, “we need to make all these service calls together” – these are sentences that unfortunately can still be heard in discussions about systems. This post provides a pattern for addressing such remarks.

The reality

As Jonas Boner says in his presentation: Knock, knock. Who’s there? Reality. We really can’t afford to have one database, one model to fit it all. It’s impossible. We process too much, too fast, with too many components to make it all in one fast call. Not to mention transactions. This is reality, and if you want to deny reality, good luck with that.

The pattern

The pattern to address these remarks is simple. You need to say no to this absurdity. No. No. No. This no can be supported with many arguments:

  • independent components should be independent, right? Like in-de-pen-dent. They can’t stop working when others stop. This dependency just can’t be taken.
  • does your business truly ensure that everything is prepared up front? What if somebody buys a computer from you? Would you rather say ‘I’ll get it delivered in one week’, or first double-check all the components to see whether they’re there, or whether maybe somebody returned one, or maybe call the producer with some speech synthesizer? For sure that’s not an option. This dependency just can’t be taken.

You could come up with many more arguments, which can be summarized simply as a new pattern in town: Dependency Rejection.

Next time your DBA/architect/dev friend/tester dreams about this shiny world of total consistency and always-available services, remind them of this and simply reject the dependency on this unrealistic idea.


Big Ball of Fun

TL;DR

How to deal better with legacy code.

Mental

There’s a very useful tool that everybody uses, but only a few recognize in their behavior. This tool is called a mental model. It’s a way of thinking that can help or harm your ability to deal with a specific situation. For instance, when somebody overtakes your car, you might think that they are an idiot, or leave some space for interpretation, saying ‘it must be something urgent’. Depending on your choice, your significant other and your child may learn a combination of swearwords that never existed before. Just to be clear: I’m not saying that inventing these combinations is useful, but rather the opposite. The most important thing is to be aware of your reactions and consciously build models that fit you and help you.

Fun fun fun fun

What’s better: mud or fun? If you were Peppa Pig, you’d put an equals sign between them. But you aren’t. Mud is dirty and stinky, and fun is just… fun. Which would you rather approach: fun or mud? I bet the answer is fun. Next time you need to work on a piece of legacy code, whether it’s COBOLac or Spaghetti Visualo Basico, try to call it a Big Ball of Fun and share the term with your team. This might remove one of the obstacles and turn the work into something more approachable. You’ll still have other obstacles, but one will be gone.

Let’s have some fun!

Concurrency – ramp up your skills

Yesterday I gave my Extreme Concurrency talk at the rg-dev user group. After the talk I had some really interesting discussions and was asked to provide some resources on the low-level concurrency I was talking about. So here’s a list of books, talks and blog posts that can help you ramp up your skills.

Videos:

  1. [C++] Herb Sutter, “atomic Weapons” – it’s about C++, but it covers memory models in a way that’s easy to follow and learn from
    1. Part 1
    2. Part 2

MSDN:

  1. .NET Volatile class – it has a good description of what half-barriers are and properly shows the two counterpart Read & Write methods
  2. .NET Interlocked class – the other class with a good description, providing methods that are executed atomically. Basically, these methods are JITted into single assembly instructions.
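A minimal sketch of the two classes mentioned above (the Flag type is made up; the comments describe the half-barrier semantics the docs explain):

```csharp
using System.Threading;

class Flag
{
    private int _done;    // written by one thread, read by another
    private int _counter; // shared counter

    public void Set()
    {
        // Volatile.Write is a half-barrier with release semantics:
        // earlier writes can't be reordered past it.
        Volatile.Write(ref _done, 1);
    }

    public bool IsSet()
    {
        // Volatile.Read is the counterpart with acquire semantics:
        // later reads can't be reordered before it.
        return Volatile.Read(ref _done) == 1;
    }

    public int Increment()
    {
        // Interlocked methods execute atomically.
        return Interlocked.Increment(ref _counter);
    }
}
```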

Code:

  1. RampUp – a project of mine🙂
  2. [JAVA] Aeron – the messaging library

Books:

  1. Concurrent Programming on Windows by Joe Duffy – this is a hard book to go through. It’s demanding and requires a lot of effort, but it’s the best book if you want to really understand this topic

Blogs:

  1. Volatile reads and writes by Joe Duffy
  2. Sayonara volatile by Joe Duffy
  3. Atomicity, volatility and immutability are different by Eric Lippert – that’s the last part of this series
  4. [JAVA] Psychosomatic, lobotomy, saw – the name is strange, but you won’t find disturbing videos there. What you will find, though, is a deep dive into memory models.

Goal: MVP


This will be a short entry. After looking at my activities in 2016, I decided to nominate myself for the MVP award in the Visual Studio and Development Technologies category. I know it’s only October and quite early to summarize the whole year, but considering the number of talks, contributions to Open Source projects, organized meetings of the Warsaw .NET User Group and posts on this blog, I can tell that it’s the right time to do it. You can use the following link to nominate a person: https://mvp.microsoft.com/en-us/Nomination/NominateAnMvp – feel free to use it to nominate me once again. My email is scooletz@gmail.com.

Sometimes we program in pairs, so why not nominate in pairs? That’s why I also nominated Konrad Kokosa. You may know him well from his talks and his blog about performance and memory. You can read more about his candidature here.