Orchestrating processes for fun and profit

TL;DR

I’ve already shown here that, with some async-await trickery, you can write orchestrations in plain C#. It’s time to revisit this idea, now with a custom orchestration of my own.

Show me the code!

The orchestration is based on event sourcing infrastructure I have built. The project is not public (yet), but it’s similar to any other ES library. The main idea behind building an orchestration on top of it is that the state changes of an orchestration are easily captured as events: GuidGenerated, CallRecorded, UtcDateTimeObtained and so on. If that is the case, we can model any orchestration as an event-sourced aggregate.

Here is an example of reserving a trip, which needs to coordinate reserving a hotel and a flight. A few extra calls were added just to show what the orchestration is capable of.

[orchestration.png: the trip reservation orchestration code; a hedged reconstruction follows the notes below]

  • Line 19: the orchestration can delay its execution. That does not mean it stays in memory for that long; it records the need for a delay and is woken up when the time comes
  • Line 22: when a Guid is generated with Orchestration.NewGuid, it will have the same value when the orchestration is executed again. The process dies and the orchestration is resumed elsewhere? Don’t worry, the Guid value has been recorded
  • Line 25: the same goes for dates
  • Line 28: an external call to another system. Here, we try to ReserveHotel. Three attempts are made with an exponential back-off. Between calls, if the configured timespan is long enough, the saga can be swapped out of memory (as with Delay)
  • Line 33: the same as with the hotel
  • Line 38: the compensation part
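
The original screenshot isn’t reproduced here, so below is a minimal sketch of what such an orchestration could look like. Apart from Orchestration.NewGuid, every name in it (Delay, UtcNow, Call, the retry options, the hotel and flight services) is my assumption, made up to match the notes above rather than taken from the actual, non-public API.

    // Hedged reconstruction -- illustrative names only, the real API is not public.
    public async Task ReserveTrip(Orchestration orchestration, TripRequest request)
    {
        // Line 19: recorded as an event; the orchestration leaves memory meanwhile.
        await orchestration.Delay(TimeSpan.FromDays(1));

        // Line 22: recorded as GuidGenerated, so a replay yields the same value.
        var reservationId = orchestration.NewGuid();

        // Line 25: recorded as UtcDateTimeObtained.
        var requestedAt = orchestration.UtcNow();

        try
        {
            // Line 28: an external call, 3 attempts with an exponential back-off,
            // recorded once it succeeds so a replay never repeats it.
            await orchestration.Call(() => hotels.ReserveHotel(reservationId, request.Hotel),
                attempts: 3, backOff: BackOff.Exponential);

            // Line 33: the same for the flight.
            await orchestration.Call(() => flights.ReserveFlight(reservationId, request.Flight),
                attempts: 3, backOff: BackOff.Exponential);
        }
        catch (OrchestrationCallFailedException)
        {
            // Line 38: the compensation part, undoing what was already reserved.
            await orchestration.Call(() => hotels.CancelReservation(reservationId));
        }
    }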

Execute it!

If the process fails or the machine dies, the execution takes the persisted events and replays them on another host. Because all recorded events are persisted before every external call, even during a rerun all the values will be exactly the same, and calls that were already made won’t be made again.
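
The mechanics behind this determinism are the usual record-or-replay technique. The following sketch is my illustration of the general idea, not the author’s code; the helper names are made up:

    // Illustrative record-or-replay: on a rerun, recorded events are served
    // back instead of producing new values or re-issuing calls.
    public Guid NewGuid()
    {
        // Replaying? Return the recorded value, so the rerun sees the same Guid.
        if (TryReadNextRecordedEvent(out GuidGenerated recorded))
            return recorded.Value;

        // First run: generate, record, and persist before any external call.
        var fresh = Guid.NewGuid();
        Append(new GuidGenerated(fresh));
        return fresh;
    }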

Summary

Orchestrating with async-await trickery is much easier and can be written in the dense, simple form of a regular method. Having an event-sourced foundation makes it easier still, as you can reuse already existing pieces and persist all orchestration events as events of some kind of aggregate.

Top Domain Model: Don’t be so implicit

TL;DR

Make implicit things (more) explicit. It’s a mantra we often hear but, when faced with the reality of software development, often forget because of the magic beauty of technology, tooling, etc. Should we repeat it when modelling domains as well?

Auditing

It doesn’t matter whether you apply DDD or go CRUDdy. It’s quite common to hear that a system should provide some kind of audit. So we provide a table like History or Audit, where all the “needed” data are stored. What would the columns of this table be? Probably things like:

  • user
  • operation
  • field changed
  • value entered

or something similar. Again, depending on the approach, these data could flow through the system as some kind of tuple before hitting the storage.

When dealing with a business process that uses an accountant id just to register who’s responsible for tax calculations, would you “reuse” the user from the tuple above, or would you rather create a variable/field named accountantId, even if the values of accountantId and user were the same? What would your choice be?
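
In code the difference is small but telling. The types below are made up for illustration:

    // Implicit: one generic audit entry reused for every purpose.
    public class AuditEntry
    {
        public string User;
        public string Operation;
        public string Field;
        public string Value;
    }

    // Explicit: the role is named, even if its value equals AuditEntry.User.
    public class TaxCalculation
    {
        // Not just "some user" -- the accountant responsible for this calculation.
        public string AccountantId { get; }

        public TaxCalculation(string accountantId) => AccountantId = accountantId;
    }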

Event sourcing

PropertyChanged is one of the worst events ever. Yes, it’s beautifully generic and describes the basic idea of event sourcing: grasping changes in the model. It can be gathered implicitly, just by comparing an object before and after applying an action. At the same time it’s useless, as it provides no domain insight and could be blindly used in any case. This approach is often referred to as property sourcing, and it is simply wrong.

The power of event sourcing, or of modelling events at all, lies in grasping the real business deltas of the domain. No playing mumbo-jumbo with the tooling and implementing yet another minimal-change finder. Being explicit is the key to winning this, to designing a better model.
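
To see the difference, compare a property-sourced change with an explicit domain event. Both can carry the same data; only one carries the meaning. The event shapes below are mine, for illustration:

    // Property sourcing: technically an event, semantically just a diff.
    public class PropertyChanged
    {
        public string Name;   // e.g. "Status"
        public object Value;  // e.g. "Cancelled"
    }

    // An explicit domain event: a real business delta with a name from the domain.
    public class ReservationCancelled
    {
        public Guid ReservationId;
        public string Reason;
    }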

Summary

There are many more examples of explicit vs implicit and of abusing modelling by playing not with the domain but with the tooling. When modelling and analyzing, ask yourself these questions. Should I really reuse it? Should I really make it generic? Quite often the answer will be no, and trying anyway might hurt your model and, effectively, you.

You’re using the wrong numbers

TL;DR

When testing, we often use constants to show that a number used in two places has the same meaning, not only the same value. How can we apply this when writing documentation or giving examples of algorithms we want to use?
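
The testing habit referred to above looks roughly like this (RetryPolicy and its Attempts property are made up for the example):

    // The constant ties the two occurrences together:
    // the same meaning, not just accidentally the same value.
    const int MaxAttempts = 3;

    var policy = new RetryPolicy(MaxAttempts);
    Assert.Equal(MaxAttempts, policy.Attempts);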

Meaningless enough

Consider the first example

2 – 1 = 1
1 – 0 = 1
1 + 1 = 2

Can you guess which “ones” were added in the last line? They are the results of the first two equations. Or maybe not? Let’s take a look at a slightly altered example

12342 – 12341 = 1
33333 – 33332 = 1
1 + 1 = 2

Can you now see which ones were used? It’s obvious.

Numbers as identifiers

When working with numbers, you may try to use them as identifiers. When a value is needed only for a local calculation, make it meaningless enough; a reader won’t spend much time on it. They will read and remember the one-digit numbers and follow them through the lines that come after.

Summary

Writing documentation isn’t an easy task. Writing good documentation is even harder. When dealing with algorithms, making numbers unique can improve the overall understanding of the samples you provide.

Top Domain Model: behaviors, processes and reactions

TL;DR

After pivoting for two nights straight (here and here), it’s time to settle down and ask: is it possible to stick to aggregates alone to model your domain? Can services make it any better, or is another tool needed?

Actions and state

Aggregates define the boundaries of transactions. That’s their primary purpose. They narrow down not only our thinking (in a positive sense) but also provide the ability to scale out our model. If every aggregate sits in its own bucket, nothing is easier than saving it as a document, or as a stream of events within one partition. Their boundaries will never be crossed. That last statement is actually wishful thinking. Who would like to transfer money without it ever being received on the other side?
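
In sketch form, one aggregate per bucket can look as follows. The store API is generic, illustrative ES vocabulary, not a concrete library:

    // One aggregate instance maps to one stream, stored within one partition.
    // Appending to that single stream is the only transaction we ever need.
    store.AppendToStream(
        streamId: "Account-" + accountId,  // the aggregate's own bucket
        expectedVersion: version,          // optimistic concurrency inside the boundary
        events: newEvents);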

Actions and reactions

Think in processes. One account is debited, the other is credited. The second does not have to happen in the same transaction. Moreover, the second is, by its nature, a reaction to the first account being debited. In the natural world reactions often happen at the same time. Modelling them that way (as transactions spanning all the aggregates) might be more realistic, but it is certainly not more useful. The more elements in a transaction, the higher the probability of failure… and so on and so forth.
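
As a sketch, such a reaction can be modelled as a process bridging the two transactions; all the names below are illustrative:

    // A money transfer as a process: the credit is a separate, later
    // transaction, triggered as a reaction to the debit event.
    public class MoneyTransferProcess
    {
        readonly IBus bus;

        public MoneyTransferProcess(IBus bus) => this.bus = bus;

        public void When(AccountDebited e)
        {
            // Reacting, not sharing the transaction: the command will be
            // handled by the credited account's aggregate on its own.
            bus.Send(new CreditAccount(e.TransferId, e.TargetAccount, e.Amount));
        }
    }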

Aggregates + processes

The above describes a perfect couple. Aggregates and processes are the two tools you can use to model any domain. Actually, if you dared to say that you don’t need processes, I’d say you could be wasting your time modelling a really boring and simple domain with DDD tooling. From the tooling perspective, whether you want to event-source everything all the way down or use a good old-fashioned ORM with a message bus is up to you. Just remember that the true power of your model comes from combining aggregates and processes.

Summary

Modelling aggregates that just work (TM) is impossible. Any even moderately complex domain contains a lot of reactions and processes that need to be captured in your model to make it usable. Next time you think about an aggregate reacting to something, ask yourself: what kind of process is this, how should I name it, how do I model it?

Cloudy cost awareness

TL;DR

Our industry used to be forgiving, very forgiving. You could forget an index, let a query run for a minute, and merely disappoint some users of your app. If you were the only one on the market or you delivered banking systems, that was just fine, as you’d lose no clients because of it. The public cloud changes this, and if you can’t embrace it, you will pay. You will pay a lot.

Pay as you crawl

If you issue a query scanning a table in Azure Table Storage, every entity you access is counted as a storage transaction. Run millions of them and your bill will grow. Maybe not by much, but it will.
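
With the classic WindowsAzure.Storage SDK, the difference is between a scan and a point lookup. Here table is a CloudTable reference; OrderEntity (a TableEntity subclass) and the key values are my assumptions for the example:

    // A full-table scan: touches every entity, and the meter runs with it.
    var scan = new TableQuery<OrderEntity>();
    foreach (var entity in table.ExecuteQuery(scan))
    {
        // process each entity...
    }

    // A point lookup by PartitionKey + RowKey: a single, cheap operation.
    var retrieve = TableOperation.Retrieve<OrderEntity>("orders-2017-06", "order-12342");
    var result = table.Execute(retrieve);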

If you deploy a set of services as Azure Cloud Services, each consuming just 100MB of memory, your VMs will be undersaturated. You’ll pay for memory you don’t use and for CPU that just sits in the rack hosting your VM.

Design is money

Before the public cloud, all these inefficiencies could be more or less tolerated, and they were not that easy to spot. Nowadays, with a public cloud, it’s the provider, the host, that will notice them and charge you for them. If you don’t design your systems with awareness of the environment, you will pay more.

Mitigations

This is not a black-or-white situation. It never is. You’ll probably be able to dockerize some parts of your app and host them inside a Service Fabric cluster. You’ll probably be able to use CosmosDB and its auto-indexing feature to fix the performance of lookups in your Azure Storage Tables. There are many ways to mitigate these effects, but still, I consider an appropriate design the most valuable tool for making your systems not only well performing and effective but, eventually, cheap.

Summary

Don’t throw your app against the wall of clouds and check if it sticks. Design it properly. Otherwise, it may stick in a very painful and cost-ineffective way.

Top Domain Model: I’ve been pivoting all night long. Again

TL;DR

In the last entry we noticed that we can model away some problems by selecting a different aggregate for an event to belong to. In this episode, we will revisit this idea by referring to good old Twitter and its timelines. How are timelines created? Is everyone treated equally? Is it only aggregates’ identities that should be pivoted, or maybe our thinking as well?

Fans in, fans out

The model behind Twitter timelines is quite simple. It’s also different for people with 10 followers and people with 10 million. The difference is the number of timelines that need to be updated when Madonna writes a tweet, or when @JustAnotherAccountLikingMadonna2019 retweets one of their favorite performer’s thoughts. Notifying 10 million people is not that easy and takes a while. Writing to 10 timelines is easy and fast.
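
A sketch of that difference, with all names being illustrative rather than Twitter’s actual design:

    // Fan-out on write: fine for 10 followers, painful for 10 million.
    public void OnTweetPublished(Tweet tweet)
    {
        var followers = followersOf(tweet.AuthorId);

        if (followers.Count < FanOutThreshold)
        {
            // Small accounts: push the tweet into every follower's timeline now.
            foreach (var follower in followers)
                timelines.Append(follower, tweet.Id);
        }
        else
        {
            // Madonna-like accounts: don't push; followers' timelines merge
            // these tweets in lazily, when they are read.
            celebrityFeeds.Append(tweet.AuthorId, tweet.Id);
        }
    }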

Golden customer

Similar reasoning might be applied to a golden customer profile, or to any dimension that distinguishes an aggregate type in your model. Sometimes a behavior will be slightly altered; sometimes it will be totally rewired. The most important idea is not to stick to the initial model. Let yourself ask important questions to distill and clarify it: to discover these golden customers that require different handling, to discover the Madonnas that require cargo shipping to deliver tons of their tweets, and so on and so forth.

A new type or not

Sometimes this reasoning will bring you new types of aggregates; sometimes it will be a process keyed on a dimension that distinguishes a Madonna-like aggregate instance. As always, every model will be wrong, but some of them will be more useful than others.

Summary

Don’t stick to the initial aggregate type. Ask more modelling questions that can bring you new types of aggregates, or processes depending on a specific dimension value of an already existing aggregate. Pivot not only identities, but also your thinking.