Shallow and deep foundations of your architecture

TL;DR

This entry addresses some ideas related to the various types of foundations one can use to create powerful architectures. I see this post as somewhat resonating with Gregor Hohpe's approach of architecture as selling options.

Deep foundations

The deep/shallow foundations allegory came to me after running my workshop about event sourcing in .NET. One of the main properties of an event store was whether it was able to provide a linearized view of all of its events. Having this property or not was vital for providing a simple marker for projections. After all, if every event has a position, one can track just this one number to tell whether an event has been processed or not.
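
To make it concrete, here is a minimal sketch of a projection tracking that single marker. It assumes a store that exposes a global position for every event; all the names are illustrative.

    // a projection tracks only one number: the position of the last processed event
    public sealed class Projection
    {
        long checkpoint = -1;

        public void Apply(long position, object @event)
        {
            if (position <= checkpoint)
                return; // already processed, safe to skip

            // ... update the view according to the event ...

            checkpoint = position; // persisted together with the view
        }
    }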

This property laid a strong foundation for simplicity and process management. Having it or not was decisive for choosing one design over another. It was a deep foundation that did a lot of the heavy lifting in the design of processes and views. Opting out later from a store that provides this property wouldn't be easy.

You could rephrase it as having strong requirements for a component. On the other hand, I like to think about it as a component providing deep foundations for the emerging design of a system.

Shallow foundations

The other part of the solution was based on something I called a Dummy Database: a store that has only two operations, PUT & GET, without transactions, optimistic versioning, etc. With a good design of the process, which can record its progress just by storing a single marker, one can easily serialize that marker with a partition of a view and store them together in a database. What kind of database would it be? Probably any. Any SQL database, Cassandra or Azure Storage Tables is sufficient to make it happen.
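
A sketch of such a contract could look as follows; the names are mine, not the workshop's. The point is that the progress marker is serialized into the same value as the view partition, so one PUT stores both.

    using System.Collections.Generic;

    // the whole contract of the Dummy Database: no transactions, no optimistic versioning
    public interface IDummyDatabase
    {
        void Put(string key, byte[] value);
        byte[] Get(string key); // null when the key is absent
    }

    // a partition of a view carries its own progress marker,
    // so a single PUT stores the data and the marker together
    public sealed class ViewPartition
    {
        public long Checkpoint;                         // the single marker of progress
        public Dictionary<string, string> Rows = new(); // the projected data
    }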

Moving up and down

With these two types of foundations you have some potential for naturally moving your structure. The deep foundations impose constraints that can't be changed easily, but the rest, built on the shallow ones, can be changed with little effort. Potentially, one could swap a whole component or adjust it to an already existing infrastructure. It's no longer a long list of requirements to satisfy for a brand new shiny architecture. The list is much shorter and it ends with the question: "the rest? we'll take whatever you have already installed".

Summary

When designing, separate the parts that need to be rock solid and can't be changed easily from the ones that can. Keep it in mind when moving forward, and do not pour too much concrete under parts where a regular stone would do.

Service kata with Business Rules

TL;DR

In the previous post we started working on a code kata and discovered that instead of creating a new monolithic giant, we could tackle the complexity of a process by modelling it in its natural boundaries: contexts. In this post we continue this journey.

Requesting payment

Let’s spend some time on modelling a process of ordering a membership. It’s been said that it requires a payment and as soon as the payment is done, the membership is activated. We introduced the PaymentReceived event as an asynchronous response to the payment request.

Consider a membership order with the following identifier

11112222-3333-4444-5555-666677778888

When accepting the request for a membership, Membership sends a request to Payments with the following information:

payment_request
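
The original snippet is not preserved here, but an illustrative payload (the endpoint and the field names are assumptions) could look like this:

    POST /payments
    {
        "paymentId": "11112222-3333-4444-5555-666677778888",
        "amount": 100.00
    }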

It is important to see that the caller generates the identifier, which has the following properties:

  • in this case it reuses its own identifier in a different context, following the snowy identifiers approach to create snowflake entities
  • as the caller generates the id and stores it, in case of a failure when requesting a payment the request can be POSTed again, as it's idempotent (any HTTP status indicating that the id already exists means that a previous call was accepted and processed).

Using this approach in a service-oriented architecture enables idempotence (everyone knows the id upfront).
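
A minimal sketch of such an idempotent call, assuming an HTTP API for Payments; the endpoint, the payload shape and the amount are all illustrative:

    using System;
    using System.Net;
    using System.Net.Http;
    using System.Net.Http.Json;
    using System.Threading.Tasks;

    public static class PaymentRequests
    {
        public static async Task RequestPaymentAsync(HttpClient client, Guid orderId)
        {
            // the membership order id is reused as the payment id (a snowy identifier)
            var response = await client.PostAsJsonAsync("payments",
                new { paymentId = orderId, amount = 100m });

            // success or "already exists" both mean the payment request is in place,
            // so after any failure the call can be blindly retried with the same id
            if (!response.IsSuccessStatusCode && response.StatusCode != HttpStatusCode.Conflict)
                throw new InvalidOperationException("Retry with the same paymentId.");
        }
    }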

Events strike back

The result of the payment, after the money is received, is a PaymentReceived event which is published to all the interested parties. One of them will be Membership, which simply takes the paymentId and checks whether there's a membership order with the same identifier. If there is, it can be marked as paid. Simple and easy. The same applies to other rules in other contexts.
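
A sketch of the Membership side of this correlation; the event shape and the store abstraction are assumptions:

    using System;

    public sealed record PaymentReceived(Guid PaymentId);

    public sealed class MembershipOrder
    {
        public bool Paid { get; private set; }
        public void MarkAsPaid() => Paid = true;
    }

    public interface IMembershipOrders
    {
        MembershipOrder Get(Guid id); // null when no order uses this identifier
    }

    public sealed class PaymentReceivedHandler
    {
        readonly IMembershipOrders orders;

        public PaymentReceivedHandler(IMembershipOrders orders) => this.orders = orders;

        public void Handle(PaymentReceived @event)
        {
            // the payment reused the order's identifier, so the lookup is a simple GET
            var order = orders.Get(@event.PaymentId);
            order?.MarkAsPaid();
        }
    }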

There's really no point in making this ONE BIG APP TO RULE THEM ALL. You can separate services according to business units and design towards integrating them.

Again, depending on the tools used, you can have events delivered by a bus to all the subscribers, or publish events to an ATOM feed and have other services consume them by polling.

Summary

These two posts show that raising modelling questions is important and that it can help to reuse existing structures and applications when creating new, robust systems. They do not cover transactions, retries and more. You can use tools that solve these for you, like a messaging bus, or you'll need to handle them on your own. Whatever path you choose, the modelling techniques will be generally the same, and you can use them to bring real value into the existing ecosystem instead of creating a single new shiny application that will rule them all.


Code kata with Business Rules

TL;DR

How many times were you given an implicit requirement that you'd create one application or two services? How many times were the architecture and design predetermined before any modelling with business stakeholders? Let's take a dive into a code kata that will reveal much more than code.

Kata

The kata we'll be working on is presented here. It covers writing a tool for a set of business rules gathered across the whole company. The business rules depend on the payment (the fact that it was made) and some other conditions, for example:

  • If the payment is for a physical product, generate a packing slip for shipping.
  • If the payment is for a book, create a duplicate packing slip for the royalty department.
  • If the payment is for a membership, activate that membership.

The starting point for every rule is a payment that has been accepted. Another observation is that these rules are scattered across the whole company (as the author mentions, Carol on the second floor handles that kind of order). Do you think that having a new single application that gathers all the rules is the way to go?

Contexts

If a mythical Carol is responsible for some part of the rules, maybe another department/team is responsible for membership? What about the payments? Is the bookstore part of your organization really interested in whether a payment was made with a credit card or a transfer? Is a membership rule really valid outside of the membership context? Should anyone without membership-specific knowledge be able to say when a membership is activated?

I hope you see where these questions lead. There are multiple contexts that are somehow dependent on one another but are not truly one:

  • payments – the part responsible for accepting money and transferring it to the company's account
  • membership – taking care of (possibly) accounts, monitoring activity, activating/deactivating accounts
  • bookstore/videostore or simply store – the sales part
  • shipping – for physical products

Are these areas connected? Of course they are!

PaymentReceived

The first visible connector is the payment. To be precise, the fact of receiving it, which can be described in the passive voice: PaymentReceived. You can imagine that when requesting a membership, a payment is required. This can be perceived as one whole process, but it could be split into the following phases:

  • gathering membership data
  • requesting a payment
  • receiving a payment
  • completing the membership order

This is the Membership point of view. As you can see, it requests a payment but does not handle it. We will see in the next post how this can be solved.
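
One could sketch this point of view as a simple progression of states; the phase names follow the list above, everything else is an assumption:

    // phases of the membership order, as seen by the Membership context only
    public enum MembershipOrderPhase
    {
        DataGathered,      // membership data collected
        PaymentRequested,  // a request was sent to Payments
        PaymentReceived,   // the PaymentReceived event arrived
        Completed          // the membership order is done
    }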


Snowy identifiers

TL;DR

When using the snowflake entities pattern, it's quite easy to forget about the external identifiers that we need to communicate with external systems. This post provides an easy way to address this concern.

Identity revisited

The identifier of a snowflake entity was presented as a GUID. We use an artificial, non-colliding, client-generated identifier to ensure that any part of the system can generate one without validating that a specific value hasn't been used before. This enables storing different pieces of data, belonging to different contexts, in different services of our system. No system lives in a vacuum though, and sometimes it requires communication with the rest of the world.

Gate away!

A common aspect that is handled by an external system is payments. When you consider credit cards, native bank applications, PayPal, Bitcoin and all the rest, providing that kind of service on your own is not a reasonable option. That's why external services are used – the price of using one is much lower than the cost of delivering one. Let's stick to the payments example. How would you approach this? Would you call the external payment service from each of your services? I hope you wouldn't. A better approach is to create a gateway that acts as a translator between your system and the external one.

How many ids do I need?

Using a gateway provides a really interesting property. As the payment gateway is a part of your system, it can use the snowflake identifier. In other words, if there's an order, it's OK (under given circumstances) to use its identifier as the identifier of the payment as well – provided, of course, that you want to model these two as parts of a snowflake entity spanning services. It'd be the payment gateway's responsibility to correlate the system's snowflake identifier with the external system's id (an integer, some string, whatever). This creates a coherent view of an entity within your system boundaries, closing the mapping in a small, dedicated area of the payment gateway.

An integration with an external system closed in a small component, leaving your system agnostic to it? Do we need more?
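
A sketch of that correlation responsibility; the provider interface and the in-memory map are assumptions (in reality the mapping would be persisted):

    using System;
    using System.Collections.Generic;

    public interface IExternalPaymentProvider
    {
        string CreatePayment(decimal amount); // the external system assigns its own id
    }

    public sealed class PaymentGateway
    {
        readonly IExternalPaymentProvider provider;
        readonly Dictionary<Guid, string> externalIds = new();

        public PaymentGateway(IExternalPaymentProvider provider) => this.provider = provider;

        public void RequestPayment(Guid snowflakeId, decimal amount)
        {
            // correlate the snowflake id with the external id (an integer, some string, whatever);
            // the mapping never leaks outside the gateway
            externalIds[snowflakeId] = provider.CreatePayment(amount);
        }
    }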

Summary

As you can see, closing the external dependency in a gateway provides value not only by separating the interface of the external provider from your system components, but also by preserving a coherent (though distributed) view of your entities.

Dependency rejection

TL;DR

"These values must be updated synchronously", "we need referential integrity", "we need to make all these service calls together" – these are sentences that unfortunately can be heard in discussions about systems. This post provides a pattern for addressing these remarks.

The reality

As Jonas Bonér says in his presentation: knock, knock. Who's there? Reality. We really can't afford to have one database and one model to fit it all. It's impossible. We process too much, too fast, with too many components to make it all in one fast call. Not to mention transactions. This is reality, and if you want to deny reality, good luck with that.

The pattern

The pattern for addressing these remarks is simple. You need to say no to this absurdity. NoNoNo. This no can be supported with many arguments:

  • independent components should be independent, right? Like in-de-pen-dent. They can’t stop working when others stop. This dependency just can’t be taken.
  • does your business truly ensure that everything is prepared up front? What if somebody buys a computer from you? Would you rather say I'll get it delivered in one week, or first double-check with all the components whether it's in stock, or maybe somebody returned it, or maybe call the producer with some speech synthesizer? For sure it's not an option. This dependency just can't be taken.

You could come up with many more arguments, which could be summarized simply as a new pattern in town: The Dependency Rejection.

Next time your DBA/architect/dev friend/tester dreams about this shiny world of total consistency and always-available services, remind them of this and simply reject the dependency on this unrealistic idea.


Data has no format

I need to be able to store 1GB of JSON

I’d like to push XML 100 MB/s to this Azure blob

I need to log this data as CSV

Statements like these are sometimes true, but in the majority of cases the format is not a given – it is a part of designing (or, if needed, redesigning) your architecture or application. Selecting a proper format can lower the size of your data, increasing the throughput of your system if a medium like a disk or a network is saturated. That's why systems like Apache Arrow or Google's Dremel use their own formats. That's why you may consider using protobuf-net serialization for EventStore, disabling its built-in v8 projections and lowering the size of events at the same time. For low-latency systems you can choose the Simple Binary Encoding library. That's why sometimes storing data in another format is simply better. I've written a blog post, Do we really need all these data transformations, and this doesn't state the opposite. It's all about making rational, proper choices about the storage format, taking into consideration its different aspects and its influence on your system. With this one decision you might improve your system's performance greatly.
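
For instance, here's a minimal sketch of serializing an event with protobuf-net; the attributes and Serializer call are protobuf-net's actual API, while the event type itself is illustrative:

    using System;
    using System.IO;
    using ProtoBuf;

    [ProtoContract]
    public sealed class OrderProcessed
    {
        [ProtoMember(1)] public Guid OrderId { get; set; }
        [ProtoMember(2)] public decimal Amount { get; set; }
    }

    public static class EventSerialization
    {
        public static byte[] Serialize(OrderProcessed @event)
        {
            using var stream = new MemoryStream();
            // compact binary: field numbers instead of names on the wire
            Serializer.Serialize(stream, @event);
            return stream.ToArray();
        }
    }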

Feature oriented design got wrong

The fourth link in my Google search for 'feature toggle' is a link to this Building Real Software post. It's not about feature toggles as described by Martin Fowler. It's about feature toggles got wrong.

If you consider toggling features with flags and apply it literally, what you get is a lot of branching. That's all. Some tests have to be written twice to handle the positive and the negative scenario of each branch. The reason for this is a design not prepared to handle toggling properly. In the majority of cases, it's a design which is not feature-based in the first place.

Feature-based design is built from closed components, each handling a given domain aspect. Some of them may be big, like 'basket'; some may be much smaller, like 'notifications' reacting to various changes and displaying needed information. The important thing is to design the features as closed components. Once you have done it this way, it's easier to think about the page without notifications or ads. Again, disabling a feature is not a mere flag thrown into different pieces of code. It's disabling or replacing the whole feature.

One of my favorite architecture styles, event-driven architecture, helps greatly in building this kind of toggle. It's quite easy to simply… not handle an event at all. If you consider the notifications, when they are disabled they simply do not react to events like 'order-processed', etc. Not creating cycles of dependencies is a separate story, but still, if you consider the reactive nature of connections between features, it's a great enabler for introducing toggling with all the advantages one can derive from it, with A/B tests and canary releases in mind.
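
A sketch of such a toggle, with every name assumed: a disabled feature is simply never subscribed, so no flags leak into the rest of the code.

    using System;

    public sealed class OrderProcessed { }

    public interface IEventBus
    {
        void Subscribe<T>(Action<T> handler);
    }

    public sealed class Notifications
    {
        public void Handle(OrderProcessed @event)
        {
            // display the notification for the processed order
        }
    }

    public static class FeatureRegistration
    {
        public static void Register(IEventBus bus, bool notificationsEnabled)
        {
            if (notificationsEnabled)
                bus.Subscribe<OrderProcessed>(new Notifications().Handle);
            // when disabled, the feature just doesn't react – no branching elsewhere
        }
    }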

I'm not a fanboy of feature toggling, but I consider it an important tool in an architect's arsenal.