Feature-oriented design gone wrong

The fourth link in my Google search for ‘feature toggle’ is a link to this Building Real Software post. It’s not about feature toggles as described by Martin Fowler. It’s about feature toggles gone wrong.

If you take toggling features with flags literally, what you get is a lot of branching. That’s all. Some tests have to be written twice to cover both the positive and the negative scenario of each branch. The root cause is a design not prepared to handle toggling properly. In the majority of cases, it’s a design that is not feature-based in the first place.

Feature-based design is built from closed components, each handling a given domain aspect. Some of them may be big, like ‘basket’; some may be much smaller, like ‘notifications’ reacting to various changes and displaying the needed information. The important thing is to design the features as closed components. Once you have done it this way, it’s easier to reason about the page without notifications or ads. Again, disabling a feature is not a mere flag thrown into different pieces of code. It’s disabling or replacing the whole feature.

One of my favorite architectural styles, event-driven architecture, is a great help in building this kind of toggle. It’s quite easy to simply… not handle an event at all. Consider the notifications: if they are disabled, they simply do not react to events like ‘order-processed’, etc. Avoiding cycles of dependencies is a separate story, but given the reactive nature of the connections between features, this style is a great enabler for introducing toggling, with all the advantages one can derive from it, A/B tests and canary releases in mind.
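
To make it concrete, here is a minimal sketch; the bus and handler interfaces below are hypothetical, not taken from any particular library:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal infrastructure, for illustration only.
public interface IHandle<TEvent>
{
    void Handle(TEvent @event);
}

public class EventBus
{
    private readonly List<object> handlers = new List<object>();

    public void Subscribe<TEvent>(IHandle<TEvent> handler)
    {
        handlers.Add(handler);
    }

    public void Publish<TEvent>(TEvent @event)
    {
        foreach (var handler in handlers.OfType<IHandle<TEvent>>())
            handler.Handle(@event);
    }
}

public class OrderProcessed
{
    public string OrderId;
}

// The notifications feature: one closed component reacting to events.
public class Notifications : IHandle<OrderProcessed>
{
    public void Handle(OrderProcessed @event)
    {
        Console.WriteLine("Notify: order {0} processed", @event.OrderId);
    }
}

public static class Bootstrap
{
    public static EventBus Configure(bool notificationsEnabled)
    {
        var bus = new EventBus();
        // Toggling means (not) subscribing the whole component,
        // not scattering 'if (enabled)' branches across the codebase.
        if (notificationsEnabled)
            bus.Subscribe(new Notifications());
        return bus;
    }
}

With the feature off, publishing OrderProcessed simply reaches no handler; no branching is needed anywhere else.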

I’m no fanboy of feature toggling, but I consider it an important tool in an architect’s arsenal.


The cost of scan queries in Azure Table Storage

There are multiple articles describing the performance of Azure Table Storage. You’ve probably read Troy Hunt’s entry, Working with 154 million records on Azure Table Storage…. You may have invested your time in How to get most out of Windows Azure Tables as well. My question is: have you really considered the limitations of queries, specifically scan queries, and how they can consume a major part of the Azure performance targets?

The PartitionKey and RowKey form the primary, and only, index in ATS (Azure Table Storage). Depending on the predicates used, the following kinds of queries can be distinguished:

  1. Point Queries, which retrieve a single entity by specifying a single PartitionKey and RowKey using equality as the predicate
  2. Row Range Queries, which get a set of entities sharing the same PartitionKey within a range of RowKeys
  3. Partition Range Queries, which are run over a range of PartitionKeys
  4. Full table scans, which have no predicate on PartitionKey at all

What are the costs and limitations of these queries? Unfortunately, every row the query accesses while performing a scan counts as a table operation; there ain’t no such thing as a free lunch. This means that if you scan your entire table (the fourth scenario), you’ll be able to process no more than 20,000 entities per second, which limits scans over large data sets. If you have to model queries across different keys, you may consider storing the same value twice: once under the natural PartitionKey/RowKey pair, and a second time under keys matching the other query, creating an inverted index. If, in any case, you have to scan through the entire data set, then ATS is not the way to go, and you should consider other ways of modelling your data, like asynchronously copying it to blob storage, etc.
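
To ground the taxonomy above, here is a sketch using the classic Microsoft.WindowsAzure.Storage client of that era (the entity and key names are made up):

using Microsoft.WindowsAzure.Storage.Table;

public class OrderEntity : TableEntity
{
    public double Total { get; set; }
}

public static class AtsQueries
{
    public static void Run(CloudTable table)
    {
        // 1. Point query: a single entity, a single billed operation.
        var point = TableOperation.Retrieve<OrderEntity>("customer-1", "order-42");
        var single = table.Execute(point);

        // 2. Row range query: one partition, a range of RowKeys.
        var rowRange = new TableQuery<OrderEntity>().Where(
            TableQuery.CombineFilters(
                TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "customer-1"),
                TableOperators.And,
                TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, "order-50")));

        // 3. A partition range query would filter PartitionKey with a range instead.
        // 4. Full table scan: no predicate at all; every entity read along the way
        //    counts against the 20,000 entities-per-second table target.
        var fullScan = new TableQuery<OrderEntity>();
        foreach (var entity in table.ExecuteQuery(fullScan))
        {
            // process...
        }
    }
}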

Lokad.CQRS Retrospective

In a recent post, Rinat Abdullin provides a retrospective of the Lokad.CQRS framework, which was/is a starting point for many CQRS journeys. It’s worth mentioning that Rinat is the author of this library. The whole article may sound a bit harsh, but it provides a great retrospection from both the author’s and a user’s point of view.

I agree with the majority of the points in this post. The library provided abstractions allowing one to change the storage engine, but the directions taken were very limiting. The messaging tooling, the ddd console, seemed like the thing at the beginning, but after spending a few days with it, I didn’t use it anyway. The library encouraged one-way messaging all the way down, to separate every piece. Today, when CQRS mailing lists are filled with messages like ‘you don’t have to use queues all the time’ and CQRS people are much more aware of the option of handling requests synchronously, it would be easier to give such directions.

The author finishes with:

So, Lokad.CQRS was a big mistake of mine. I’m really sorry if you were affected by it in a bad way.

Hopefully, this recollection of my mistakes either provided you with some insights or simply entertained.

which I totally disagree with! Lokad.CQRS was the tool that shaped the thinking of many people at a time when nothing like it was available on the market. Personally, it helped me build an event-driven project (you can see the presentation about it here), based somewhat on Lokad.CQRS but with different abstractions and targeted at very good performance, not to mention living documentation built with Mono.Cecil.

Summary

Lokad.CQRS was a groundbreaking library that provided a bit too much tooling and abstracted away too many things. I’m really glad if it helped you learn about CQRS, as it helped me. Without it, I wouldn’t have asked all the questions and wouldn’t have learned so much.

The retrospective itself is invaluable and brings a lot of insight. I wish you all that kind of groundbreaking mistake someday.

GitFlow and Continuous Build

In a recent post I described how to ensure that your feature-to-develop GitFlow merge commits are reviewed before being introduced into the develop branch. This preserves the quality of develop, ensuring that it’s truly deployable. How should one build the repository and provide artifacts? Which commits and which branches should be built? These questions are answered below.

Let’s start with the following observation: whichever branch points at a given commit, if a proper modern build approach is used, such as a PSake build script, the result is the same. The repository contains all the scripts needed to run the build, so the output will be the same no matter which branch is selected as the source of the build (if two or more point at the same commit). After all, the same commit is the same tree, which results in the same build. This gives us a very powerful tool for ensuring even better quality of develop. One can easily set up TeamCity with a branch selector to run the same build for all features:

+:refs/heads/feature/*

The build script creates artifacts, in my case NuGet packages, using the following versioning scheme: [major].[minor].[build_number]. The first two numbers are stored in the repository. This requires that features are not long-running (you don’t want a branch started at 1.1.* to still be running when the repository has moved past 2.1.*). The build number is shared across all the features. For now, I’m not considering whether the artifacts should be published to some gallery or not.

The question is: should we build the develop commits? For now, considering only the feature branches, the answer is no! All the commits in develop are merge commits which have already been reviewed and built on the feature branches! That’s why creating the merge commit on the feature branch is so powerful. You get the code reviewed, built and tested before it reaches develop! Again, postponing the finishing of a feature until its value has been acknowledged benefits the quality of the develop branch.


GitFlow and code reviews

Many companies that use Git as their code repository use git-flow as the branching model for their projects. A question one may come up with is: where and when should the code review take place? Should it be before:

git flow feature finish MYFEATURE

If yes, then the reviewer looks through some version of the code, but the person responsible for closing the feature still creates a new merge commit afterwards, which can change a lot. On the other hand, if the reviewer creates the merge commit, he/she may not know all the aspects needed for a successful merge. There are a few pages trying to answer this, but I haven’t found anything satisfying. Read on for my proposal of a clean and proper code review process with git-flow.

The problem with git-flow is the fact that finishing a feature is an atomic action. In one action, the author does plenty of things:

  1. Author: pushes the commit to the remote
  2. Reviewer: reviews the code
  3. A: creates a merge commit (new commit created!)
  4. A: moves the develop cursor to the new commit
  5. A: deletes the feature branch
  6. A: pushes to the remote

As always, splitting one action into several and grouping them properly might help. Consider the following flow involving the author and the reviewer (a sketch of the corresponding git commands follows the list).

  1. Author: creates a merge commit
  2. A: moves the feature/a to the merge commit
  3. A: pushes the merge commit to the remote
  4. Reviewer: reviews the merge commit
  5. R: if the review is successful, fast-forward develop to the merge commit (some automation can be introduced).
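
In plain git terms, the flow above could look like this (the branch name is illustrative; the reviewer’s fast-forward works because develop was merged into the feature, making it an ancestor of the merge commit):

git checkout feature/a
git merge --no-ff develop         # author: create the merge commit on the feature branch
git push origin feature/a         # author: publish it for review
git checkout develop              # reviewer, after a successful review:
git merge --ff-only feature/a     # fast-forward develop to the reviewed merge commit
git push origin develop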

According to this flow, the reviewer always reviews the exact commit which will land in develop. Additionally, it makes the author of the feature branch aware of merge problems before he/she pushes the work out for review. Effectively, a feature can be completed, merged and simply waiting for acceptance, which then becomes a simple go/no-go decision with no merge conflicts to consider later on.

This isn’t git-flow anymore, but it is still a solid flow that lowers the author’s context switching. Once he/she finishes, it’s a real end, not a placeholder for an incoming merge.

Pact mock

The 2015 edition of CraftConf ended a few days ago. One of the talks I was eager to listen to was Mary Poppendieck’s (Poppendieck.LLC) – The New New Software Development Game: Containers, … If you haven’t seen it yet, please do. It’s an hour well spent.

One of the topics Mary mentions is the Pact mock. Pact is an intelligent recorder and player which lets you turn your integration tests into tests that look like integration tests but run against prerecorded request-response pairs. The funny thing is that this touches the very same topic I’ve been presenting and discussing in my current project.

I’m dealing with a legacy project right now. It includes VB.NET as well, so yes, it’s real legacy ;) There are some web services on board. Yes, I mean good old-fashioned SOAP services, not brand-new, fancy REST or HATEOAS. How would you test this interaction? Would you add a layer, or mock the services? My proposal was different from mocking the services. I thought that if I set up a local server just for the sake of running tests, and recorded the responses upfront, I could have pretty high-level, yet still not integration-level, tests which could help me seal functionality during the transition to a new architecture. It’s a bit similar to the Pact mock, but still, do I really need a library for something like this? A standard Fiddler + HttpListener can work just fine as well. Yes, validating the dependency against the recorded dialogue is hard to impossible, but what one gets is an easy way of testing the app without placing mocks all around the tests.
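
Here is a minimal sketch of such a replay server built on HttpListener (the recorded dictionary is made up; in practice it would be filled from a Fiddler capture):

using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public class ReplayServer
{
    // Prerecorded request-path -> response-body pairs, captured e.g. with Fiddler.
    private readonly Dictionary<string, string> recorded;
    private readonly HttpListener listener = new HttpListener();

    public ReplayServer(string prefix, Dictionary<string, string> recorded)
    {
        this.recorded = recorded;
        listener.Prefixes.Add(prefix); // e.g. "http://localhost:8080/"
    }

    public void Start()
    {
        listener.Start();
        listener.BeginGetContext(OnRequest, null);
    }

    private void OnRequest(IAsyncResult ar)
    {
        var context = listener.EndGetContext(ar);
        listener.BeginGetContext(OnRequest, null); // keep accepting requests

        string body;
        if (recorded.TryGetValue(context.Request.Url.AbsolutePath, out body))
        {
            var bytes = Encoding.UTF8.GetBytes(body);
            context.Response.ContentType = "text/xml"; // SOAP talks XML
            context.Response.OutputStream.Write(bytes, 0, bytes.Length);
        }
        else
        {
            context.Response.StatusCode = 404; // no recording for this path
        }
        context.Response.Close();
    }
}

Point the legacy app’s service endpoint at the local prefix, and the tests exercise the real client code against canned SOAP responses.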

Even if you follow the Pact path to the end, it’s worth considering that instead of dealing with mock setup, one can easily record the conversation (just run your app) and reuse it later on. It may not be a unit test, but it can be the best test one can write given the condition of communicating with other services.

Size of Nullable

What’s the size of a nullable structure in C#? Does it depend on anything? How are its fields laid out in memory? Let’s consider the following cases: bool?, int?, long?, Guid?. One way to get a structure’s size is to use Marshal.SizeOf(Type). Unfortunately, this method checks whether the passed type is generic, and every nullable is generic, hence we cannot use it. If you take a look at the implementation of this method, there is a call to a private method of the Marshal class named SizeOfHelper. This method does not perform the check and can easily be used to calculate the size of a struct.
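
A sketch of calling it via reflection follows; note that SizeOfHelper is a private implementation detail of the .NET Framework, so its presence and signature may vary between runtime versions:

using System;
using System.Reflection;
using System.Runtime.InteropServices;

public static class NullableSize
{
    // SizeOfHelper(Type, bool) is internal to Marshal; relying on it is
    // an assumption that holds for the .NET Framework of this era.
    private static readonly MethodInfo SizeOfHelper = typeof(Marshal)
        .GetMethod("SizeOfHelper", BindingFlags.NonPublic | BindingFlags.Static);

    public static int Of(Type type)
    {
        return (int)SizeOfHelper.Invoke(null, new object[] { type, true });
    }

    public static void Main()
    {
        Console.WriteLine("bool? {0}", Of(typeof(bool?)));
        Console.WriteLine("int?  {0}", Of(typeof(int?)));
        Console.WriteLine("long? {0}", Of(typeof(long?)));
        Console.WriteLine("Guid? {0}", Of(typeof(Guid?)));
    }
}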

Nullable<T> consists of two fields. The first is hasValue, which answers the question whether the nullable has a value assigned. The other is the value itself. If a nullable has no value assigned, the value field holds default(T). How are these members aligned in memory, and does it depend on anything? What are their offsets from the start of the structure?
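
For reference, a simplified sketch of the type’s shape (the real System.Nullable<T> adds constructors, properties and conversion operators on top of these two fields):

// Simplified; mirrors the two instance fields of System.Nullable<T>.
public struct Nullable<T> where T : struct
{
    private bool hasValue; // true when a value is assigned
    internal T value;      // default(T) when hasValue is false
}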

To answer the two questions above (size and alignment), take a look at the following table:

Type    Size (bytes)   hasValue offset   value offset
bool?   8              0                 4
int?    8              0                 4
long?   16             0                 8
Guid?   20             0                 4

The first two, bool? and int?, are easy to explain. In this (marshaled) layout a bool is as wide as an int and takes 4 bytes, so the offset of the value is 4.

What about long? Why does it take 16 bytes, not 12? Why does the value start at offset 8, not 4? That’s because of struct alignment, which the CLR performs to pack structs nicely: an 8-byte long must start on its natural 8-byte boundary, the width of 64-bit CPU registers, so the struct is padded to 16 bytes.

The final example, Guid?, seems to break the rule set by long?. Or does it? Guid is composed of fields no wider than 4 bytes (an int, two shorts and eight single bytes), so its required alignment is only 4, not 8. The value can therefore start right at offset 4, and the whole struct needs only 20 bytes; no padding to a multiple of 8 is required.

If you want to do some checks on your own, you may use the gist I created.