Merge request policy

If you’re in a company with multiple teams in its IT department, you may have considered using GitHub Enterprise or its free alternative, GitLab. Besides providing a Git hosting experience, both can support you with one aspect previously unknown to your organization: pull/merge requests.

A pull request in GitHub and a merge request in GitLab provide the same functionality. They let other users of the service propose changes in a way that makes them easy to apply to the original repository. The advantages of using this kind of change management over sending diffs by email, or other ways of applying fixes to other codebases, are:

  1. Traceability – a request has its own URL, is linkable and is public
  2. Permission granularity – everyone can be given permission to read and fork a repo, but not to write to it. This lets the owners stay owners and deal with issued requests rather than with a codebase broken by others’ mistakes (for particular flows, read here)
  3. True ownership – the owner is released from digging through the dirt and moves into the role of an acceptor
  4. No more emails – email requests are no longer sent. The basic way of asking for a given change is… doing the change
  5. Learning over abusing – the requester is given an opportunity to work with another codebase. It can cost a bit more, that’s for sure, but it spreads practices and knowledge. The most important thing for the owner is to help create pull requests, not to apply them to repositories.

This kind of change can be painful for lazy people who want to delegate, or rather push away, all their work with no insight into the requested changes. It can be a big learning opportunity as well. Adopting OSS rules like a merge request policy in your day-to-day job can increase your awareness and make you a happier developer.

It’s time to issue some pull requests!

Multi-datacenter Cassandra cluster with a slow cross-DC connection

I’d like to discuss a particular failure scenario for a multi-datacenter Cassandra cluster.
The setup to reproduce it is the following (a code sketch follows the list):

  • Two Cassandra data centers
    • DC1: n nodes
    • DC2: m nodes
  • TestKeyspace
  • NetworkTopologyStrategy with replication factors:
    • DC1: n (each key on each node)
    • DC2: m (each key on each node)
  • Tables in TestKeyspace are created with default settings
  • hinted handoff enabled
  • read repair enabled
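A minimal sketch of that setup, using the DataStax C# driver (the Cassandra NuGet package). The contact point and the replication factors of 3 stand in for n and m; they are placeholders of mine, not values from the scenario:

    using Cassandra;

    class MultiDcSetup
    {
        static void Main()
        {
            var cluster = Cluster.Builder()
                .AddContactPoint("dc1-node-1") // hypothetical DC1 node address
                .Build();
            var session = cluster.Connect();

            // every key replicated to every node of both data centers
            session.Execute(@"
                CREATE KEYSPACE IF NOT EXISTS TestKeyspace
                WITH replication = {
                    'class': 'NetworkTopologyStrategy',
                    'DC1': 3,
                    'DC2': 3
                }");
        }
    }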

Writes and reads go to DC1. What can go wrong when the whole of DC2 goes down (or you get a network split)?
It turns out that read repair is defined not by one but by two per-table probabilities.
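To name them, here is the table spelled out with both options, reusing the session from the sketch above; the values shown mimic the defaults of the Cassandra versions of that time, which is an assumption of mine:

    // read_repair_chance         – chance of repairing across the whole cluster
    // dclocal_read_repair_chance – chance of repairing within the local DC only
    session.Execute(@"
        CREATE TABLE IF NOT EXISTS TestKeyspace.test_table (
            id int PRIMARY KEY,
            value text)
        WITH read_repair_chance = 0.1
        AND dclocal_read_repair_chance = 0.0");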

What’s the difference between them? The first one is the probability of a read repair across the whole cluster, the second of a read repair within the same DC. If you have an occasionally failing, or simply slow, connection, using the first one can bring you trouble. If you plan a multi-DC cluster and can live with periodic runs of nodetool repair instead of some of your LOCAL_QUORUM reads failing from time to time, switch to the DC-local read repair and disable the global one.
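Applied to the hypothetical table above, that advice boils down to flipping the two values:

    // disable the cluster-wide read repair, keep the DC-local one
    session.Execute(@"
        ALTER TABLE TestKeyspace.test_table
        WITH read_repair_chance = 0.0
        AND dclocal_read_repair_chance = 0.1");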

For curious readers: the class responsible for performing reads, with read repairs as well, is AbstractReadExecutor.

Bounded context in deployment tools

Recently I’ve been circling around the topic of deployment. Imagine a situation: you’re given a set of scripts, or script-like objects, used to deploy a set of applications. These so-called scripts range from the very basic, like create-directory, to complex ones rooted in the organization’s infrastructure and tooling. Additionally, some of them are defined as groups of other scripts. For example, installing an application service starts with the creation of a directory, then binaries are copied and finally the service is registered.
The scripts are not covered with tests, but they are hardened by years of successful usage. One could consider rewriting them completely and providing a full-blown set of tests. This may be hard, as you’d throw away all the knowledge hidden in the scripts. Remember that there were big companies that are no longer here; take Netscape as an example.

I’ve spent quite a while considering Chef, PowerShell, Puppet, even MSBuild with its tasks. What helped me make up my mind was the famous Blue Book. Why not consider the set of scripts a bounded context? Just take a look at the picture provided by Martin Fowler here. Wrap all the old scripts in a context bubble providing a mapping, mostly an intellectual one, for all the terms that need to be known outside. It’s more than wrapping all the old scripts with an interface: there is a need for a real mapping, with a glossary, to let people who do not want to leave the bubble just yet exist in it for a while. Which tool will the new bounded context use to communicate with the old one? That’s an open question. I’ll try to choose the best tool with good enough test support. The only real requirement is the ability to provide the mapping to the old deployment tools’ context. A rough sketch of the idea is given below.
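This is a sketch only; every name in it (IDeploymentStep, LegacyScriptStep, the script file names) is made up for illustration, not taken from any real tooling:

    using System.Diagnostics;

    // The language of the new bounded context: one term, one operation.
    public interface IDeploymentStep
    {
        void Execute(string applicationName);
    }

    // The anti-corruption layer: a new-context term mapped onto an old,
    // battle-hardened script, which itself stays untouched.
    public sealed class LegacyScriptStep : IDeploymentStep
    {
        private readonly string scriptPath;

        public LegacyScriptStep(string scriptPath)
        {
            this.scriptPath = scriptPath;
        }

        public void Execute(string applicationName)
        {
            using (var process = Process.Start(scriptPath, applicationName))
            {
                process.WaitForExit();
            }
        }
    }

    // A composite mirroring the example above: installing an application
    // service = create a directory, copy the binaries, register the service.
    // Usage: new InstallServiceStep().Execute("MyApp");
    public sealed class InstallServiceStep : IDeploymentStep
    {
        private readonly IDeploymentStep[] steps =
        {
            new LegacyScriptStep("create-directory.cmd"),
            new LegacyScriptStep("copy-binaries.cmd"),
            new LegacyScriptStep("register-service.cmd")
        };

        public void Execute(string applicationName)
        {
            foreach (var step in steps)
            {
                step.Execute(applicationName);
            }
        }
    }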

If you want to learn more, just take a look at the great Eric Evans videos under this link: http://dddcommunity.org/?s=four+strategies

AzureDirectory – code review

The AzureDirectory project provides an Azure implementation of the Lucene.NET abstraction – Directory. It aims at providing the ability to store a Lucene index in the Azure storage services. The code can be found here: AzureDirectory. The packages can be found on NuGet here; they aren’t marked as prereleases.
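To give a feeling of how such a Directory is meant to be used, here is a minimal sketch of plugging it into the usual Lucene.NET indexing flow. The AzureDirectory constructor shown (a storage account plus a catalog name) is my assumption; check the project for the actual signature:

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Microsoft.WindowsAzure.Storage;

    class IndexingSample
    {
        static void Main()
        {
            // local storage emulator; a real app would parse a connection string
            var account = CloudStorageAccount.DevelopmentStorageAccount;

            // assumed ctor: storage account + the name of the catalog (container)
            // that holds the index files
            var azureDirectory = new AzureDirectory(account, "TestCatalog");

            using (var writer = new IndexWriter(
                azureDirectory,
                new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30),
                true, // create a fresh index
                IndexWriter.MaxFieldLength.UNLIMITED))
            {
                var doc = new Document();
                doc.Add(new Field("id", "1", Field.Store.YES, Field.Index.NOT_ANALYZED));
                writer.AddDocument(doc);

                // on commit the Directory implementation is expected to push
                // the segment files to blob storage
                writer.Commit();
            }
        }
    }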

The solution consists of two projects:

  1. AzureDirectory
  2. TestApp

The first provides the implementation of the Lucene abstractions. There are only a few classes, just the ones needed for the feature (implementations of the Lucene abstractions). Additionally, some util classes are introduced.
The code is structured with regions, which I personally dislike. Region names like CTORS, internal methods or DIRECTORY METHODS show the way the code is molded, with no classes extracted to hold the functionality wrapped in a region. The lengthy methods and constructors are another disadvantage of this code base.
The spacing, the using directives, and fields that could be readonly are all messy. Something that could be cleaned up with a single ReSharper code cleanup is left for the reader to deal with.
You can find usages of the obsolete Lucene API in there (like IndexInput.Close in disposal), as well as informative comments like:

// sometimes we get access denied on the 2nd stream…but not always. I haven’t tracked it down yet
// but this covers our tail until I do

That’s good and informative for the author, but it leaves the project in an immature state.

The second project is not a test project but a sample app using the library. There are no tests at all.

Summing up, after consideration I wouldn’t use this implementation for my production Azure app. The code is badly composed, has no tests, and is left with comments pointing at situations where the authors are aware of a yet-unsolved problem.

The missing 20% of configuration with Octopus Deploy

Recently I’ve been evaluating Octopus Deploy. I wanted to learn more about a platform which is quite unique in the .NET environment. After getting through the features, I wrote the following tweet:

So what about the missing 20%?
The missing part is the configuration considered as an artifact. In many projects, changing your code makes you change your configuration. Adding some values is OK; much more important are operations like swapping whole sections, etc. These kinds of changes are frequently connected with a given code change, hence they are made in the very same commit and pushed to the VCS. Your configuration is then no longer a simple table with a selection of dimensions telling where a given value should be applied. The additional, skipped dimension is time, measured in the only unit a VCS is aware of – commits. It’s like saying “from now on I need these values”.
What you can do, for sure, is use Octopus variables to point to the right file for a given environment. The thing becomes a bit more tricky when, for instance, the production config should not leak into the development repo, etc.
This leads to the fact that your configuration is an artifact. In many cases it can easily be replaced by a table with an ‘environment’ dimension, but still, it is an artifact, now, unfortunately, not stored in your repository.
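A small sketch of what keeping that artifact in the repository could look like; the file layout and names are made up, and the environment name is assumed to be handed in by the deployment tool (in Octopus, through a variable):

    using System;
    using System.IO;

    class ApplyConfig
    {
        static void Main(string[] args)
        {
            // e.g. "Production"; supplied by the deployment tool at deploy time
            var environment = args.Length > 0 ? args[0] : "Development";

            // one config file per environment, committed together with the code,
            // so the 'time' dimension is versioned in the unit a VCS understands:
            // commits
            var source = Path.Combine("config", "app." + environment + ".config");
            File.Copy(source, "app.config", overwrite: true);
            Console.WriteLine("Applied " + source);
        }
    }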

The reasoning above is not meant to lead you astray. Octopus is a great deployment tool. As with every tool, use it wisely.

Agile team analogy

Yesterday I had a discussion about the ability to introduce Scrum-based development in a big organization. A question was raised whether it’s possible to construct a cross-functional team and deliver a shippable product with no busy waits, no “I need this from that guy” syndrome.
I thought about each team member as a part of the team, consuming some input (mails, requests, discussions) and producing some output. Nowadays it’s much more feasible to make this asynchronous: to remove all the meetings and long-running discussions, to review your colleagues’ code in a post-commit manner. With that in mind, you can look at each person in a producer-consumer manner, and the graph of dependencies becomes the bus shifting the artifacts you create. What it takes to create a good team is to take all the people who may be waiting for one another’s output and collocate them in one team. The obvious producers (with much less input) would be the product owners; their output, in turn, would surely be the feed for the testers. If you have some kind of core system in your company, a person working with it would be a perfect fit as well. Just try to collocate all the people with transitive producer-consumer dependencies, choosing the most critical and time-consuming ones, and make a bubble with a minimum of input, possibly asynchronous. What about the output? That would be, according to your definition of done, your shippable product. Nothing more, nothing less. That’s what all the Agile is about, isn’t it?

Your local user groups

Recently I’ve been deeply involved in the Warsaw .NET User Group. What makes you deeply involved, you ask? I’d say that solving the group’s current problems is the answer that fits best. We covered a few important points, like getting some sponsorship, being given a few tickets for the Build Stuff conference (thanks!) and running snack sessions (short presentations, for those who want to start presenting). It looks like people are a bit more energized and active. That’s for sure the right direction for any user group.
I want to encourage all of you: just make a small move, do something for the community you’ve chosen “to be involved with”, for instance ask for a problem to be solved. It’s a win-win “by people, for people”, nothing more, nothing less.