Cloudy cost awareness

TL;DR

Our industry used to be forgiving, very forgiving. You could skip an index, let a query run for a minute, and only some users of your app would be disappointed. If you were the only one on the market or delivered banking systems, that was just fine, as you’d lose no clients because of it. The public cloud changes this, and if you can’t embrace it, you will pay. You will pay a lot.

Pay as you crawl

If you issue a query scanning a table in Azure Table Storage, every entity you access will be counted as a storage transaction. Run millions of them and your bill will be increased. Maybe not that much, but it will.

If you deploy a set of services as Azure Cloud Services, each of them consuming just 100MB of memory, your VMs will be underutilized. You’ll pay for memory you don’t use and for CPU that just sits idle in the rack that hosts your VM.

Design is money

Before the public cloud, all these inefficiencies could be more or less tolerated, and they were not that easy to spot. Nowadays, with a public cloud, it’s the provider, the host, that will notice them and charge you for them. If you don’t design your systems with awareness of the environment, you will pay more.

Mitigations

This is not a black-or-white situation. It never is. You’ll probably be able to dockerize some parts of your app and host them inside a Service Fabric cluster. You’ll probably be able to use CosmosDB and its autoindexing feature to fix the performance of lookups that are slow in your Azure Storage Tables. There are many ways to mitigate these effects, but still, I consider a good, appropriate design the most valuable tool for making your systems not only performant and effective but, eventually, cheap.

Summary

Don’t throw your app against the wall of clouds and check if it sticks. Design it properly. Otherwise, it may stick in a very painful and cost-ineffective way.

Why you should, and eventually will, invest your time in learning about the public cloud

TL;DR

Within 2-5 years the majority of applications will be moved to the public cloud. You’d better be prepared for it.

Economies of scale

You might have heard that economies of scale do not work for software. Unfortunately, this is not the case for the public cloud sector. It’s cheaper, per unit, to buy 1,000,000 processors than to buy one. It’s cheaper, per unit, to buy 1,000,000 disks than to buy one. It’s better to resell them as a service to the end customer. And that’s what public cloud vendors do.

Average app

The majority of applications do not require fancy processing or a 1ms service time. They require handling peaks, being mostly available and costing nothing when nobody uses them. I’d say that within 2-5 years we will see the majority of them moving to the cloud. If there is a margin where a service proves its value but costs more to run than it would in the cloud, then eventually it will be migrated, or it will die with a big IT department running through the datacenter trying to optimize the costs and make ends meet.

Pure execution

Pure execution has arrived and it’s called Azure Functions (or Lambda if you use the other cloud :P). You pay for memory multiplied by execution time. This means that when there’s nothing to work on, you’ll pay nothing (or almost nothing, depending on the triggering mechanism). This is the moment when you pay only for your application performing actions. If an app user can pay more than the cost of the execution, you’ll be profitable. If not, maybe it’s about time to rethink your business.
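The arithmetic behind that paragraph can be sketched in a few lines. This is only an illustration, assuming a billing model of memory multiplied by execution time plus a per-execution charge; the function names and all prices passed in are hypothetical placeholders, not quotes from any provider's price list.

```python
def monthly_function_cost(executions, avg_duration_s, memory_gb,
                          price_per_gb_second, price_per_million_executions):
    """Estimate a monthly bill under a pay-per-execution model.

    All prices are caller-supplied assumptions; check the provider's
    current price list before relying on any number.
    """
    # Memory multiplied by execution time, summed over all executions.
    gb_seconds = executions * avg_duration_s * memory_gb
    compute_cost = gb_seconds * price_per_gb_second
    # Flat charge per execution, usually quoted per million.
    request_cost = executions / 1_000_000 * price_per_million_executions
    return compute_cost + request_cost


def is_profitable(revenue_per_execution, executions, monthly_cost):
    """True when what users pay exceeds what the executions cost."""
    return revenue_per_execution * executions > monthly_cost


if __name__ == "__main__":
    # Hypothetical workload and hypothetical prices.
    cost = monthly_function_cost(
        executions=1_000_000,
        avg_duration_s=1.0,
        memory_gb=0.5,
        price_per_gb_second=0.00002,
        price_per_million_executions=0.2,
    )
    print(f"estimated cost: {cost:.2f}")
    print("profitable:", is_profitable(0.0001, 1_000_000, cost))
```

With zero executions the estimate is zero, which is exactly the "pay nothing when nobody uses it" property described above.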

Performance matters

With this approach and detailed enough measurements you can actually see where you spend the most money. It’s no longer about profiling an app to see where it is slow or where it consumes the most memory. It’s about your business burning money in different places. Whether to fix one of them or not is a decision based on money: how much it burns versus how much the fix costs. With highly profitable businesses you could even flood your less performant parts with money. Just like that.

Environments and versioning

How do you version a function? Preserve the signature and rewrite it. Then deploy. Nothing less, nothing more. I can almost see a new wave of development approaches where Continuous Delivery looks like a grandpa trying to run with Usain Bolt. You can’t compete with this. It’s a brand new league.

Summary

If you are thinking about areas in which to invest your time, public cloud and functions are the way to go. In the majority of cases, this is going to be vital for surviving in a market where everyone competes on, and bets on, the lowest costs of infrastructure, IT and devops.

The cost of scan queries in Azure Table Storage

There are multiple articles describing the performance of Azure Table Storage. You have probably read Troy Hunt’s entry, Working with 154 million records on Azure Table Storage…. You may have invested your time in reading How to get most out of Windows Azure Tables as well. My question is: have you really considered the limitations of the queries, specifically scan queries, and how they can consume a major part of the Azure Performance Targets?

The PartitionKey and RowKey create the primary and the only index in ATS (Azure Table Storage). Depending on the query the following kinds can be distinguished:

  1. Point Queries, which are queries to retrieve a single entity by specifying a single PartitionKey and RowKey using equality as the predicate
  2. Row Range Queries, which are queries to get a set of entities defined with the same PartitionKey and a range of RowKeys
  3. Partition Range Queries, which are run with a range of PartitionKeys
  4. Full table scans, which have no predicate for PartitionKey

What are the costs and limitations of these queries? Unfortunately, every row that a query has to access while scanning will be counted as a table operation; there ain’t no such thing as a free lunch. This means that if you scan your entire table (the 4th scenario), you’ll be able to process no more than 20,000 entities per second. This limits the usefulness of scans over large data sets. If you have to model queries across different keys, then you may consider storing the same value twice: once under the natural Partition/RowKey pair and a second time under keys that match the other lookup, to create an inverted index. If you have to scan through the entire data set anyway, then using ATS is not the way to go, and you should consider some other way of modelling your data, like asynchronously copying the data to a blob, etc.
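Both points, the scan ceiling and the inverted index, can be illustrated with a small in-memory stand-in. This is a sketch under assumptions: `InMemoryTable`, `save_user` and the "user" / "user-by-email" partition names are hypothetical, and the dictionary only mimics a table keyed by PartitionKey and RowKey while counting how many entities each query touches.

```python
ENTITIES_PER_SECOND = 20_000  # the limit quoted in the paragraph above


def min_scan_seconds(entity_count, limit=ENTITIES_PER_SECOND):
    """Lower bound on the duration of a full scan under the entity limit."""
    return entity_count / limit


class InMemoryTable:
    """Stand-in for a table: a dict keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self.rows = {}
        self.accessed = 0  # entities touched, the thing the limit counts

    def insert(self, pk, rk, entity):
        self.rows[(pk, rk)] = entity

    def point_query(self, pk, rk):
        self.accessed += 1
        return self.rows.get((pk, rk))

    def scan(self, predicate):
        matches = []
        for entity in self.rows.values():
            self.accessed += 1  # every examined entity counts
            if predicate(entity):
                matches.append(entity)
        return matches


def save_user(table, user):
    # Natural key: look up by id.
    table.insert("user", user["id"], user)
    # Inverted index: the same value stored again, keyed by email.
    table.insert("user-by-email", user["email"], user)


if __name__ == "__main__":
    table = InMemoryTable()
    for i in range(1000):
        save_user(table, {"id": str(i), "email": f"u{i}@example.com"})

    table.point_query("user-by-email", "u500@example.com")
    print("entities touched via inverted index:", table.accessed)

    table.accessed = 0
    table.scan(lambda e: e["email"] == "u500@example.com")
    print("entities touched via full scan:", table.accessed)

    print("seconds to scan 154 million entities:", min_scan_seconds(154_000_000))
```

The price of the second copy is an extra write and extra storage, and since the two copies live in different partitions they cannot be written as one atomic operation, so the application has to tolerate or repair a missing index row.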

Lokad.CQRS Retrospective

In a recent post Rinat Abdullin provides a retrospective on the Lokad.CQRS framework, which was/is a starting point for many CQRS journeys. It’s worth mentioning that Rinat is the author of this library. The whole article may sound a bit harsh, but it provides a great retrospective from both the author’s and the user’s point of view.

I agree with the majority of the points in this post. The library provided abstractions that allowed changing the storage engine, but the directions taken were very limiting. The tooling for messages, the ddd console, was the thing at the beginning, but after spending a few days with it, I didn’t use it anyway. The library encouraged using one-way messaging all the way down, to separate every piece. Today, when CQRS mailing lists are filled with messages like ‘you don’t have to use queues all the time’ and CQRS people are much more aware of the ability to handle requests synchronously, it’d be easier to give some directions.

The author finishes with

So, Lokad.CQRS was a big mistake of mine. I’m really sorry if you were affected by it in a bad way.

Hopefully, this recollection of my mistakes either provided you with some insights or simply entertained.

which I totally disagree with! Lokad.CQRS was the tool that shaped the thinking of many people when nothing like it was available on the market. Personally, it helped me to build an event-driven project (you can see the presentation about this here) based to some extent on Lokad.CQRS, but with other abstractions and targeted at very good performance, not to mention living documentation built with Mono.Cecil.

Summary

Lokad.CQRS was a groundbreaking library that provided a bit too much tooling and abstracted too many things. I’m really glad if it helped you to learn about CQRS as it helped me. Without it, I wouldn’t have asked all the questions and wouldn’t have learned so much.

The provided retrospective is invaluable and brings a lot of insights. I wish you all get to make that kind of groundbreaking mistake someday.

It’s getting cloudy, isn’t it?

I’ve just finished the Azure workshop. In a two-day course you cannot cover everything, but as far as I know, the discussed topics show what Azure is all about. I won’t rewrite the plenty of blog entries and articles that already exist. What I want to write is that the cloud is the future. By the cloud I do not mean Azure; I mean the paradigm that allows you to scale like hell and to manage your site’s performance on a very organic level (“too much sugar – more insulin”). There is only one danger I can imagine, and it’s not the security of your data. Imagine a situation in which, having such a scaling environment, one improves the performance of an application by scaling out rather than by finding the bug that runs 100 additional queries in each request. I hope that programmers’ culture will evolve and will disallow such behavior.