The batch is dead, long live the smart batch

It lurks in the night. It consumes all the energy. It lasts much too long. If you've experienced it, you know it's unforgettable. If you're lucky and never met it, you've probably heard the stories from your friends. Yes, I'm talking about the batch job.

The old batch

This ultimate tool of terror has been haunting us for much too long. Statements like "let's wait till tomorrow" or "I think the job didn't run" were quite popular a few years back. Fortunately for us, with the new waves of reactive programming, serverless and good old-fashioned queues, it's becoming a thing of the past. We're in the happy position of being able to process events, messages and items as soon as they enter our system. Sometimes a temporary spike can be amortized by a queue. And it works. Until it doesn't.

When working on processing 2 billion events per day with Azure Functions, I deliberately started with the assumption of a 1-1 mapping, where one event was mapped to one message. This didn't go as planned. Processing 2 billion items can cost you a lot, even if you run the processing on-premises, where resources are frequently treated as a "free lunch". The solution was easy and required going back to the original meaning of the word batch: a group, a pack. It's the very same solution that can be seen in so many modern approaches. It was smart batching.

The smart batch

If you think about regular batching, it's unbounded. There's no limit on the size of the batch. It must be processed as a whole. The next day, another one will arrive. Smart batching is different. It's meant to group a few items into a pack, just to amortize the different costs of:

  1. storage (accessing store once)
  2. transport (sending one package rather than 10; I'm aware of Nagle's algorithm)
  3. serialization (serializing an array once)

To use it you need:

  1. a buffer (potentially reusable)
  2. a timer
  3. potentially, an external trigger

It works in the following way. The buffer is a concurrency-friendly structure that allows new items to be appended by, potentially, multiple writers. Once

  1. the buffer is full or
  2. the timer fires or
  3. the external trigger fires

all the items that are in the buffer will be sent to the other party. This ensures that the flushing has:

  1. a bounded size (the size of the buffer)
  2. a bounded time of waiting for ack (the timer’s timeout)

With this approach, actually used by so many libraries and frameworks, one can easily overcome many performance-related issues. It amortizes all the costs mentioned above, paying a slightly higher tax, but only once, not for every single item.
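
To make this concrete, here is a minimal sketch of such a batcher in C#. Treat it as an illustration only, assuming a single logical consumer; the names (SmartBatcher, FlushAsync, the send callback) are mine, not from any particular library.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class SmartBatcher<T> : IDisposable
{
    private readonly ConcurrentQueue<T> buffer = new ConcurrentQueue<T>();
    private readonly int maxBatchSize;
    private readonly Func<IReadOnlyList<T>, Task> send;
    private readonly Timer timer;

    public SmartBatcher(int maxBatchSize, TimeSpan maxWait, Func<IReadOnlyList<T>, Task> send)
    {
        this.maxBatchSize = maxBatchSize;
        this.send = send;
        // the timer bounds how long an item can wait in the buffer
        timer = new Timer(state => FlushAsync(), null, maxWait, maxWait);
    }

    public Task AddAsync(T item)
    {
        buffer.Enqueue(item);
        // flush eagerly once the buffer reaches its bound
        return buffer.Count >= maxBatchSize ? FlushAsync() : Task.CompletedTask;
    }

    // this doubles as the external trigger: callers may flush on demand
    public async Task FlushAsync()
    {
        var batch = new List<T>(maxBatchSize);
        while (batch.Count < maxBatchSize && buffer.TryDequeue(out var item))
            batch.Add(item);

        if (batch.Count > 0)
            await send(batch); // one storage/transport/serialization cost per pack
    }

    public void Dispose() => timer.Dispose();
}

A production version would also guard against concurrent flushes and handle failures of send, but all three triggers are here: the full buffer, the timer and the manual FlushAsync call.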

Outsmart your costs

The smart batch pattern enables you to cut a lot of costs. For cloud deployments, making fewer IO requests means less money spent on storage. It works miracles for throughput as well; no wonder some cloud vendors allow you to fetch messages in batches. It's just better.

Next time you think about a batch, please do it, but in a smart way.

A summary of 2017

The year 2017 is coming to an end. As always, it's a good time for a summary.

Talks

When it comes to big-caliber talks, this year was richer than the previous one. Out of 13 talks (list), as many as 9 were conference talks! Two of them were particularly important: .NET Developers Days and DevConf. At both conferences I presented Performance That Pays Off (slightly modified), and at both I spoke in English. These were my first talks of this kind. Next year I hope for more 🙂

Books

After reading 20 non-fiction books from outside the industry, I definitely notice a difference in how I perceive many matters. 20 books may seem like nothing, but carefully selected reading gives you a lot. A plan for the next year? Definitely. This level has to be maintained.

Blog

More posts appeared on the blog. Some were series, such as the very well received Top Domain Model or the materials related to Service Fabric. Some were traditional posts.

Trainings

This was the first year in which I launched a few paid trainings on a topic that has appeared many times on this blog: Event Sourcing. Next year I plan to take the open trainings beyond Warszawa (e.g. to Wrocław). You can find the dates of the open trainings, as well as their descriptions, here.

Open Source

The beginning of 2017 meant the .NET driver for PostgreSQL and Marten. Thanks to my optimizations, the library, which lets you turn PostgreSQL into a document database with an Event Sourcing option, got significantly faster. More: here.

Unfortunately, the middle of the year brought two projects that coincided with a kind of burnout. Both were related to Service Fabric; the more recent one is Lockstitch.

Azure Functions

The blog series that gained the most shares and likes was processing 2 billion items per day using Azure Functions. I will continue this topic next year in a few forms:

  1. an Azure Functions training
  2. talks (more on that soon)
  3. X – much more on that soon

Summary

It was a great year. The next one, planned slowly and deliberately, looks even better. Happy New Year!

Different forms of drag

Have you heard about this new library called ABC? If not, you don't know what you're missing! It enables your app to do all these things! I'll send you the links to the tutorial so that you can become a fan as well. Have I tested it thoroughly? Yeah, I clicked through the demo. And got it working on my dev machine. What? What do you mean by handling moderate or high traffic? I don't get it. I'm telling you, I was able to spin up an app within a few minutes! It was so easy!

Drag (in physics) is a very interesting phenomenon. It's the resistance of a fluid, and it behaves much differently from regular, dry friction. Instead of being a constant force, it grows: the faster an object moves, the stronger the drag. Let's take a look at what kinds of drag we can find in the modern IT world.

Performance drag

The library you chose works on your dev machine. Will it work for 10 concurrent users? Will it work for another 100 or 1000? Or, let me rephrase the question: how much RAM and CPU will it consume? 10% of your application's resources, or maybe 50%? A simple choice of a library is not that simple at all. Sometimes your business has the money to just spin up 10 more VMs in a cloud, or to pay 10x more because you prefer JSON over everything; sometimes it does not. Choose wisely and use resources properly.

Technical drag

You've probably heard about technical debt. With every shortcut you take, just to deliver this week rather than the next, there's a non-zero chance of introducing parts of your solution that aren't a perfect fit. Moreover, in a month or two, they can slow you down, because the postponed issues will need to be solved eventually. Recently it was proposed to use the word drag instead of debt. You can keep moving with a debt, but moving with a drag will surely make you slower.

Environment drag

So you chose your library wisely. You know it will consume a specific amount of resources. But you also know that it has a configuration parameter that allows you to cut down on data processing, RAM usage or data storage costs. One example that immediately comes to mind is logging libraries. You can use the logging level as a threshold for storing data or not. How many times have these levels been changed only to store less data on those poor production servers? When this happens, the scenario for a failure is simple:

  1. cut down the data
  2. an error happens
  3. no traces beside the final catch clause
  4. changing the logging level for one hour
  5. begging users to trust us again and click one more time

I've heard this and similar stories far too many times.
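
For illustration, here is what that threshold looks like in code, using Microsoft.Extensions.Logging as an example (any logging library with levels behaves the same way; the messages are made up):

using Microsoft.Extensions.Logging;

class LoggingDrag
{
    static void Main()
    {
        var factory = LoggerFactory.Create(builder =>
            builder
                .SetMinimumLevel(LogLevel.Warning) // raised to save storage on production
                .AddConsole());

        var log = factory.CreateLogger("app");

        log.LogInformation("lookup took {Ms} ms", 42); // silently dropped by the threshold
        log.LogError("request failed");                // the only trace that survives
    }
}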

Summary

There are different forms of drag. None of them is pleasant. When choosing approaches, libraries and tools, choose wisely. Don't let them drag you down.

Azure Functions: processing 2 billion items per day (4)

Here comes the last but not least entry in the series in which I'm describing a few patterns that enabled me to process 2 billion items per day using Azure Functions. The goal was to do it in a cost-aware and cost-wise manner, enabling fast processing with a small amount of money spent on it.

  1. part 1
  2. part 2
  3. part 3
  4. part 4

The first part was all about batching on the sender side. The second part was all about batching on the receiver side. The third provided a way to use Azure services without paying for function executions. The last part is about costs and money.

How much do I get for free?

When running under the Consumption Plan, you get something for free. What you get is the following:

  • 400k GB-s – a GB-s means running with 1 GB of memory consumed for 1 s
  • 1 million executions

The first item is measured with 1 ms accuracy. Unfortunately for the runner,

The minimum execution time and memory for a single function execution is 100 ms and 128 MB respectively.

This means that even if your function could run in under 100 ms, you'd pay for it as if it took 100 ms. Fortunately for me, with all the batching techniques from the past entries, that's not the case. I was able to run the function for much longer, removing the tax of the minimal run time.

Now the second measure. On average there are over 2 million seconds in a month. This means that as long as your function executes less often than roughly once every 2 seconds, the free million executions should be enough.
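
As a back-of-the-envelope check, here is the arithmetic in a short C# sketch. The ~15k executions per day comes from part 2 of this series; the 5 s average run time is my assumption for illustration, not a measurement:

using System;

class ConsumptionPlanSketch
{
    static void Main()
    {
        // monthly free grants under the Consumption Plan
        const double freeGbSeconds  = 400_000;
        const int    freeExecutions = 1_000_000;

        // assumed workload: ~15k executions per day, billed at the 128 MB
        // minimum, with an assumed average run time of 5 s per execution
        const int    executionsPerDay = 15_000;
        const double memoryGb         = 128.0 / 1024;
        const double avgSeconds       = 5;

        double monthlyExecutions = executionsPerDay * 30;                     // 450,000
        double monthlyGbSeconds  = monthlyExecutions * memoryGb * avgSeconds; // 281,250

        Console.WriteLine($"executions: {monthlyExecutions:N0} of {freeExecutions:N0} free");
        Console.WriteLine($"GB-s:       {monthlyGbSeconds:N0} of {freeGbSeconds:N0} free");
    }
}

Under these assumptions both measures land below the free grants, which is consistent with the tiny bill below.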

How much did I pay?

Not much at all. Below you can find a table from Cost Management. The table includes the writer used for the synthetic tests, so the cost of the processing itself should be even lower.

[screenshot: the Cost Management price table]

This would mean that I was able to process 60 billion items per month, using my optimized approach, for $3.

Is it a free lunch?

Nope, it's not. There's no such thing as a free lunch. You'd need to add all the ingredients, like Azure Storage Account operations (queues, tables, blobs) and a few more (CosmosDB, anyone?). Still, you must admit that the price of the computation itself is unbelievably low.

Summary

In this series we saw that by using cloud-native approaches like SAS tokens, and by treating functions a bit differently (as batch computation), we were able to run under the Consumption Plan and process loads of items. As always, entering a new environment and embracing its rules brought a lot of goodness. Next time, when writing "just a function that will be processed a few million times per month", we need to think and think again. We may pay much less if our approach truly embraces the new backendless reality of Azure Functions.

Azure Functions: processing 2 billion items per day (3)

Here comes the third entry in the series in which I'm describing a few patterns that enabled me to process 2 billion items per day using Azure Functions. The goal was to do it in a cost-aware and cost-wise manner, enabling fast processing with a small amount of money spent on it.

  1. part 1
  2. part 2
  3. part 3

The first part was all about batching on the sender side. The second part was all about batching on the receiver side. In this part we'll move to truly backendless processing.

No backend no cry

I truly admire how solutions are migrated to the serverless world. The most interesting thing is observing a 1-1 parity between the components that were there before and the functions that are created now, a.k.a. "Just make it a func!". If you see this one-to-one mapping, there's a chance that you're migrating code without changing the approach at all. Let me give you an example.

Imagine that you need to accept users' requests. These requests are extremely unlikely to fail (there are ways to model services towards that), and if they do, there's a natural compensating action. You could think that using a queue to store them is a perfect way of accepting a request that can be processed later on. OK, but we need a component that will accept these requests. We need something that will write to one of the Azure Storage Queues, right? Wrong.

Tokenzzzzz

Fortunately for FaaS, Azure Storage Queues have a very interesting capability. They can be accessed directly with a limited scope of rights. This functionality is provided by SAS tokens, which can grant access to Add, Update and/or Process messages, and more. What you can do is give somebody access to only Add messages, and you can limit this access to 5 minutes (and revalidate whether the user may still do it after this period of time). The options are limitless.

If we can limit access to a queue to just adding messages, why would we need a function to accept them? Yes, we might need a function to issue a few tokens at the beginning, but there's no need for consuming a regular request and moving it to a queue. No need at all. Your user can use the storage service directly, with no code for putting the data in there.

To put it even more bluntly: you don't need a user to call a func to put a message in a queue. A user can just put a message.
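
Here is a minimal sketch of issuing such a token with the classic Azure Storage SDK (the queue name and the 5-minute window are illustrative):

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

static class TokenIssuer
{
    // returns a URI that allows its holder only to Add messages for 5 minutes
    public static string IssueAddOnlyUri(string connectionString)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        var queue = account.CreateCloudQueueClient().GetQueueReference("requests");

        var policy = new SharedAccessQueuePolicy
        {
            Permissions = SharedAccessQueuePermissions.Add, // no Read, no Process
            SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(5)
        };

        return queue.Uri + queue.GetSharedAccessSignature(policy);
    }
}

The function issuing the tokens becomes the only backend piece left; everything else talks to the storage service directly.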

Cloud native

This moves us towards being cloud native: towards fully embracing different services and understanding that using them no longer requires writing code for them. Your functions can easily move to a higher level, assigning permissions and returning tokens, shifting from a regular app that "just was migrated to functions" to a set of "cloud native functions", from "using services" to "orchestrating their usage".

Where’s the cherry

We’ve got the cake. We need a cherry. In the last part, I’ll briefly describe costs and numbers. See you soon.

Azure Functions: processing 2 billion items per day (2)

This is the second blog post in the series in which I'm describing a few patterns that enabled me to process 2 billion items per day using Azure Functions. The goal was to do it in a cost-aware and cost-wise manner, enabling fast processing with a small amount of money spent on it.

  1. part 1
  2. part 2

In the first part you saw that batching can greatly lower the number of messages you need to send, and that it can actually broaden the selection of tools you can use to deliver the same value. My choice was to stick to good, old-fashioned Azure Storage Queues, as with the new estimated number of messages I could simply use a single queue.

Serverless side

The initial code responsible for dispatching messages was simple. It was a single function using a QueueTrigger, dispatching messages as fast as they arrived. Because it ran under the Consumption Plan, all the scaling was done automatically. I could see a flood of log entries informing me about functions being properly executed.

The test ran for a week. I checked the amount of money being spent in the new Cost Management tool and refactored the code a little bit. I was paying too much for doing lookup after lookup and spending too much time finding the data needed for message processing. The new version was a bit faster and a bit cheaper. But it made me think.

If a single Table Storage operation takes ~30-40 ms, and I need to do a few for a single function run, what am I paying for? Also, I knew that the data were coupled temporally. In other words, if one entry from a table was used for this message, it was highly likely to be used again within a few seconds. Also, I did not care about latency. There was already a queue in front of it all. I was fine with the result being presented within 1 s or 5 s. I asked myself: how can I use all these constraints in my favor?

Processing batches in batches

The result of my searches was as simple as this: why not process messages that already contain batched entries in batches as well? I could use a TimerTrigger to have this function run every 5-10 s and grab all the messages using the batched GetMessages operation of Azure Storage Queues. Once they were fetched, I could either prefetch all the required data using parallel async operations with Task.WhenAll or use a local cache for the execution.

Any side effects of dispatching messages on my own? Handling poison messages properly and doing some work that was internally handled by the QueueTrigger.

The outcome? A single function running every x seconds, draining the queue till it’s empty and dispatching loads of messages.
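
A minimal sketch of that drain, in the classic WebJobs-attribute style, could look as follows (the queue name, the schedule and ProcessAsync are illustrative placeholders):

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class Drainer
{
    [FunctionName("DrainQueue")]
    public static async Task Run([TimerTrigger("*/10 * * * * *")] TimerInfo timer)
    {
        var queue = CloudStorageAccount
            .Parse(Environment.GetEnvironmentVariable("AzureWebJobsStorage"))
            .CreateCloudQueueClient()
            .GetQueueReference("items");

        while (true)
        {
            // a single storage operation fetches up to 32 messages
            var messages = (await queue.GetMessagesAsync(32)).ToList();
            if (messages.Count == 0)
                break; // the queue is drained

            // lookups and other costs are amortized over the whole batch
            await Task.WhenAll(messages.Select(ProcessAsync));

            // dispatching on your own means deleting (and poison handling) on your own
            foreach (var message in messages)
                await queue.DeleteMessageAsync(message);
        }
    }

    static Task ProcessAsync(CloudQueueMessage message) => Task.CompletedTask; // placeholder
}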

Was it worth it? The total time spent previously by functions could have been estimated as

total_time = number_of_messages * single_message_processing_time

where single_message_processing_time would include all the lookups.
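
With the batched dispatch, the estimate roughly changes (my notation, in the same style) to

total_time = number_of_executions * (amortized_lookup_time + messages_per_execution * processing_time)

where the amortized lookups are paid once per execution instead of once per message.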

With the updated approach, the number of executions was stable (~15k per day) with varying processing times, depending on the number of messages in the queue. The most important factor was the amortized cost of lookups and storage operations. The final answer was: yes, it was definitely worth it, as it lowered the price greatly.

Moving on

In this part we saw that the batching idea leaked to the serverless side, beautifully lowering the time and the money spent on function executions. In the next part we'll see the power of backendless.

Azure Functions: processing 2 billion items per day (1)

In this series I'll describe a few patterns that enabled me to process 2 billion items per day using Azure Functions. Yes, 2 billion items per day. The aim of this trial was not to check whether you can do it with Azure Functions. You can do it easily. The goal was to do it in a cost-aware and cost-wise manner, enabling fast processing with a small amount of money spent on it.

Initial phase

The starting point was simple: have a single queue (in my case an Azure Storage Queue), simply enqueue items to it, and run the processing on the Consumption Plan. This looked pretty nice. If you ever try Azure Functions, you'll see their ability to scale up instances when needed, just to get your workload processed in a timely manner.

I must admit that I skipped that part. When you calculate the number of operations that a single queue can handle, it won't be enough to cope with 2 billion items per day; the arithmetic below shows why. Yes, you can scale out to multiple queues or use a different kind of queue. That was not the point of my experiment though.
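
A rough sanity check (the ~2,000 messages/s figure is the documented scalability target of a single Azure Storage Queue; treat all numbers as approximations):

2,000,000,000 items / 86,400 s ≈ 23,000 items/s on average

That is an order of magnitude above what a single queue can take. Batching by a factor of 1000, as described below, brings it down to ~23 messages/s, comfortably within the target of a single queue.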

It comes in batches

The important part that I intentionally didn't mention was the fact that the number of item producers was limited. Also, they were able to batch items and flush them once in a while. With this assumption I was able to use a dense serialization protocol (a big no-no for JSON) and fill every single message being sent with hundreds, sometimes thousands, of items to get them processed.

In my case this lowered the number of messages greatly, by a factor of 1000, leaving the whole thing working as it was supposed to. Yes, the receiving part became a bit different, as it was required to properly deserialize the densely packed payload.
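
As an illustration of the idea (not the actual protocol I used), packing a whole batch of items into a single queue message could look like this:

using System.Collections.Generic;
using System.IO;

struct Item
{
    public long Id;
    public int Value;
}

static class Packer
{
    // one payload per batch; a single Azure Storage Queue message can hold
    // up to 64 KB, so hundreds of such items fit easily
    public static byte[] Pack(IReadOnlyList<Item> items)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(items.Count);
            foreach (var item in items)
            {
                writer.Write(item.Id);    // 8 bytes, no property names, no quotes
                writer.Write(item.Value); // 4 bytes
            }
            writer.Flush();
            return stream.ToArray();
        }
    }
}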

You may ask: why not Event Hubs? Being able to pack the data on my own, being given the possibility of a delayed write, and comparing prices at the scale I'm talking about, Azure Storage Queues with a properly selected serializer still won in my calculations.

Cheating, a.k.a. seeing opportunities

This was the first opportunity I used to make the processing faster and cheaper. We saw that using a batch (smart batching in this case) greatly lowered the number of moving pieces while still delivering the same value. In the following entry, we'll move a bit deeper into the solution I built.