Sewing Machine for Service Fabric


This is an introductory post for my new Open Source journey project, Sewing Machine, which aims to extend the capabilities of Azure Service Fabric services in various ways. It’s focused on speed and performance, but it also aims to deliver more capabilities built on top of the powerful Service Fabric foundations.

Services and actors

Service Fabric provides various capabilities. It can host processes and containers, enabling you to control resource usage on a much more granular level. You can host multiple processes on one VM if they don’t require that much CPU/memory. Besides this, SF provides the following models for writing your applications:

  • stateless services
  • stateful services
  • actor framework

These parts will be covered in the forthcoming posts.

Can we do more?

I’ve spent some time reading the official docs and decompiling the Fabric sources, and I found that it’s possible to build more on top of these strong foundations. Does more mean better? Possibly, yes. I think that in Sewing Machine I’ll be able to allocate much less, pin no memory and use better serialization. That’s the low level. On a higher level I hope for:

  • event sourced actors
  • better use of secondary replicas (e.g. using them for running processes)
  • multi actor projections

Journey, not path

Sewing Machine is in its initial phase. This is another journey project of mine, based on uncovering what’s behind Service Fabric. I will share my findings in the following blog posts as I work towards the aims above. This also means that version 1.0 is highly unlikely to be published within a month or two. For sure this journey will take a bit longer, and hopefully the findings will be good enough to release it.

I hope you’re eager to do some sewing with this fabric. I am.


Async programming model


This is a follow-up post to Async pump for better throughput in Azure. Please read that post first before moving forward.


I’ve received a lot of feedback on my Async pump post. In a few cases this blog post from Ayende was quoted, as it describes exactly the same approach. You can read the post, but more meaningful are the comments provided by Kelly Sommers and Clemens Vasters.

The model

The await statement has simple semantics. It breaks your code and schedules the following part as a task continuation. This heavy lifting is done at the C# compiler level, so you don’t have to worry about it. The model of this extension is simple: define a task and a continuation. Nothing more, nothing less. My approach, as the feedback pointed out, was a “trick”.
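To make that model concrete, here is a rough sketch of the task-plus-continuation split (a simplification only — the compiler actually generates a state machine, not a `ContinueWith` call):

```csharp
using System;
using System.Threading.Tasks;

// A rough sketch of await's task-plus-continuation model. This is NOT the
// actual compiler output (which is a generated state machine), just the idea.
static async Task<int> WithAwait(Task<int> io)
{
    var result = await io;  // the method "breaks" here...
    return result * 2;      // ...and this part runs as a continuation
}

// The hand-written moral equivalent of the same split:
static Task<int> WithContinuation(Task<int> io)
    => io.ContinueWith(t => t.Result * 2);

var a = await WithAwait(Task.FromResult(21));
var b = await WithContinuation(Task.FromResult(21));
Console.WriteLine((a, b)); // prints (42, 42)
```

Both versions express the same thing: a task, and code that runs after it completes.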

That’s the premise of the “trick” that is allegedly achieving parallel execution of I/O and compute work here. That is, however, not the purpose of the asynchronous programming model and of the Windows IO completion port (IOCP) model. The point of IOCP is to efficiently offload IO work from user code to kernel and driver and hardware and not to bother the user code until the IO work is done

by Clemens Vasters

What it basically says is that once you await an IO operation, the code that runs afterwards is scheduled on the IO thread:

As IO typically takes very long and compute work is comparatively cheap, the goal of the IO system is to keep the thread count low (ideally one per core) and schedule all callbacks (and thus execution of interleaved user code) on that one thread. That means that, ideally, all work gets serialized and there minimal context switching as the OS scheduler owns the thread. The point of this model is ALREADY to keep the CPU nicely busy with parallel work while there is pending IO.

by Clemens Vasters

So when your code awaits some IO, the call is issued, then the code after the await is executed on the IO thread, as it’s assumed to be lightweight; another IO occurs, and again the same thread dispatches the callback. With the async pump I proposed comes a danger: when we follow this partially awaitless approach, the continuation code is

not on a .NET pool thread, but an IO pool thread. The strategy above sits on that IO pool thread past the point where you are supposed to return it (which is the next IO call) and thus you might force the IO pool to grow

by Clemens Vasters


Measure first. Think. Then think again. The “trick” worked in my case, improving the performance of one app’s initialization (which is where I needed it). Does it work in every case, and should it be used in general? I’d say no. It’s good to follow the programming model. If you don’t want to, you must have strong reasons for it.

Very good morning


This is one more personal post, after my recent description of my learning pattern. I’d like to share how I start my morning and how it has improved the overall quality of my whole day.

Morning schedule

  • 5:45 AM – it’s time to get up and prepare and eat the oatmeal: oat flakes, a few raisins and hot water
  • 6:00 AM – by this time I’ve got my breakfast eaten and it’s reading time!
  • 7:00 AM – workout time (some weights or a stationary bike)
  • 7:50 AM – post workout shake
  • 7:55 AM – shower
  • 8:10 AM – I start waking up my child. It takes up to 15 minutes of playing music (sometimes singing included)
  • 8:40 AM – ready for the kindergarten!
  • 9:00 AM – my second breakfast
  • 9:30 AM – with a cup of coffee I’m starting my work

I’ve been tweaking this schedule a little for 5 months now. The current version might still be subject to change, but the overall shape won’t. Why?

Why so early?

I’ve always been an early bird, but a few months ago I noticed I was losing focus. Keeping a morning like this introduces a good routine/regime to my life and lets me maintain focus for my whole work day. Waking early means going to bed at ~10–11 PM, though once in a while I stay awake till midnight. Additionally, before I even start my work, I’ve already got my workout, some reading and two breakfasts checked off. This is a really good start!


Is this a way to go for everyone? Probably not. On the other hand, living in a world of interruptions and notifications does not improve our ability to focus. Having a morning like this – definitely does.

My learning pattern


This is a personal entry about my framework that I use subconsciously for learning new things. I’ve applied it recently when learning about Azure Service Fabric, and then I realized what I did. Next, I thought that it could be useful to share the way I learn new frameworks/tools/libraries. Are you focused?

Azure Service Fabric is the new black

Azure Service Fabric is a new environment for creating distributed apps for both cloud and on-premises environments. It provides a lot of tooling with different behaviors (stateful services, stateless services, actors). I’ve been working with Azure Storage Services for some time, and two weeks ago I started learning Service Fabric. From zero (besides my distributed-systems know-how and some experience with a public cloud).

Reading docs till you know it all

One may say that you don’t need to know everything about a technology to use it. True, you don’t. But my aim is not just to use a technology; it’s to know it, to immerse myself in it, to embrace all the patterns behind it natively. Once I’ve got it, it’s like a new language.

In the case of Service Fabric it was simple. There is an official docs page which takes a few hours to read. I read it all. In my opinion, to embrace a thing you truly want to learn its vocabulary, patterns and properties. That’s why I read some of the pages more than once.

F12 till it hurts

Even then I wasn’t ready to write my first program. Instead, I followed the framework code with F12 (go to implementation), diving as deep as possible. Sometimes it requires taking notes, but it moves you forward at an amazing speed. That’s how I gain real knowledge of how a thing works.

Write an extension, no dummy sample

The final step is to either send a PR or write a simple extension to the framework. No mumbo-jumbo playing with a dummy sample running one actor. I need to prove that the foundations are strong. That’s how I rebuilt the actor part of Service Fabric to be somewhat faster.


This is the approach I use for learning new technologies. Total immersion and getting as deep as possible, as soon as possible. Sometimes it hurts, but still, the gains are tremendous!

Performance matters


This is a short follow-up post about Marten’s performance. It shows that saving allocations is not only about allocations and memory. It’s also about your CPU ticks, and hence the speed of your library.

Moaaaaar performance!

Let me present three pictures comparing performance before and after removing a lot of allocations. They were provided by Jeremy after benchmarking my PRs to Marten. My work was purely focused on allocations, but as shown below, it additionally improved Marten’s execution speed.


The speed improvement isn’t that significant, but please take a look at the allocated bytes. Now it takes much less memory than before.



The new insert is 10% faster and takes much less memory than before.


Bulk inserts

Here, after enabling the npgsql library to accept ArraySegment<char>, I was able to reuse the same pooled writer. The new approach not only skips allocations but also leases a pooled writer just once. Take a look at these numbers!
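The idea can be sketched roughly like this (an illustrative sketch only — the names and shapes are not npgsql’s or Marten’s actual APIs, and the “serialization” is just a copy): write each row into one reused char buffer and hand an ArraySegment<char> onward, instead of allocating a fresh string per row.

```csharp
using System;

// Illustrative sketch only: serialize each row into one reused char buffer
// and pass an ArraySegment<char> onward, instead of allocating a fresh
// string per row. In the real code the segment would go to the npgsql driver.
static int BulkWrite(string[] docs)
{
    var buffer = new char[1024];  // leased once, reused for every row
    var written = 0;
    foreach (var doc in docs)
    {
        if (doc.Length > buffer.Length)
            Array.Resize(ref buffer, doc.Length);
        doc.CopyTo(0, buffer, 0, doc.Length);
        var segment = new ArraySegment<char>(buffer, 0, doc.Length);
        written += segment.Count;  // the segment would be handed to the driver here
    }
    return written;
}

var total = BulkWrite(new[] { "{\"id\":1}", "{\"id\":2}" });
Console.WriteLine(total); // prints 16 (8 chars per document)
```

The key property is that the buffer is obtained once per batch, not once per row.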



When working on a library or a tool, it’s good to think about performance and memory consumption. Even in a managed, garbage-collected world, pooling buffers or objects can not only reduce memory consumption but also improve the overall speed of your creation.


Marten: my Open Source experience


This post describes some of my Open Source experiences working on Marten, a Polyglot Persistence for .NET Systems using the Rock Solid Postgresql Database (TM). This isn’t me boasting, but rather a pleasant story about getting involved in some solid OSS.


The main person responsible for Marten and its main committer is Jeremy D. Miller. He’s a well-known person in the .NET Open Source world as the author of StructureMap. I had the pleasure of working with him before in the OSS area, when we were trying to create a better NuGet client called Ripple. Before creating Marten, he spent some time working with RavenDB. He described it thoroughly in Would I use RavenDB again? and Moving from RavenDB to Marten. These entries are pure gold; I encourage you to read them.

What is Marten

Marten is a client library for .NET that enables you to use the Postgres database in two ways:

  1. as a document database
  2. as an event sourcing database

Both are included in the same library, as they share a lot. The very foundation of Marten is Postgres’ ability to process JSON and treat it as a first-class citizen. First class, you say, what does that even mean? You can use db parameters with the JSONB type, indexes (that are truly ACID) and a lot more. Given that Postgres is a mature database with a well-performing storage engine, you can build a lot on top of it.

Performance matters

I started to work on Marten after seeing the JSON serializer seam. It contained only a method for serializing objects that returned a string.


This means that for every operation that involved obtaining a serialized version of an object (both a document and an event), a new string would be allocated. For low traffic this could be fine, but when operating on many documents or appending a lot of events, it isn’t.

We had a short discussion in this issue, which was followed by a PR of mine introducing a buffer pool that provides TextWriters and using it for upserting documents into the database. I chose to work with the TextWriter abstraction as all good .NET JSON serializers are capable of working with it. I encourage you to follow the PR and the issue; again, it shows how a thoroughly described issue can help in communication and in making things happen.
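The shape of that seam change can be sketched as follows (a simplified illustration only — these are not Marten’s actual ISerializer signatures, and the method names and the toy JSON output are hypothetical):

```csharp
using System;
using System.IO;

// Old seam: every call allocates a brand-new string.
static string ToJson(object document) => $"{{\"value\":\"{document}\"}}";

// New seam: the serializer writes into a caller-provided TextWriter,
// which can be leased from a buffer pool and reused across calls.
static void ToJsonTo(object document, TextWriter writer)
    => writer.Write($"{{\"value\":\"{document}\"}}");

var oldWay = ToJson("doc-1");     // allocates a string per call
var pooled = new StringWriter();  // stand-in for a writer leased from a pool
ToJsonTo("doc-1", pooled);
Console.WriteLine(pooled.ToString() == oldWay); // prints True
```

With the second shape, the pool leases and returns writers around each upsert, so the serialized JSON never has to materialize as a standalone string.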

Performance matters even more

I wanted to introduce the same pooling for appending events and bulk inserts. This was blocked by the npgsql Postgres .NET driver, though. Because of its internal structure, when using array parameters, this driver could not extract the size of a passed text from the parameter. It didn’t support arrays of ArraySegment<char> as parameters either… I issued the npgsql PR to fix it and pinged Shay about how fast we could get it in. It was released in under a week.

Meanwhile, I provided PRs for events and bulk inserts that were RED on TeamCity. Once the new npgsql 3.2 client library was released, Jeremy merged these two PRs, making Marten allocate a lot less on the write side.

No company, lots of work done

Jeremy, Shay and I share no manager and no company, and we work in different time zones. Still, as a group, we were able to deliver meaningful performance improvements for a library that we care for. This shows the true power of Open Source.


I hope you enjoyed this positive journey of mine. This summary does not mean I’ve finished working on Marten. I’d say it’s quite the opposite.

Async pump for better throughput in Azure

This post is followed up by Async programming model.


Introducing async-await has changed a lot. Now, with some help from the compiler, we’re able to squeeze more throughput out of our machines, which may lower costs. In this blog post we’ll push the boundaries even further by questioning the need to immediately await a task.


The story behind this pattern is simple. I’m using a part of Azure Storage Services, the page blob. It provides a storage API targeted at random IO and page-aligned reads and writes. This is a perfect solution for emulating disk IO (it’s used for VMs’ disks), but it can also be used in cases where you want the ability to write to a file in Azure at a specific index. You can just read a page, modify it and write it back. If you’re interested in this topic, take a look at my talks; maybe we’ll meet during my presentation Keep Its Storage Simple Stupid.


It’s obvious that for reading/writing from Storage Services you want to use asyncified code. Using blocking calls freezes a thread, which, considering the money you already pay, is not the best option. Remember, it’s the cloud, and regular Storage Services are backed by HDD disks. It might take a while. Still, let’s take a look at the sync version first. We’ll operate on a stream that has been opened from the blob.


The sync method reads into the buffer from a stream, dispatching read after read until the buffer is filled.
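Such a blocking reader can be sketched like this (an illustrative reconstruction, not the exact code from the post):

```csharp
using System;
using System.IO;

// A blocking reader of the shape described above: it keeps calling Read
// until the buffer is full, since Stream.Read may legally return fewer
// bytes than requested.
static int ReadWhole(Stream stream, byte[] buffer)
{
    var index = 0;
    while (index < buffer.Length)
    {
        var read = stream.Read(buffer, index, buffer.Length - index);
        if (read == 0)
            break;  // end of stream
        index += read;
    }
    return index;
}

var buffer = new byte[4];
var read = ReadWhole(new MemoryStream(new byte[] { 1, 2, 3, 4, 5 }), buffer);
Console.WriteLine(read); // prints 4
```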


The async-ified version of this reader is not that different. It just uses async Task in its signature and awaits the Read. We’ll have no blocking calls, leaving some spare CPU cycles for other operations.
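Again as an illustrative reconstruction, the async version differs only in the signature and the awaited call:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// The async-ified reader: the same loop, but awaiting ReadAsync frees the
// thread while the IO is in flight instead of blocking it.
static async Task<int> ReadWholeAsync(Stream stream, byte[] buffer)
{
    var index = 0;
    while (index < buffer.Length)
    {
        var read = await stream.ReadAsync(buffer, index, buffer.Length - index);
        if (read == 0)
            break;  // end of stream
        index += read;
    }
    return index;
}

var buffer = new byte[4];
var read = await ReadWholeAsync(new MemoryStream(new byte[] { 1, 2, 3, 4, 5 }), buffer);
Console.WriteLine(read); // prints 4
```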


Asynchronous pump

For the last attempt, we need to ask: what do we read this buffer for? In my case, it’s for scanning over its content. I need to read a page blob from a given position and scan/deserialize it in C# code. I do not want to preserve the buffer; as soon as I’ve read it, I can move on. It’s just about reading a log, nothing more. Its second property: entries in this log are aligned to pages, so they are well aligned for reading. Can we modify the reading part, then?

We could think of using two buffers/streams. Schedule a read in the first, then in a loop:

  1. schedule a read in the second
  2. await on the first
  3. swap first and second
  4. go to 1

If we used this algorithm, we’d have a higher probability that one operation had already finished, its result ready to be consumed. This means that our algorithm could, possibly, work on prefetched data without any interruptions, having the data ready when it needs it. For the sake of simplicity, the buffer array and the ReadBuffer method were enclosed in a simple helper class called Buffer.
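The steps above can be sketched as follows (an illustrative reconstruction — the post’s Buffer helper is replaced here by plain locals, and it assumes the underlying stream tolerates having the next read queued while one is pending, as the page blob stream does):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// A sketch of the two-buffer pump: while one read is being awaited,
// the next one is already scheduled, so data is often prefetched
// before the consumer needs it.
static async Task<int> Pump(Stream stream)
{
    var firstBytes = new byte[4];
    var secondBytes = new byte[4];
    var firstPending = stream.ReadAsync(firstBytes, 0, firstBytes.Length); // initial read
    var total = 0;

    while (true)
    {
        var secondPending = stream.ReadAsync(secondBytes, 0, secondBytes.Length); // 1. schedule the second
        var read = await firstPending;                                            // 2. await on the first
        if (read == 0)
            break;                                                                // end of stream
        total += read;                      // scan/deserialize firstBytes here, then forget it
        (firstBytes, secondBytes) = (secondBytes, firstBytes);                    // 3. swap first and second
        firstPending = secondPending;                                             // 4. go to 1
    }
    return total;
}

var total = await Pump(new MemoryStream(new byte[10]));
Console.WriteLine(total); // prints 10
```

Because the log entries are page-aligned and the buffer is discarded right after scanning, swapping two fixed buffers is all the bookkeeping this pump needs.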



Having something ready to be awaited does not mean that you should await it immediately. Using this two-buffer approach can increase the throughput of your algorithm by ensuring that data is fetched before it is needed. Now it’s time for you to search for some pumping potential in your own code!