ProtobufRaw vs protobuf-net

TL;DR

I’m currently working on SewingMachine, an OSS project of mine that aims to unleash the ultimate performance for your stateful services written in/for Service Fabric (more posts: here). In this post I’m testing further (the previous test is here) whether it would be beneficial to write a custom unmanaged writer for protobuf-net using stackalloc’ed memory.

SewingMachine is raw, very raw

SewingMachine works with pointers. When storing data, you pass an IntPtr with a length as a value. Effectively, this means that if you use a managed structure to serialize your data, you’ll eventually need to either pin it (pinning tells the GC not to move the object while it’s pinned) or have it pinned from the very beginning (this approach could be beneficial if an object is large and has a long lifetime). If you don’t want to use managed memory, you can always use stackalloc to allocate a small amount of memory on the stack, serialize to it, and then pass it as an IntPtr. This is the approach I’m testing now.
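To make the two options concrete, here is a minimal sketch of pinning a managed array versus using a stack buffer. The Store method is a hypothetical stand-in for a pointer-based storage call (it is not SewingMachine’s actual API), and the code assumes it is compiled with unsafe blocks enabled.

```csharp
using System;
using System.Runtime.InteropServices;

public static unsafe class RawBuffers
{
    // Hypothetical stand-in for a pointer-based storage API (an assumption, not
    // SewingMachine's real signature). It copies the bytes into a managed array
    // so the caller can inspect what was "stored".
    public static byte[] Store(IntPtr value, int length)
    {
        var copy = new byte[length];
        Marshal.Copy(value, copy, 0, length);
        return copy;
    }

    // Managed buffer: it has to be pinned for the duration of the call.
    public static byte[] StorePinned(byte[] payload)
    {
        fixed (byte* p = payload) // the GC will not move 'payload' inside this block
        {
            return Store(new IntPtr(p), payload.Length);
        }
    }

    // Stack buffer: it lives in the current stack frame and never moves,
    // so no pinning is needed.
    public static byte[] StoreFromStack(byte first, byte second)
    {
        byte* p = stackalloc byte[2];
        p[0] = first;
        p[1] = second;
        return Store(new IntPtr(p), 2);
    }
}
```

The stack buffer is only valid until the method returns, so the storage call has to consume (or copy) the bytes before that happens.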

Small, fixed-size payloads

If a payload, whether it’s an event or a message, is small and contains no fields of variable length (strings, arrays), you can estimate the maximum size it will take when serialized. Next, instead of using the regular protobuf-net serializer, you could write (or emit during post-compilation) a custom function to serialize a given type, like I did in this spike. Then it’s time to test it.
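As an illustration of the idea (not the code from the spike), here is a sketch of a hand-written writer for a hypothetical two-field message. The Sample type and its tags are made up for this example; the size estimate relies on the Protocol Buffers varint encoding, where a 32-bit unsigned value takes at most 5 bytes and a 64-bit one at most 10. It assumes unsafe blocks are enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical message: two unsigned integers, no variable-length fields.
public struct Sample
{
    public uint Id;     // written as field 1
    public ulong Ticks; // written as field 2
}

public static unsafe class RawSampleWriter
{
    // 1 key byte + up to 5 varint bytes (uint) + 1 key byte + up to 10 varint bytes (ulong).
    public const int MaxSize = 1 + 5 + 1 + 10;

    // Base-128 varint: 7 bits per byte, high bit set on every byte but the last.
    static byte* WriteVarint(byte* p, ulong value)
    {
        while (value >= 0x80)
        {
            *p++ = (byte)((value & 0x7F) | 0x80);
            value >>= 7;
        }
        *p++ = (byte)value;
        return p;
    }

    // Writes both fields unconditionally and returns the number of bytes used.
    public static int Write(Sample s, byte* buffer)
    {
        byte* p = buffer;
        *p++ = 0x08; // key: (field 1 << 3) | wire type 0 (varint)
        p = WriteVarint(p, s.Id);
        *p++ = 0x10; // key: (field 2 << 3) | wire type 0 (varint)
        p = WriteVarint(p, s.Ticks);
        return (int)(p - buffer);
    }

    // Serializes into stack memory; the copy to a managed array is only here
    // so the result can be inspected. A pointer-based API would take the
    // buffer and length directly.
    public static byte[] Serialize(Sample s)
    {
        byte* buffer = stackalloc byte[MaxSize];
        int length = Write(s, buffer);
        var result = new byte[length];
        Marshal.Copy(new IntPtr(buffer), result, 0, length);
        return result;
    }
}
```

Because the worst case is known at compile time, the buffer can be a constant-size stackalloc and no bounds checks or resizing are needed while writing.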

Performance tests and over 10x improvement

Again, as in the previous post about memory, the unsafe stackalloc version shows that it could be beneficial to invest some more time, as the performance benefit is just amazing. The raw version is 10x faster. Wow!

[Figure: benchmark results, ProtobufRaw vs protobuf-net]

Summary

Sometimes using raw unsafe code improves performance. It’s worth trying, especially in situations where the interface you communicate with is already unsafe and requires you to use unsafe structures.

ThreadStatic vs stackalloc

TL;DR

I’m currently working on SewingMachine, an OSS project of mine that aims to unleash the ultimate performance for your stateful services written in/for Service Fabric (more posts: here). In this post I’m testing whether it would be beneficial to write a custom unmanaged writer for protobuf-net, instead of using some kind of object pooling with ThreadStatic.

ThreadStatic and stackalloc

ThreadStatic is the old black. It was good to use before async-await was introduced. Now, when you don’t know on which thread your continuation will run, it’s not that useful. Still, if you’re on a good old-fashioned synchronous path, it might be used for object pooling and keeping one object per thread. That’s how protobuf-net caches ProtoReader objects.

One could use it to locally cache a chunk of memory for serialization. This could be a managed or unmanaged chunk, but eventually it would be used to pass data to some storage (in my case, SewingSession from SewingMachine). If the interface accepted unmanaged chunks, I could also use stackalloc for small objects whose serialized size I know up front. stackalloc provides a way to allocate some number of bytes from the stack frame. Yes, it’s unsafe, so keep your belts fastened.
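A rough sketch of the two ways of obtaining a scratch buffer could look like this. It is not the benchmarked code; the class and method names and the 16-byte size are assumptions made for illustration, and it needs unsafe blocks enabled.

```csharp
using System;

public static unsafe class ScratchBuffers
{
    const int Size = 16; // assumed upper bound for the serialized payload

    // One array per thread. No field initializer here: it would run only
    // for the first thread that touches the type, so the array is created lazily.
    [ThreadStatic]
    static byte[] cached;

    // Variant 1: reuse a per-thread managed array and pin it while writing.
    public static int WriteWithThreadStatic(ulong value)
    {
        var buffer = cached ?? (cached = new byte[Size]);
        fixed (byte* p = buffer)
        {
            return WriteVarint(p, value);
        }
    }

    // Variant 2: take the bytes from the current stack frame; nothing to pool or pin.
    public static int WriteWithStackalloc(ulong value)
    {
        byte* p = stackalloc byte[Size];
        return WriteVarint(p, value);
    }

    // Writes a base-128 varint and returns the number of bytes written.
    static int WriteVarint(byte* p, ulong value)
    {
        int count = 0;
        while (value >= 0x80)
        {
            p[count++] = (byte)((value & 0x7F) | 0x80);
            value >>= 7;
        }
        p[count++] = (byte)value;
        return count;
    }
}
```

Both variants do the same work on the buffer; the difference being measured is only where the buffer comes from.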

ThreadStatic vs stackalloc

I gave it a try and wrote a simple test (if it’s naive, I encourage you to share your thoughts in the comments) that uses either a ThreadStatic-pooled object with an array or a stackalloc’ed buffer, and writes to it. You can find it in this gist.

How to test it? As always, to the rescue comes BenchmarkDotNet, the best benchmarking tool for any .NET dev. Let’s take a look at the summary now.

[Figure: benchmark summary, stackalloc (local) vs ThreadStatic]

Stackalloc wins.

There are several things that should be taken into consideration: the finally block, the real overhead of writing an object, and so on. Still, it looks like, for heavily optimized code and small objects, one could use this to write them a bit faster.

Summary

Using stackalloc’ed buffers is fun and can bring some performance benefits. If I find anything unusual or worth noticing with this approach, I’ll share my findings. As always, when working on performance, measure first, measure in the middle and measure at the end.

ProtoDescriptor

Recently I’ve created (with some porting from another project) a simple library which allows parsing .proto files and storing them in a model. The library offers serialization/deserialization of the mentioned model. I hope to ship dynamic generation of protobuf-net classes as well. It would allow the creation of self-descriptive streams (with the contract added at the very beginning of a file), discoverable via reflection (you get a class, you get an IEnumerable of this class’s objects) and queryable. It has some potential in it.

https://github.com/Scooletz/ProtoDescriptor

Protobuf-linq

I had an idea about querying and projecting over big streams of messages serialized with Google Protocol Buffers. If one needs only a few fields for his/her projection, why not make it implicit and prepare an optimal way of deserializing only these fields? That’s how Protobuf-linq was born. It’s simple, fast and eager to help you iterate over big streams of data.
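To show why deserializing only the needed fields can be cheap, here is a wire-level sketch that pulls a single varint field out of a serialized message and skips everything else without materializing it. This is not Protobuf-linq’s API; the class and method names are made up, and it only relies on the standard Protocol Buffers wire types.

```csharp
using System;

public static class FieldProjection
{
    // Returns the first varint value stored under 'wantedField', or null if the
    // field is absent. All other fields are skipped, not decoded into objects.
    public static ulong? ReadVarintField(byte[] message, int wantedField)
    {
        int pos = 0;
        while (pos < message.Length)
        {
            ulong key = ReadVarint(message, ref pos);
            int field = (int)(key >> 3);
            int wireType = (int)(key & 7);
            switch (wireType)
            {
                case 0: // varint
                    ulong value = ReadVarint(message, ref pos);
                    if (field == wantedField) return value;
                    break;
                case 1: // fixed 64-bit
                    pos += 8;
                    break;
                case 2: // length-delimited (strings, bytes, nested messages)
                    int length = (int)ReadVarint(message, ref pos);
                    pos += length; // jump over the payload without reading it
                    break;
                case 5: // fixed 32-bit
                    pos += 4;
                    break;
                default:
                    throw new NotSupportedException("Unsupported wire type " + wireType);
            }
        }
        return null;
    }

    // Reads a base-128 varint starting at 'pos' and advances 'pos' past it.
    static ulong ReadVarint(byte[] data, ref int pos)
    {
        ulong result = 0;
        int shift = 0;
        while (true)
        {
            byte b = data[pos++];
            result |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }
}
```

A string or nested message that the projection does not need costs one length read and one pointer bump, which is where the savings over full deserialization come from.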
Check it out!

Protobuf-net: inheritance of messages

The last post was an introduction to a simple project called Protopedia, located here. The project is meant to present, in a simple manner and probably with one test per case, solutions for complex scenarios like versioning, derivation of messages, etc. As versioning was described in the previous entry, it’s the right time to deal with derivation.

Inheritance
It’s a well-known fact that one should favor composition over inheritance. Dealing with derivation trees with plenty of nodes can bring any programmer to his/her knees. How about messaging? Does this rule also apply in this area? It’s common for messages to share a common denominator, containing fields common to all messages (headers, correlation identifiers and so on), especially if they’re meant to be sent/saved as a stream of messages of the base type (example: Event Sourcing with events of a given aggregate). Using a set of messages with a distilled root greatly simplifies the concerns mentioned earlier. Consider the following scenario: serialization of a collection of A messages (or its derivatives), given the following structure:

[Figure: message inheritance tree for the example]

How would Protobuf-net serialize such a collection? First, take a look at the folder from Protopedia. You can notice that all the classes (A, B, C) have been mapped with different types. It’s worth noticing the ProtoInclude attributes with the tag values of the types located one level deeper in the derivation tree. The second important thing is the values of the derived type tags, which do not collide with the tags of the class fields. In the example, you can find a constant value of 10, used to leave room for future versions of the root, the A class. As one can see in the test of the derivation, the child classes of a given class are serialized as fields with tags equal to the tag passed in the ProtoInclude attribute. To see the fields composed the way Protobuf-net serializes inherited messages, take a look at the following message contracts. There’s no magic and the whole idea is rather straightforward: serialize derivatives as fields, turning inheritance into composition. This working approach of Protobuf-net should be sufficient and effective for your inheritance serialization efforts. Nice serializing!
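A minimal sketch of such a mapping, assuming a chain where B derives from A and C derives from B (the actual tree and member names in Protopedia may differ; the properties below are made up for illustration). It requires the protobuf-net package.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Root type: own fields use low tags; the derived type gets tag 10,
// which leaves tags 2..9 free for fields added to A later.
[ProtoContract]
[ProtoInclude(10, typeof(B))]
public class A
{
    [ProtoMember(1)]
    public int Id { get; set; }
}

// Each level declares only the types one level deeper.
[ProtoContract]
[ProtoInclude(10, typeof(C))]
public class B : A
{
    [ProtoMember(1)]
    public string Name { get; set; }
}

[ProtoContract]
public class C : B
{
    [ProtoMember(1)]
    public double Value { get; set; }
}

// Conceptually, the wire format is composition:
//   A { Id = 1; B = 10 { Name = 1; C = 10 { Value = 1 } } }

public static class InheritanceDemo
{
    // Serializes a collection typed as the root and reads it back.
    public static List<A> RoundTrip(List<A> messages)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, messages);
            stream.Position = 0;
            return Serializer.Deserialize<List<A>>(stream);
        }
    }
}
```

Passing a list that mixes A, B and C instances through RoundTrip should give back objects of the same concrete types, since the presence of the field with tag 10 tells the deserializer which subtype to create.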

Protobuf-net: message versioning

Welcome again. Recently I’ve been involved in a project where the Protobuf-net library is used for message serialization. The Protobuf-net library was developed by Marc Gravell of Stack Overflow. Besides the standard Google Protocol Buffers concepts, there are plenty of .NET-based options, like handling derivation, using standard Data Contracts, etc. This is the first of a few posts which will deal with some aspects of Protobuf-net that might be nontrivial for a beginner. Besides the posts, a project called Protopedia was created to show cases of Protobuf usage. All the examples below are stored in this repository. All the posts require the reader to be familiar with the official Google Protocol Buffers documentation.

Versioning
The versioning of interfaces is a standard computer science problem. How do you deal with something that was released looking one way and, after internal changes to the system, must be changed publicly?
Consider the following scenario of having two versions of one message, located here. As you can see, there are a few changes, like a change of name and the addition and removal of some fields. Imagine that a service A runs on the old version of the message. A service B uses the new version. Is it possible to send this message from A to B, and then back to A, and keep all the data stored in the message, without losing anything appended by the B service? With Protobuf-net’s Extensible class used as a base class, it’s possible. The only thing one should remember is not to reuse the ProtoMemberAttribute tag values of fields removed in the past. New fields should always be added with new tag values.

How does it work?
When Protobuf-net deserializes a message, all the data with tags found in the message contract is deserialized into the specific fields; the rest of the data is held in a private ‘storage’ of the Extensible class. During serialization, these additional fields are added to the binary form, allowing another message consumer to retrieve fields according to their version of the message.
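The round trip can be sketched as follows. The two contracts are hypothetical (they are not the ones from Protopedia): the new version drops the field with tag 2 and adds a field with a fresh tag 3. The code requires the protobuf-net package.

```csharp
using System.IO;
using ProtoBuf;

// Old version of the message, used by service A.
[ProtoContract]
public class MessageV1 : Extensible
{
    [ProtoMember(1)]
    public int Id { get; set; }

    [ProtoMember(2)]
    public string Name { get; set; }
}

// New version, used by service B. Tag 2 was removed and is never reused;
// the new field gets a new tag.
[ProtoContract]
public class MessageV2 : Extensible
{
    [ProtoMember(1)]
    public int Id { get; set; }

    [ProtoMember(3)]
    public string Comment { get; set; }
}

public static class VersioningDemo
{
    // Simulates sending a message between services: serialize with the
    // sender's contract, deserialize with the receiver's contract.
    public static TTo Send<TFrom, TTo>(TFrom message)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, message);
            stream.Position = 0;
            return Serializer.Deserialize<TTo>(stream);
        }
    }
}
```

Sending a MessageV1 to B, setting Comment there and sending it back to A is expected to keep Name intact, because B carried the unknown tag 2 in the Extensible storage and A now carries the unknown tag 3 the same way. Without the Extensible base class, fields unknown to the receiving contract would be dropped on the way.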