Why you should and eventually will invest your time in learning about public cloud?

TL;DR

Within 2-5 years the majority of applications will be moved to public cloud. You’d better be prepared for it.

Economies of scale

You might have heard that economy of scale does not work for software. Unfortunately, this is not the case for public cloud sector. It’s cheaper to buy 1000000 processors than to buy one. It’s cheaper to buy 1000000 disks than to buy one. It’s better to resell them as a service to the end customer. And that’s what public cloud vendors do.

Average app

The majority of applications does not require fancy processing, or 1ms service time. They require handling peaks, being mostly available and costing no money when nobody uses one. I’d say, that within 2-5 years we will all see majority of them moving to the cloud. If there is a margin, where the service proves its value and it costs more than its execution in the cloud, eventually, it will be migrated or it will die with a big IT department running through the datacenter trying to optimize the costs and make ends meet.

Pure execution

The pure execution has arrived and its called Azure Functions (or Lambda if you use the other cloud:P ). You pay for a memory and CPU multiplied. This means that when there’s nothing to work on, you’ll pay nothing (or almost nothing depending on the triggering mechanism). This is the moment when you pay for your application performing actions. If an app user can pay more than the cost of the execution, you’ll be profitable. If not, maybe it’s about time to rethink your business.

Performance matters

With this approach and detailed enough measurements you can actually can see where you spend the most money. It’s no longer profiling an app for seeing where is it slow or where it consumes most of the memory. It’s about your business burning money in different places. Whether to update one or not – it’s a decision based on money and how much does it cost to fix it. With highly profitable businesses you could even flood your less performing parts with money. Just like that.

Environments and versioning

How to version a function? Preserve signature and rewrite it. Then deploy. Nothing less nothing more. I can almost see a new wave of development approaches where Continuous Delivery looks like a grandpa trying to run with Usain Bolt. You can’t compete with this. It’s a brand new league.

Summary

If you think about areas you should invest your time, public cloud and functions are the way to go. For majority of the cases, this is going to be vital to survive in the market competing and betting for the lowest costs of infrastructure, IT and devops.

Hitting internal wall in Service Fabric

TL;DR

In this post I share my experience in trying extending Service Fabric for Sewing Machine purposes.

Sewing Machine aim

The aim of Sewing Machine is to extend the Service Fabric actor model to use better, faster, less-allocating foundations. The only part I was working on so far was persisted actors. This is the one stored in the KeyValueReplica, I’ve been writing about for last few weeks.

Not-so public seam

I started my work by discovering how the persisted actors are implemented. The class responsible for it is named KvsActorStateProvider. It uses following components:

  • KeyValueStoreWrapper – private
  • VolatileLogicalTimeManager.ISnapshotHandler – internal
  • VolatileLogicalTimeManager – internal
  • IActorStateProviderInternal – internal
  • ActorStateProviderHelper – internal, responsible for shared logic among providers
  • IActorStateProvider – public interface to implement

As you can see, the only part that is given is a seam of the state provider. Every single helper that one could use to implement their own, is internal. Additionally, the provider interface is filled with other interfaces that one needs to implement. I know, sharing data structures isn’t the best option, but as the ServiceFabric share them internally, why wouldn’t you give it to the user.

All of the above means that extending actors’ runtime is hard if not impossible. It provides no real extension points and has its public seam not prepared for it. What does it mean for SewingMachine?

Can’t win, change battle

Rewriting the runtime would be time-consuming. I can’t spend half a year on writing it and don’t want to. At least, not now. I’ve implemented a faster unsafe wrap around KeyValueReplicaStore that still can be useful. To make an impact with SewingMachine, I’ll introduce the event driven actor first, even when clean up will be run by not efficient regular Actor disposal. Using a custom serializer and adhering to the currently used prefixes, later on can be changed to use a custom runtime. The only problem would be reminders, but this can be handled as well by a better versioning of them.

Summary

SewingMachine was meant to extend the actors’ runtime. Seeing difficulties like the ones described above, I could either kill it or repurpose it to provide a real value first, leaving performance for later. That’s how we’ll do it.

 

Service Fabric – KeyValueStoreReplica, ReplicaRole

TL;DR

After taking a look at how actors’ state is persisted with KeyValueStoreReplica to follo the prefix query guideline, it’s time to see how this state is replicated.

ReplicaRole

When defining replication for a partition, one defines on how many nodes the data will reside. Every copy of a partition’s data is called Replica. It’s important to know, that for a given partition only one replica at a time is active. This kind of replica is called Primary. Let’s take a look at the ReplicaRole values and decipher their meanings:

  1. Primary – the currently active replica. All operations are handled by the primary, ensuring that any write will be replicated and acknowledged by a quorum of ActiveSecondary replicas. As in The Highlander, there can be only one Primary replica at the same time.
  2. IdleSecondary – a replica that accepts and applies a state send by the Primary to catch up with all the changes and eventually become ActiveSecondary as soon as it catch up.
  3. ActiveSecondary – a replica that is a part of the write quorum. It stores updates from the Primary and acknowledge them to enable Primary to successfully end a write operation.

Active, passive and not-that-passive

As you can see above, there’s only one replica at any time that is truly active, it’s Primary. What happens with the secondaries? Can they do anything meaningful or maybe they’re just there for copying state?

First and foremost, secondary replicas receive notifications about the state being replicated. This means, that if you derive from KeyValueStoreReplica class, you can be notified about the copied key-value pairs. That’s how you can react to these changes. But how would this be helpful?

You could index the data somehow, you could notify other services, endpoints calling their methods or sending a request (in a safe manner, not failing on the notification) and much more. For instance, Actors’ Runtime uses it to capture the last timestamp for a component called VolatileLogicalTimeManager.

Summary

The role of a replica can be easily summarized as: primary – the current active replica accepting reads&writes, secondary – replicas just copying the state.

 

Service Fabric – KeyValueStoreReplica, Prefix query and actors

TL;DR

The last post presented a deep dive into capabilities of internal Service Fabric storage. We saw, that in spite of being “just a key value store”, KeyValueStoreReplica enables transactions, optimistic concurrency and an interesting approach for handling queries. It’s time to go back to Service Fabric actor model and understand how Actors have their state persisted.

Actor state manager revisited

To access its state Service Fabric actors use IActorStateManager, which enables to setting and getting states by their names. The manager provides one more method, that is called by the actor class when handling a call is ended this method is:

Task SaveStateAsync(CancellationToken token)

This is the place where the actor state is persisted. Let’s take a look at the implementation details behind the default actor state manager.

For persisted actors, the call eventually lands in the KvsActorStateProvider class which is nothing more than just a wrapper around KeyValueStoreReplica. During SaveStateAsync call:

  1. KeyValueStoreReplica transaction is opened
  2. All the states that were changed/added/removed have appropriate methods invoked (Update, Add, Remove)
  3. Transaction is committed.

It’s all good, but how to ensure that one can easily access all the states of a specific actor? How to make it easy and accessible with one call?

The key is… the key

Because KeyValueStoreReplica provides no indexes and accepts just a string as the key, you could think that the key formatting is crucial to ensure the ability to query I mentioned before. That’s true.

To make it happen the following algorithm is used to encode the key

var key = $"Actor_{actorId}_{stateName}"

As you can see every single actor state starts with the “Actor_” prefix. Then, the actorId is used, then the state name is appended. This means that every state of the actor has the same name, which means that we could use Prefix query with the following prefix to make it happen:

var prefix = $"Actor_{actorId}"

Summary

After reading this post you should understand how actors use the underlying Service Fabric store and how they ensure that all the values of a specific actor instance are collocated to make them easy to access together.

Service Fabric – KeyValueStoreReplica (2)

TL;DR

In the previous post we started the top bottom journey into the Service Fabric actor model. The very last step was encountering KeyValueStoreReplica that provides persistence capabilities for persisted actors. Today, we’ll review the capabilities of this distributed, associative data storage.

Transactional

The first and quite interesting attribute of KeyValueStoreReplica is its transactionality. You can open a transaction by calling a regular CreateTransaction method and later on commit it with CommitAsync like in the following example.

using (var tx = replica.CreateTransaction ())
{
  // do something
  var seqNumber = await tx.CommitAsync()
                          .ConfigureAwait(false);
}

This is the way actors ensure that all the states are stored atomically. No partially stored state, it’s all or nothing.

As you probably noticed, there’s a sequence number that is returned when a transaction is committed. We’ll get back to it shortly.

Key Value store

KeyValueStoreReplica is a key value store. Although it’s transactional as we saw above, it can store only binary payloads accessed by string keys. You can easily:

  • Add / TryAdd
  • Update / TryUpdate
  • Remove / TryRemove
  • Get / TryGet

a value by passing the ongoing transaction (you can’t do anything without an active transaction), the key and the value when needed. See the following example:

using (var tx = r.CreateTransaction ())
{
  r.Add (tx, "key1", new byte[] {1,2});
  r.Update (tx, "key2", new byte[] {3,4});
  var seqNumber = await tx.CommitAsync()
                          .ConfigureAwait(false); 
}

There’s an additional parameter that you can pass to all the methods changing data. It’s a sequence number.

Sequence number

The sequence number is simply an optimistic concurrency marker. Once you retrieve it either by getting it with data when Getting value or obtain it from a committed transaction, you can use it to Update or Delete values conditionally. If between your first call that resulted in obtaining the sequence number and the second used for Update or Delete something changed any value accessed in the second transaction it will fail.

This pattern may lower the need of rereading data every time and simply use the marker returned from the last committed transaction.

Prefix query

Although KeyValueStoreReplica is a key-value store it provides one additional way of querying data instead of getting the values one by one. This feature is called a prefix query. Consider the following example where two values are added with keys that have a common prefix. They can be retrieved in one call and returned as an enumerator

using (var tx = r.CreateTransaction ())
{
   r.Add (tx, "key1", new byte[] {1,2});
   r.Add (tx, "key2", new byte[] {3,4});
   var enumerator = r.Enumerate(tx, "key");
   // enumerator has: "key1", "key2"
}

Summary

In this blog post we saw that the underlying storage for stateful part of the Service Fabric cluster is much more than a dummy key value store. It is a transactional db, it enables optimistic concurrency and enables a prefix query that with a proper key design can be leveraged to do a lot. But this is a topic for another blog post.

Service Fabric – KeyValueStoreReplica (1)

TL;DR

In the previous post I introduced Sewing Machine, a helper library for building more on top of strong Service Fabric foundations. Before building a thing, one needs to know the foundations though. That’s where we start our Service Fabric journey.

Actors

One of the paradigms that are supported by Service Fabric is a Virtual Actor pattern. Actors are simple and small single threaded executions, that have their own state and a lifecycle. The Virtual Actor pattern ensures that you never need to instantiate any actor. Once you access an actor via its id, it will be created and from now on hosted somewhere in the cluster. Where? That depends on the balancer. All you need to know is an actor type and its identifier.

More about actors you can read in a good introduction provided in here

Actors state

Service Fabric actors can have multiple states. They are accessible by the StateManager property almost like a regular dictionary. If the actor is marked as Persisted, you have a guarantee that its state will be stored between calls, ensuring that in case of a machine failure, it can be restored on another one without data loss.

[StatePersistence(StatePersistence.Persisted)]
class CountingActor : Actor, ICountingActor
{
  public Task SetCountAsync(int value)
  {
    return this.StateManager
      .SetStateAsync("Counter", value);
  }
}

Additionally, multiple states of the same actor can be changed simultaneously and will be stored in a transaction. Yes, you can’t run into a situation with a partially stored state of a persisted actor.

Why would one store different states then? Think of it as a simple partitioning your actors’ data. If you don’t need the whole state every time, maybe it’s good to split them according to their use cases.

The StateManager behind the curtain implements a simple Unit of Work. If you access, set or delete the state under the same key within one call, the StateManager will track data and flush it properly in a transaction at the end of the call.

But transaction means db

If we say transaction, it means that there must be a database behind it. This is the case of Persisted actors as well. Behind the interface of IActorStateProvider providing the API for Actor state management, its implementation calls a very particular class providing access to a transactional, replicated, associative data storage component to service writers – ready for integration into any Service Fabric service. This is the database responsible for storage. It’s called KeyValueStoreReplica.

Summary

In this first blog post we followed from the high level API of actors’ framework, through the state management to the underlying database. In the following posts we’ll dive deeper into capabilities provided by this storage.

Sewing Machine for Service Fabric

TL;DR

This is an introductory post for my new Open Source journey project, Sewing Machine that aims at extending capabilities of Azure Service Fabric services in various ways. It’s focused on speed and performance, but also aims at delivering more capabilities build on top of the powerful Service Fabric foundations.

Services and actors

Service Fabric provides various capabilities. It can host processes and containers, enabling you to control resources usage on a much more granular level. You can host multiple processes on one VM, if they don’t require that much CPU/memory. Beside this, SF provides following models for writing your applications:

  • stateless services
  • stateful services
  • actor framework

These parts will be covered in the forthcoming posts.

Can we do more

I’ve spent some time reading official docs and decompiling Fabric sources and found that there are possibilities of building more on top of these strong foundations. Does more mean better? Possibly yes. I think that in Sewing Machine I’ll be able to allocate much less, pin no memory and use better serialization. This is low level. On a higher level I hope for:

  • event sourced actors
  • better use of secondary replicas (like using them for running processes etc)
  • multi actor projections

Journey, not path

Sewing Machine is in its initial phase. This is another journey project of mine as it’s a project based on uncovering what’s behind Service Fabric. I will share my findings in following blog posts and work towards reaching the aims above. This also means that version 1.0 is highly unlikely to be published within a month or two. For sure this journey will take a bit more and hopefully findings will be good enough to release it.

I hope you’re eager to do some sewing with this fabric. I am.