Designing a multitenant system puts a hard requirement on the designer: data must not leak between tenants. Would anyone want to show a list of one company’s employee emails to another company?
Azure Table Storage
Azure Table Storage is a part of Azure Storage Services. It’s mentioned in the original Windows Azure Storage whitepaper and provides a stable foundation with known limitations, quotas and an API that hasn’t changed for ages (ok, years). Its most important aspect is throughput, which is limited in two dimensions:
- partitions – a partition is defined by a partition key value and can CRUD at most 500 entities (rows) per second
- storage account – an account can process at most 10k operations per second
These two numbers can impact the performance of an app and should be taken into consideration when designing storage.
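To see how the two limits interact, here is a bit of back-of-the-envelope arithmetic. It is only a sketch: the constants are the numbers quoted above, not values read from Azure, and the function names are made up for illustration.

```python
# Limits as quoted in the text above; treat them as inputs, not as live Azure quotas.
PARTITION_OPS_PER_SEC = 500
ACCOUNT_OPS_PER_SEC = 10_000


def partitions_to_saturate(account_limit: int = ACCOUNT_OPS_PER_SEC,
                           partition_limit: int = PARTITION_OPS_PER_SEC) -> int:
    """Smallest number of fully loaded partitions that reaches the account limit."""
    return -(-account_limit // partition_limit)  # ceiling division


def max_throughput(hot_partitions: int) -> int:
    """Upper bound on operations per second for a given number of busy partitions."""
    return min(hot_partitions * PARTITION_OPS_PER_SEC, ACCOUNT_OPS_PER_SEC)


print(partitions_to_saturate())  # 20
print(max_throughput(4))         # 2000
print(max_throughput(50))        # 10000
```

With these numbers, twenty busy partitions are enough to use up a whole account, no matter how many other partitions sit idle.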
There’s a library which provides an event sourcing store on top of Azure Table Storage. It’s called StreamStone. It provides a lot of capabilities but not a from-all projection (see this PR for more info, including my notes). This can be added (not easily), which I’ve done at the cost of some overhead on the write side.
With the storage problem solved, how would you define and design a multitenant system?
One to rule them all
The initial attempt could be to add the company identifier to the partition key. Just use it as a prefix. That could work. Until one of the following happens:
- a scan query without a condition is issued – just like SELECT * without a WHERE clause, yeah, that would be scary
- a company uses our app in a way that impacts others – it’s easy, it only needs to saturate the account’s 10k operations per second
It looks like this could work, but it could just as well fail. So it’s not an option.
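The prefix idea and its first failure mode fit in a few lines. This is an in-memory sketch, not code against Azure Table Storage or StreamStone: the `Entity` type, the `|` separator, the `query` helper and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

SEPARATOR = "|"  # hypothetical choice; any character that never occurs in a company id


@dataclass(frozen=True)
class Entity:
    partition_key: str
    row_key: str
    payload: str


def partition_key(company_id: str, stream_id: str) -> str:
    """Prefix the partition key with the company identifier."""
    if SEPARATOR in company_id:
        raise ValueError("company id must not contain the separator")
    return f"{company_id}{SEPARATOR}{stream_id}"


def query(table: List[Entity], company_id: Optional[str] = None) -> List[Entity]:
    """Return matching entities; company_id=None is the scan without a condition."""
    if company_id is None:
        return list(table)
    prefix = company_id + SEPARATOR
    return [e for e in table if e.partition_key.startswith(prefix)]


# One shared table holding rows of two companies.
table = [
    Entity(partition_key("acme", "employees"), "1", "alice@acme.example"),
    Entity(partition_key("globex", "employees"), "1", "bob@globex.example"),
]

scoped = query(table, "acme")  # only acme's rows
leaked = query(table)          # forgot the condition: rows of every company
```

Nothing in the storage itself stops the second call. The separation lives entirely in every caller remembering to pass the company identifier.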
Fortunately, Azure provides a management API for storage accounts. This means that an application can create storage accounts under the same subscription that are totally separated from each other. Like in a container or something. This confines a company’s performance limits to its own account. The problem of potential leakage is also addressed, because each company’s data is stored in a totally separate account.
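One practical detail of an account per company is the account name: Azure storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits. A minimal sketch of deriving such a name from a company identifier could look like this. The `tenant` prefix and the hashing scheme are my assumptions, not anything the management API prescribes, and since names are unique across Azure, creating the account can still fail if the name is already taken.

```python
import hashlib
import re

# Naming rule for storage accounts: 3-24 characters, lowercase letters and digits only.
ACCOUNT_NAME_RE = re.compile(r"^[a-z0-9]{3,24}$")


def account_name_for(company_id: str, prefix: str = "tenant") -> str:
    """Derive a deterministic, rule-conforming account name (hypothetical scheme)."""
    digest = hashlib.sha256(company_id.encode("utf-8")).hexdigest()
    name = f"{prefix}{digest}"[:24]  # hex digits are already lowercase
    if not ACCOUNT_NAME_RE.fullmatch(name):
        raise ValueError(f"prefix {prefix!r} does not produce a valid account name")
    return name


acme_account = account_name_for("Acme Corp.")
globex_account = account_name_for("Globex")
```

Hashing keeps arbitrary company identifiers (spaces, uppercase, punctuation) out of the name, at the price of names that are not human-readable.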
As mentioned by Adrian in the comments, there’s a limit of 200 storage accounts per single subscription, which is a high number to reach. Once you reach it, an additional layer of subscription management should be applied.
Who knows them all?
Of course, there’s a need for a governor: a module that knows all the accounts and manages them. This is the one limited surface of possible leakage, which leaves good separation for the rest of the application/system.
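A governor along these lines could be sketched as follows. Everything here is hypothetical: the `Governor` class, the `provision` callback (standing in for whatever call to the Azure management API actually creates an account) and the in-memory bookkeeping are illustrations, and the limit of 200 is simply the number quoted above.

```python
from typing import Callable, Dict, List

MAX_ACCOUNTS_PER_SUBSCRIPTION = 200  # the limit quoted in the text


class Governor:
    """Knows every company's storage account and creates missing ones."""

    def __init__(self, subscriptions: List[str],
                 provision: Callable[[str, str], str],
                 limit: int = MAX_ACCOUNTS_PER_SUBSCRIPTION) -> None:
        self._subscriptions = list(subscriptions)
        self._provision = provision  # (subscription, company_id) -> account name
        self._limit = limit
        self._accounts: Dict[str, str] = {}  # company id -> account name
        self._used: Dict[str, int] = {s: 0 for s in self._subscriptions}

    def account_for(self, company_id: str) -> str:
        """Return the company's account, provisioning it on first use."""
        if company_id in self._accounts:
            return self._accounts[company_id]
        for subscription in self._subscriptions:
            if self._used[subscription] < self._limit:
                account = self._provision(subscription, company_id)
                self._used[subscription] += 1
                self._accounts[company_id] = account
                return account
        raise RuntimeError("every subscription has reached its account limit")


def fake_provision(subscription: str, company_id: str) -> str:
    """Stand-in for a real management API call; just builds a label."""
    return f"{subscription}/{company_id}"


# A tiny limit makes the rollover to the second subscription visible.
governor = Governor(["sub-a", "sub-b"], fake_provision, limit=2)
first = governor.account_for("acme")
second = governor.account_for("globex")
third = governor.account_for("initech")  # sub-a is full, so this lands in sub-b
```

The rest of the system only ever asks the governor for one company’s account, so this class is the single place where a mix-up between tenants could happen.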