The cost of scan queries in Azure Table Storage

Posted on July 28, 2015 · 2 mins read · tagged with: #azure #azure table storage #cloud #performance

There are multiple articles describing the performance of Azure Table Storage. You've probably read Troy Hunt's entry, Working with 154 million records on Azure Table Storage…. You may have invested your time in reading How to get most out of Windows Azure Tables as well. My question is: have you really considered the limitations of queries, specifically scan queries, and how they can consume a major part of the Azure performance targets?

The PartitionKey and RowKey together form the primary and the only index in ATS (Azure Table Storage). Depending on how a query uses them, the following kinds can be distinguished (each is sketched in code below):

  1. Point Queries, which retrieve a single entity by specifying an exact PartitionKey and RowKey with equality predicates
  2. Row Range Queries, which return a set of entities sharing the same PartitionKey, within a range of RowKeys
  3. Partition Range Queries, which are run with a range of PartitionKeys
  4. Full Table Scans, which have no predicate for PartitionKey at all
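For illustration, the four kinds could look like this with the Python azure-data-tables SDK; the table name, keys and filter values below are made-up assumptions, not part of any real schema:

```python
from azure.data.tables import TableClient

# Connection string and table name are placeholders.
table = TableClient.from_connection_string("<connection string>", table_name="Events")

# 1. Point query: exact PartitionKey and RowKey, answered straight from the index.
entity = table.get_entity(partition_key="stream-42", row_key="0000000007")

# 2. Row range query: one partition, a contiguous range of RowKeys.
rows = table.query_entities(
    "PartitionKey eq 'stream-42' and RowKey ge '0000000000' and RowKey lt '0000000100'")

# 3. Partition range query: a range of PartitionKeys.
partitions = table.query_entities(
    "PartitionKey ge 'stream-40' and PartitionKey lt 'stream-50'")

# 4. Full table scan: no predicate on PartitionKey at all.
everything = table.query_entities("Timestamp gt datetime'2015-01-01T00:00:00Z'")
```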

What are the costs and limitations of these queries? Unfortunately, every entity that a query scans over is counted as a table operation; there ain't no such thing as a free lunch. This means that if you scan your entire table (the 4th scenario), you'll be able to process no more than 20,000 entities per second, as that is the performance target for a whole storage account. This limits the usefulness of scans over large data sets.

If you have to model queries across different keys, then you may consider storing the same value twice: once under the natural Partition/RowKey pair and a second time to match the other index, creating an inverted index. If, in any case, you have to scan through the entire data set, then using ATS is not the way to go, and you should consider other ways of modelling your data, like asynchronously copying the data to blob, etc.
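A minimal sketch of that dual-write idea, continuing the hypothetical Python code above (the entity shape and key prefixes are assumptions for illustration):

```python
# Store the same fact under two key layouts so either lookup is a point query.
# The key prefixes ("user-", "email-") are made up for this example.
def save_user(table, user_id, email):
    # Natural index: find the profile by user id.
    table.upsert_entity({
        "PartitionKey": f"user-{user_id}",
        "RowKey": "profile",
        "Email": email,
    })
    # Inverted index: find the user by email.
    table.upsert_entity({
        "PartitionKey": f"email-{email}",
        "RowKey": "profile",
        "UserId": user_id,
    })
```

Both lookups then hit the index directly instead of paying for a scan.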


Comments

"If you have to model queries across different keys, then you may consider storing the same value twice: once under the natural Parition/RowKey pair and the second time to match the other index, to create an inverted index."

I use this technique a lot. I then use prefix queries to scan a range of rows with a particular row key prefix. This is how inline synchronous projections work in Streamstone. You save an event and then you can include an additional table entity within the same batch (an Entity Group Transaction), which could be an updated projection and/or a snapshot of the source aggregate. All done synchronously and transactionally.
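To sketch the idea (in Python azure-data-tables for illustration — Streamstone itself is .NET, and all the names below are invented; `table` is a `TableClient` as in the earlier snippets):

```python
# An event and its updated projection share a PartitionKey, so they can be
# committed atomically in one Entity Group Transaction (batch).
event = {
    "PartitionKey": "stream-42",
    "RowKey": "0000000008",        # event number in the stream
    "Type": "ItemAdded",
}
projection = {
    "PartitionKey": "stream-42",
    "RowKey": "PROJ|cart-totals",  # a row key prefix marks projection rows
    "ItemCount": 8,
}
table.submit_transaction([
    ("create", event),
    ("upsert", projection),
])
```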

BTW, I also employ multi-level filtering, which is quite performant. Basically, it's a range filter, which you can then combine with a filter on any other column. The range filter efficiently slices the range of rows (narrowing the data set), and then the resulting set is iterated over by ATS. Simple and efficient, and it cuts the number of secondary index entities you need to create.
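Something along these lines (same hypothetical Python setup; bounding the range with the next ASCII character is the standard way to express "starts with" in ATS):

```python
# The RowKey range is served from the index; the ItemCount predicate is then
# applied while ATS iterates that slice. '}' is the ASCII character after '|',
# so the range covers every RowKey starting with 'PROJ|'.
matches = table.query_entities(
    "PartitionKey eq 'stream-42' "
    "and RowKey ge 'PROJ|' and RowKey lt 'PROJ}' "
    "and ItemCount gt 5"
)
```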

by Yevhen Bobrov (@yevhen) at 2015-07-28 17:59:33 +0000

+1 It's important to know that a scan across rows based on a non-row condition will access all the scanned entities though.

I'm working on a different take on modeling ES now in ATS. It'll be published soon :-)

by scooletz at 2015-07-28 18:34:06 +0000

> It’s important to know that a scan across rows based on a non-row condition will access all the scanned entities though.

Right. But only if there is no filter on a row key range, right?

> I’m working on a different take on modeling ES now in ATS. It’ll be published soon

Cool. Interesting to know what I did wrong with Streamstone ;)

by Yevhen Bobrov (@yevhen) at 2015-07-29 09:09:16 +0000

Correct me if I'm wrong, but as far as I can see, the filter is first applied to the row key range, and then the iteration goes over all the remaining entities, filtering with the additional criteria. If your partition+row criteria select 100 rows and the rest of the predicate narrows them down to 1, you get only that one, but the scan still goes through all 100, eating up your storage account performance.

by scooletz at 2015-07-29 18:42:53 +0000

>> If your partition+row criteria select 100 rows and the rest of the predicate narrows them down to 1, you get only that one, but the scan still goes through all 100, eating up your storage account performance

Ye. But it's still heaps better than iterating through all the rows in a partition :)))

by Yevhen Bobrov (@yevhen) at 2015-07-30 21:19:42 +0000

Yes, of course :-) You may consider the case when a new projection arrives and needs to scan through the entire event set. This can consume the mentioned 20,000 operations per second easily.

by scooletz at 2015-07-30 22:17:09 +0000

> You may consider the case when a new projection arrives and needs to scan through the entire event set. This can consume the mentioned 20,000 operations per second easily.

Ye, but there's no other way around it. Except not using ATS at all )))

If you need to scan the whole stream, then you will pay the full price with ATS. It's comparable to simply loading it on the client and iterating over it there.

But the good news is that this is not really a problem, since what the cloud (and ES) allows you to do is make a parallel copy of the projection you want to rebuild and replay events at whatever pace you are OK with (without seriously interfering with writes).

by Yevhen Bobrov (@yevhen) at 2015-07-30 23:20:27 +0000

I agree.

by scooletz at 2015-07-31 05:13:37 +0000