False sharing is dead, long live the Padded

Posted on April 11, 2016 · 2 mins read · tagged with: #False sharing #Fody #RampUpNet #threading

False sharing is a common problem in multithreaded .NET applications. If you allocate objects in or for different threads, they may land on the same cache line. Writes from one thread then invalidate that line for the others, which hurts performance and limits the gains from scaling your app on a single machine. Unfortunately, because of its multithreaded nature, the RampUp library has been suffering from the same condition. I’ve decided to address it by providing tooling rather than going through the whole codebase and applying LayoutKind.Explicit with plenty of FieldOffsets.
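For comparison, here is a rough sketch of what the manual approach looks like for a single type. The type name ManuallyPaddedCounter is made up for illustration and is not taken from RampUp, and the numbers assume a 64-byte cache line, which is common but not guaranteed on every CPU.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type, not part of RampUp or Padded.
// Assumes a 64-byte cache line.
[StructLayout(LayoutKind.Explicit, Size = 128)]
public struct ManuallyPaddedCounter
{
    // 64 bytes of padding precede the hot field, so it does not share
    // a cache line with whatever is stored directly before this struct.
    [FieldOffset(64)]
    public long Value;

    // Size = 128 leaves another 56 bytes of padding after the field.
}
```

In an array of such structs, consecutive Value fields end up 128 bytes apart. Doing this by hand for every hot type in a codebase is exactly the chore that a weaver can take over.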

Padded is born

The easiest and best way of addressing cross-cutting concerns in .NET apps that I’ve found so far is Fody. It’s a post-compiler/weaver based on the mighty Mono.Cecil library. The tool has decent documentation, allowing one to create even a quite complex plugin in a few hours. Because of these advantages I’ve already used it in RampUp, but I wanted something that can live on its own. That’s how Padded was born.

Pad me please

Padded uses a very simple technique: it adds a dozen additional fields. According to the test cases provided, they provide enough space to prevent the object from overlapping with another object in the same cache line. All you need to do is:

  1. install the Padded NuGet package in the project that requires padding
  2. declare one attribute in your project:

     using System;

     namespace Padded.Fody
     {
         public sealed class PaddedAttribute : Attribute { }
     }

  3. mark the classes that need padding with this attribute.
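The steps above might look like this in practice. This is only a sketch: the MyApp namespace and the WorkerCounter class are hypothetical names chosen for illustration, and the padding fields themselves are added by the weaver at build time, so they never appear in the source.

```csharp
using System;

namespace Padded.Fody
{
    // The marker attribute declared in your own project.
    public sealed class PaddedAttribute : Attribute { }
}

namespace MyApp // hypothetical application namespace
{
    // Hypothetical class whose instances are written to by different threads.
    // Marking it is the only change needed in the source code.
    [Padded.Fody.Padded]
    public sealed class WorkerCounter
    {
        public long Processed;
    }
}
```

The attribute carries no members; it only marks the types that should be padded.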

Summary

Marking a class/struct with one attribute is much easier than dealing with its layout using .NET attributes, especially as they were not created for this purpose. Using a small, custom tool to get the needed result is the way to go. That’s how and why Padded came to be.


Comments

I wonder how you realized that there was a false sharing cache problem? The problem is not obvious at all until you know about it.

by karolpawlowski1990 at 2016-04-19 22:20:05 +0000

This isn't the most inspiring answer ;-)

I haven't run any specific tool. I changed the code, then I reran a performance test multiple times and it performed better.

I'd love to dig in with tools like Intel® VTune™ Performance Analyzer or Intel® Performance Tuning Utility, but they cost a lot ($1000) and you need to spend some time to really get to know them. I'm looking forward to having this possibility one day.

by Szymon Kulec 'Scooletz' at 2016-04-20 05:02:54 +0000