Natural identifiers as subresources in RESTful services

There’s a project I’m working on which provides a great case study in legacy. Besides the code, there’s a database which frequently uses composite natural keys consisting of several values. While modelling with composite natural keys may feel natural, when it comes to putting a layer of REST services on top of that structure a question arises: how should the identifiers be modelled in the API? Should these natural keys leak into the API?

REST provides various ways of modelling an API. One of them is the subresource, represented as a URI with an additional identifier at the end. A subresource is nothing more than an identified part of the resource itself. With that said, take as an example a simple row with a composite natural key consisting of two values, <Country, City>. How could one model access to cities? (For the sake of this example I assume that there are cities all around the world sharing the same name but located in different countries, and that all cities within a given country have distinct names.) How could one provide a URI for that? Is the following the right one?
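For illustration, one possible shape of such a subresource-style URI (the exact segment names are a guess at the original example):

```
/api/country/Poland/city/Warsaw
```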


The API shows Warsaw as a Polish city. That’s true. This API has the nice property of being easy to consume and navigate. Now consider the following example:
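A standalone city resource carrying the whole composite key at the end could look like this (again, the exact path is illustrative):

```
/api/city/Poland,Warsaw
```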


Now it’s a bit uglier: both the country and the city name are at the end. This is a bit different for sure and tells nothing about the country accessible under /api/country/Poland. The question is: which is better?

Let me abuse the DDD Aggregate term a bit and follow its definition. Are there any operations that can be performed against the city resource/subresource that do not change the state of the country? If yes, then in my opinion modelling your API with subresources says something totally different: hey, this is a city, a part of this country; it’s a subresource and should ALWAYS be treated as a part of the country. Consider the second take. This one presents a city as a standalone resource. Yes, it is identified by a composite natural key consisting of two dimensions, but this is a mere implementation detail. Once the usual identifiers like int or Guid are introduced, the API won’t change that much; or even better, the API could accept both of them, keeping the older combined id for consumers that don’t want to change their usage (easier versioning).
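The “accept both id formats” idea can be sketched as a tiny parser. The `CityId` type and the comma-separated composite format are my assumptions for illustration, not an API from this post:

```csharp
using System;

// Sketch: a city id that accepts either the new Guid surrogate key
// or the legacy "Country,City" composite natural key.
public readonly struct CityId
{
    public Guid? Surrogate { get; }
    public string Country { get; }
    public string City { get; }

    private CityId(Guid? surrogate, string country, string city)
    {
        Surrogate = surrogate;
        Country = country;
        City = city;
    }

    public static CityId Parse(string raw)
    {
        // New consumers send a Guid...
        if (Guid.TryParse(raw, out var guid))
            return new CityId(guid, null, null);

        // ...old consumers keep sending the combined natural key.
        var parts = raw.Split(',');
        if (parts.Length != 2)
            throw new FormatException("Expected a Guid or 'Country,City'.");
        return new CityId(null, parts[0], parts[1]);
    }
}
```

With such a parser both kinds of consumers hit the same endpoint, which is exactly the easier-versioning argument above.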

To sum up: do not leak your internal design, whether it’s a database design or an application design. Present your user a consistent view, grouping resources under the wings of transactional consistency.

SolrNet NHibernate integration

Every developer who creates scalable applications with a high read/write ratio finally has to move to some query-oriented storage which allows almost instant, advanced querying. Solr is one of the frequently used solutions, providing nice performance with advanced indexing/querying. Talking to Solr requires an API, and for .NET you can use SolrNet. As described on the project page, it provides a nice and simple way to integrate with NHibernate, simply by calling the following code:

NHibernate.Cfg.Configuration cfg = SetupNHibernate();
var cfgHelper = new NHibernate.SolrNet.CfgHelper();
cfgHelper.Configure(cfg, true); // true -> autocommit Solr after every operation (not really recommended)
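Once configured, the extended session can be used roughly like this. This is a sketch based on the SolrNet documentation; the exact member names (`OpenSession`, `ISolrSession`, `CreateSolrQuery`) and the `City` entity are assumptions that may differ between versions:

```csharp
using NHibernate;
using NHibernate.SolrNet;

// cfg and cfgHelper come from the configuration snippet above.
ISessionFactory sessionFactory = cfg.BuildSessionFactory();

using (ISolrSession session = cfgHelper.OpenSession(sessionFactory))
{
    // CreateSolrQuery sends the query to Solr instead of the sql database;
    // the session otherwise behaves like a regular NHibernate ISession.
    var entities = session.CreateSolrQuery("name:Warsaw").List<City>();
}
```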

The integration with NH allows you to use an extended session to query the Solr server. Additionally, all your updates, inserts and deletes will automatically be replayed against Solr. This feature is provided by NH event listeners registered by the described configuration helper. Let’s take a look and audit a bit of the code:

public Configuration Configure(Configuration config, bool autoCommit)
{
    foreach (Type type in this.mapper.GetRegisteredTypes())
    {
        Type type2 = typeof(SolrNetListener<>).MakeGenericType(new Type[] { type });
        Type serviceType = typeof(ISolrOperations<>).MakeGenericType(new Type[] { type });
        object service = this.provider.GetService(serviceType);
        ICommitSetting listener = (ICommitSetting) Activator.CreateInstance(type2, new object[] { service });
        listener.Commit = autoCommit;
        this.SetListener(config, listener);
    }
    return config;
}

For each of the mapped types a new listener is created. Is that OK? Why not use one listener (with no generics at all) handling all the mapped types?
Additionally, doesn’t the SetListener method clear all the previously registered listeners? So… what about the types handled before?

The more interesting question arises when looking through the SolrNetListener<T> code. All listeners in NH are singletons: they are initialized during the startup phase and used until the application ends. Hence, listeners should be stateless, or use some kind of current-context resolver passed in their constructor. The SolrNetListener uses WeakHashtable fields to store the entities which should be flushed to Solr. Effectively global state (because the listener is a singleton)? What about race conditions, locking, etc.? Let’s take a look:

The Delete method, called when a Solr delete should be deferred (the session uses a transaction), shows that the listener GLOBALLY locks the execution of all threads:

private void Delete(ITransaction s, T entity)
{
    lock (this.entitiesToDelete.SyncRoot)
    {
        if (!this.entitiesToDelete.Contains(s))
            this.entitiesToDelete[s] = new List<T>();
        ((IList<T>) this.entitiesToDelete[s]).Add(entity);
    }
}
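For comparison, buffered deletes can be scoped per transaction without one global lock. The sketch below uses ConcurrentDictionary as my substitution (it is not SolrNet’s fix, and it ignores the weak-reference aspect of the original WeakHashtable):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Sketch: buffer deletes per transaction; contention is limited to
// callers sharing the same transaction instead of all threads.
public class DeleteBuffer<T>
{
    private readonly ConcurrentDictionary<object, List<T>> entitiesToDelete =
        new ConcurrentDictionary<object, List<T>>();

    public void Delete(object transaction, T entity)
    {
        // GetOrAdd is thread-safe without an explicit global lock.
        var list = entitiesToDelete.GetOrAdd(transaction, _ => new List<T>());
        lock (list) // per-transaction lock, not a global one
            list.Add(entity);
    }

    // Take everything buffered for a committed transaction.
    public IReadOnlyList<T> Drain(object transaction)
    {
        List<T> list;
        return entitiesToDelete.TryRemove(transaction, out list)
            ? (IReadOnlyList<T>)list
            : new List<T>();
    }
}
```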

Furthermore, data can be committed to the Solr server but not committed to the SQL database! This can happen when flushing occurs: the override of the OnFlush method saves the data to Solr before flushing the db changes.
What if optimistic locking rolls back the whole transaction after the data were stored in Solr? I’d rather have no data in Solr and use some background worker to upsert data into Solr than have copies of nonexistent entities.
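One building block for deferring the Solr write is NHibernate’s ISynchronization, which runs after the database transaction completes. The wiring below is my own sketch of that idea, not SolrNet’s implementation:

```csharp
using System;
using NHibernate.Transaction;

// Sketch: run the Solr commit only after the sql transaction
// has really committed, never before.
public class SolrAfterCommitSynchronization : ISynchronization
{
    private readonly Action flushToSolr;

    public SolrAfterCommitSynchronization(Action flushToSolr)
    {
        this.flushToSolr = flushToSolr;
    }

    public void BeforeCompletion() { }

    public void AfterCompletion(bool success)
    {
        // success == false means rollback: skip Solr entirely,
        // so no copies of nonexistent entities are published.
        if (success)
            flushToSolr();
    }
}

// usage: tx.RegisterSynchronization(
//     new SolrAfterCommitSynchronization(() => solr.Commit()));
```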

The right behavior for Solr, with no global locks and with changes flushed to Solr only after the SQL db transaction commits, can be achieved with an Interceptor and a rewritten event handler. I’ll describe it in the next post.


Deiphobus design, pt. 1

It’s the right time to write about the Deiphobus design. I’ll start with an example of usage, then move to configuration, serialization and the model created from the config. In the next post the event design of the session implementation, the identity map and its usage will be explained, as well as the lifetime of a session and the query API.

The configuration provided with Deiphobus is a fluent, “just code” configuration. Each entity is described with a generic EntityClassMap.

/// <summary>
/// The mapping of the class, marking it as entity, stored under separate key.
/// </summary>
/// <typeparam name="T">The type to be mapped.</typeparam>
public abstract class EntityClassMap<T>
    where T : class
{
    protected void Id(Expression<Func<T, Guid>> id)
    protected IndexPart IndexBy(Expression<Func<T, object>> memberExpression)
    protected void SetSerializer<TSerializer>()
        where TSerializer : ISerializer
}

The class interface was designed to be similar to Fluent NHibernate:

  • the class specifies that the mapped type should be a mapped entity class
  • the Id method marks a specific property as the id. It’s worth noticing that only Guid identifiers are available
  • the second method is IndexBy, used for marking a property to be indexed with an inverted index. Only properties marked with this method can be queried in Deiphobus queries; running a query against a non-indexed property will throw an exception
  • the very last method allows setting a custom serializer type for the mapped entity type
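Put together, a concrete mapping might look like this. The `User` entity and the `JsonSerializer` type are made up for illustration; only the `EntityClassMap<T>` members come from the listing above:

```csharp
using System;

// Hypothetical entity; names are illustrative only.
public class User
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

public class UserMap : EntityClassMap<User>
{
    public UserMap()
    {
        Id(u => u.Id);                   // only Guid identifiers are supported
        IndexBy(u => u.Name);            // Name becomes queryable via an inverted index
        SetSerializer<JsonSerializer>(); // entity-specific serializer (made-up type)
    }
}
```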

All the mappings are consumed by a mapping container which registers all entity class maps. The maps are translated into EntityClassModel objects describing the properties of each entity. This process takes place when the session factory is created. On the basis of each model class, an object implementing the IEntityPersister interface is created. The persister implementation provides methods like GetPropertyValue or GetIndexedPropertyValues with emitted IL code, to overcome the reflection overhead. This class will be described later; the EntityClassModel’s members can be seen below:

/// <summary>
/// The class representing a model of mapped entity.
/// </summary>
public class EntityClassModel
{
    public EntityClassModel(Type classType, PropertyInfo id, object idUnsavedValue, IEnumerable<IndexedProperty> indexedProperties)
    {
        ClassType = classType;
        Id = id;
        IdUnsavedValue = idUnsavedValue;
        IndexedProperties = indexedProperties.ToList().AsReadOnly();
    }

    public Type ClassType { get; private set; }
    public PropertyInfo Id { get; private set; }
    public object IdUnsavedValue { get; private set; }
    public IEnumerable<IndexedProperty> IndexedProperties { get; private set; }
    public Type SerializerType { get; set; }
}

The very last part of this entry is about serialization in Deiphobus. Because Cassandra is used, each entity is stored under one key, in one column family, in one column. The entity is serialized at the moment of storing. The serialized entity is stored in Cassandra along with its inverted indexes, which are based on values retrieved just before saving the entity in the database. At the moment, two levels of serializers can be set up:

  • the default, used by all classes not having their own
  • entity class specific

All remaining types are always serialized using the default serializer. This behavior may be subject to change.
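The two-level lookup can be imagined like this; the `SerializerRegistry` name is mine, a sketch of the rule rather than Deiphobus code:

```csharp
using System;
using System.Collections.Generic;

// Sketch: resolve a serializer type per entity, falling back to the default.
public class SerializerRegistry
{
    private readonly Dictionary<Type, Type> perEntity = new Dictionary<Type, Type>();
    private readonly Type defaultSerializer;

    public SerializerRegistry(Type defaultSerializer)
    {
        this.defaultSerializer = defaultSerializer;
    }

    // Called for entities mapped with SetSerializer<TSerializer>().
    public void Register(Type entityType, Type serializerType)
    {
        perEntity[entityType] = serializerType;
    }

    // Entity-specific serializer wins; everything else gets the default.
    public Type Resolve(Type entityType)
    {
        Type found;
        return perEntity.TryGetValue(entityType, out found) ? found : defaultSerializer;
    }
}
```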


I’ve just finished reading the Dremel whitepaper. It seems that Google has one more time brought to life something which may change the IT world. Imagine queries running against trillions of rows and returning results in a few seconds; imagine a fault-tolerant db that scales linearly and still allows you to query it in very advanced ways (for instance, using grouping). Yeah, I’m aware of VoltDB, but Dremel’s description was just astonishing.

On the other hand: have you ever had the chance to test your scalable app on 2900 servers? 🙂