Friday, June 24, 2011

Vector Clocks

Vector clocks are a way of versioning data so that there is an indication of the history of which actors in your system have been involved in the history of that data.

Confused?  Let's consider a 'real world' analogy to the problem vector clocks helps to solve.  You have four rugby fans:  a Leinster fan; a Munster fan; a Connacht fan and an Ulster fan.  Together they are going to try to determine who they think should be wearing the number 6 jersey for Ireland in the 2011 Rugby World Cup.  Now:
  • The fans are all able to communicate their selection for the jersey with each other
  • They can send their selection to everyone or just one other fan
  • They are able to change their mind
  • They are able to communicate concurrently. This means the Munster fan can send a text to Leinster fan while at the same time the Connacht fan can send a text to the Ulster fan.  
  • Each fan is able to give an indication of the version he got before he sends his selection on. This may or may not have influenced their own selection!
Sean O'Brien
So it starts off. The Leinster fan is using Twitter and has the other 3 lads subscribed.  He tweets: Sean O'Brien.   The Munster fan thinks about this for a while.  He can't deal with it.  He's a proud Munster fan.  The pressure gets to him and he sends a text message to the Connacht fan that the man who should be wearing number 6 is  Denis Leamy.  It's just a text to the Connacht fan; the Ulster and Leinster fan are unaware of this communication.   The Connacht fan decides decides to amuse himself and decides to agree.  He texts back Denis Leamy to the Munster lad. The Ulster lad then sends a text to the Connacht lad which reads: Stephen Ferris.  The Connacht fan can see the potential tension brewing; especially between the Munster and Leinster lads so he decides to be diplomatic and decides to change his selection to Stephen Ferris.

The Leinster lad then looks for the other lads' answers.  Unfortunately he's lost the Ulster lads phones number so he can only communicate with the Munster and Connacht lads.  He sees the Munster lad has chosen Leamy and the Connacht lad has chosen Ferris.  He can see there's clearly been a contradiction.  But he can also see that the Connacht lad has received communication from the Munster lad and the Ulster lad and that the Connacht lad has changed his mind to agree with the Ulster lad.  He sees the consensus and decides to go with Ferris (O'Brien can always play 7 or 8). Eventually the Munster fan sees it's completely overruled and goes with Ferris.

So how are fans communicating not only their selections but indicating what they knew before their selections? They are using vector clocks!

You see, these lads aren't just rugby fans; they are nerds.

The lads agree an communication protocol between themselves. Not only will they text their choice for the number 6 jersey they will also include a vector of 4 numbers.  The first number will correspond to the Leinster version, the second the Ulster version, the third the Connacht version and the fourth the Munster version. Why are they doing this?  So they can indicate what they know about other people's selection before they made their own.

Ok, I appreciate that's confusing.  But rugby itself can be confusing. So be patient.  Let's go back to the example and examine the full messages they sent.

The Leinster fan starts - he sends out his tweet to all other fans: Sean O'Brien, {1, 0, 0, 0}.  This means Sean O'Brien is the first version from the Leinster fan (indicated by the value 1 as the first element in the vector).  It also indicates the Leinster fan had no record of any other version from any other fan before sending this out (indicated by the value 0 as the other three elements).  Then the Munster fan sends Denis Leamy {1, 0, 0, 1} to the Connacht fan. This indicates two things to the Connacht fan:
Denis Leamy
  • The Munster fan got Sean O'Brien {1, 0, 0, 1} from the Leinster fan. The Connacht fan can be sure of this.  Because the first digit in the vector corresponds to the Leinster version.  This is 1.
  • The Munster fan has updated his selection to Denis Leamy. This is indicated by the last element in vector having the value 1.  The last element corresponds to the Munster version.
Then, the Connacht fan sends Denis Leamy {1, 0, 1, 1} back to the Munster fan.  The Munster fan ascertains from this message that:
  • The Connacht fan got the original Leinster choice  - since the first digit is one
  • The Connacht fan got the Munster fans 1st selection  - since the last digit is one
  • And poor old John Muldoon  because the Connacht fan has made his selection and agrees with the Munster fan.  This is ascertainable because the message is Denis Leamy and the third digit is 1.
Then, the Ulster fan sends a text with his selection to just the Connacht fan: Stephen Ferris {1, 1, 0, 0}. This indicates:
  • The Ulster fan got the original Leinster choice - since first digit is one.
  • The Ulster fan has indicated his preference for the Stephen Ferris - from the message itself and because the second digit is one.
  • The Ulster fan knows nothing of the Munster selection or Connacht's selection - since last two digits are 0.
The Connacht fan sees things are getting messy now.  He's knows:
  • Leinster fan chose O'Brien
  • Muster fan chose Leamy
  • Ulster fan chose Ferris
He thinks about it and decides to change his mind to Ferris.  So he sends Ulster: Ferris {1, 1, 2, 1}
With this message, the Connacht fan is saying:
Stephen Ferris
  • I saw the Leinster version 1. You know that's O'Brien.
  • I also saw the Munster's version 1.  You don't know that's Leamy.
  • My initial selection was something other than Ferris. You don't know what that was either.
  • My current selection is Ferris.






So overall, it is possible now to see that:
  • Leinster's selection is O'Brien; 
  • Ulster and Connacht say Ferris and Munster says Leamy
and very importantly
  • Everyone saw Leinster's selection
  • Connacht saw Munster and Ulster's selection
  • Ulster saw Connacht's second selection
This information is very important as it makes it easier to resolve the conflict. It is not the responsibility of Vector clocks to resolve the conflict. That is someone else's responsibility.  But Vector clocks make it easier for someone else to resolve the conflict because they indicate what each actor knows or doesn't know about other actor's selection. Are you listening Declan Kidney?

Declan Kidney - does he use Vector clocks?

What does this mean in architectural terms?
The Rugby fan analogy serves as a nice intro to the use of vector clocks in distributed architectures.  In a simple and conventional architecture, you often have just one relational database.   It's running on a powerful box with lots of CPU power and disk space. The database is not distributed and it is not replicated.  You can have a column for each domain entity in its corresponding table to track its version.  This is usually a numeric value which indicates the version of the domain entity.  At any time a client can check the version it has in memory with the version it has in the database to ensure it does not have stale data.  All easy. All cool.

Now, sometimes you have replication.  Usually one database is the master and one (or more) is then the slave.  If the master goes offline, the slave takes over. When the master comes back online, resync happens. Again all straight forward.  No need for vector clocks.

In a high end architecture you are going to be under pressure to scale.  You'll be under pressure to ditch relational database as they don't scale well and use a NoSql architecture instead. In this architecture to provide availability the data is replicated and it is distributed.  There is usually no dedicated master because this makes it harder to scale.  Instead there is just a ring of database nodes ( also just called servers).  Scaling involves adding more nodes with the intent of distributing the work out evenly - you see there really is no master.  The nodes can communicate with each other in a peer to peer fashion.  They need to be able to indicate what version of the data they have relative to all other nodes.  To do this they use vector clocks.

There's no master and data can still go out of sync

So suppose your data is distributed over four nodes and to guarantee high availability each piece of data replicated is across the nodes (it doesn't have to be replicated on every node but let's just keep things simple). When the data gets updated, unless you block until every single node is up to date as part of the one transaction your nodes can go out of sync.  That's slow and even if you do decide to do it, what happens if a Node goes down or a new Node joins the group?  You are going to still have conflicts.

Ok, so say you try to add a version column to your tables to represent entity versions. This will only get you so far.  It will only tell you what version the entity is in the scope of each node. It's quite possible for two Nodes to both have version 4 but because they both could have gone down and come back up at different times they data may not match!  So it's easy to see why version columns which only hold a single version number will not work.

Cue Vector Clocks!  

The vector clock tells you the version information not just for a single actor in your system but for a range of actors in your system.

Reverting back to our Rugby fan analogy.  Let's say there are four servers.
  • a request comes into your system from the leinster fan which makes the selection: Sean O'Brien. Leinster fan is identified as the first actor in the system so the vector clock is {1, 0, 0, 0}. All nodes get this version
  • a request then comes in from Munster fan.  Munster fan is identified as the last actor in the vector clock. Now, only the 4th database node gets this the update because all other servers are down.  The vector clock is {1, 0, 0, 1}. But remember only the 4th node gets this because all the other servers are down.
  • The third server comes back online and 4th node tells the 3rd node about Denis Leamy and that the vector clock is {1, 0, 0, 1}. It does this by using something like the Gossip protocol (text messages where one user contacts just one other user corresponds to the gossip protocol in our rugby fan analogy).  The 3rd Node updates to Denis Leamy and informs the 4th Node that it has done so.  Again it can do all this by Gossip protocol. It sends the message Denis Leamy {1, 0, 1, 1} back to the 4th Node.
  • Now the Ulster fan sends a request in and suggests Ferris.  The 1st node is still offline and it's a bad day for the sys admin team because the 3rd and 4th node have gone offline again.  However, the second node can process it because it has come back online.  It updates to Stephen Ferris and updates its Vector clock to {1, 1, 0, 0}. Remember the second node got no updates from the 3rd and 4th node.  It would have got them eventually but it got the request from the Ulster fan first.
  • Again the 3rd Node comes back online and again through the Gossip protocol it gets the update for Ferris.  It sees the contradiction and decides to resolve it by updating to Ferris.  It tells the 2nd node {1, 1, 2, 1}.   This shows the 2nd node that it had versions from itself and the 4th node that the 2nd node did not know about.  The 2nd node is cool with this because he also has Ferris so he has no conflict.
  • The 1st node comes back online.  Through the gossip protocol he eventually sees that the 3rd is at Ferris {1, 1, 2, 1} and the 4th Node is at Leamy {1, 0 , 0, 1}. It's a conflict.  But he sees that the 3rd node is more up to date so he goes for Ferris. It's clear that Ferris is more up to date.  Every element in its vector clock is equal to or later than the every element in the 4th node's vector clock so he updates to Ferris.  The 4th eventually realises the same and does the same.
Choosing the Actor

You'll notice in my example, I have four fans (corresponding to four clients) and four servers.  You'll also notice I had the Ulster fan hitting the 2nd Node, the Connacht fan hitting the 3rd node and the Munster fan hitting the 4th node.   
In reality:
  • Request could be hitting any node. Usually much more than one unless sys admin are having a bad day and are turning off machines to annoy you.
  • The data was been replicated on every node.  In reality you wouldn't usually do this. It would be overkill
  • And in the real world, you'd usually have way more clients than servers.  
Who were the actors? 
In the above example, it may have seemed like we chose the servers as the actors in our vector clocks but it was actually the clients.  Recall, the Leinster fan initially hit all four servers but the vector clock was just update to {1, 0, 0, 0} not {1, 1, 1, 1}. But let's consider the consequences of choosing the servers or the clients to be the actors for our vector clocks.

The server is the actor
Because there are less servers in a system this means your vector clocks will be smaller. If you've four servers your vector clocks will look like {S1, S2, S3, S4} instead of {C1, C2, C3, C4, C5, C6, C7, C8, C9, ...,C99999}. This means comparison between vector clocks will be quicker.  However it also means you can lose updates.  For example, suppose two clients are mapped to the same server and makes different updates.  The first update will be lost.

The client is the actor

In this case, vector clocks are much longer, comparisons take longer but all updates are tracked. This means that there can be a higher degree of certainty that conflicts get properly resolved.  In order to deal with the vector clocks getting unsustainable long, they can be culled periodically:  very old versions can be discarded.

Even when the client is the actor, the vector clock is still stored with the data in the server nodes.  The vector clock is always stored in the database nodes.

When are vector clocks used?
The primary use is in distributed systems. They are not responsible for resolving conflicts they are a versioning system that help someone else resolve them.  Sometimes it can be easy to resolve conflicts. For example, when there are two conflicting vector clocks, if every element in one vector clock is equal to or greater than every element in the other vector than it is easy to ascertain that that is the one more uptodate.  When that can't be done, the conflict has to be resolved by some other logic.

Vector clocks are used in systems such as Amazon's Dynamo.

References:
1. Basho's blog http://blog.basho.com/2010/04/05/why-vector-clocks-are-hard/
2. Basho's blog http://blog.basho.com/2010/01/29/why-vector-clocks-are-easy/
3. Amazon's Dynamo - http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
4. Rutgers http://www.cs.rutgers.edu/~pxk/rutgers/notes/clocks/index.html

1 comment:

  1. Simply great.nice analogy.clears concepts pretty well.thanks

    ReplyDelete