Maintaining Consistent Transactional States without a Global Clock
- Hide Paper Summary
Link: https://link.springer.com/chapter/10.1007%2F978-3-540-69355-0_12
Year: LNCS 2008
Keyword: TL2; STM; Global Clock; Thread Local Counter
This paper proposes a cache friendly implementation of global commit counter commonly used by software transactional memory designs, called Thread Local Counter (TLC) . Many STM and Hybrid TM algorithm, notably Transactional Locking II (TL2) and NORec, relies on a global timestamp counter to either commit events, either for individual data items or globally. The most straightforward implementation of the counter is to use a 64 bit integer incremented every time a transaction commits. Verification of read sets is performed by comparing the value of the counter when the transaction begins and the most up-to-date value on data items or of the same counter.
Simple as it is, the straightforward implementation does not scale. On today’s multicore architecture, communication between cores and sockets has become a major source of overhead for memory instructions. Using an integer as the global timestamp counter will cause the cache line to be circulated around the cores that increment and read the cache line. This can greatly increase the latency of memory instructions, and also incur contention. This issue has already been recognized in other areas of research, and can be addressed by decentralization of the global counter. For example, in epoch-based garbage collection, the global counter which counts the number of active threads can be replaced by an array of thread local timestamps that record the last time the thread is active. The GC algorithm, instead of collecting garbage memory for epoches whose counter have dropped to zero, scans the array of thread local timestamps, and calculates the minimum. Garbage memory freed before the minimum timestamp can be freed, as no thread can possibly hold a reference to it.
TL2 is a special type of Multiversion OCC (MV-OCC) algorithm in which only one version per data item is maintained. TL2 transactions validate the read set on every read, maintaining a consistent snapshot during the lifetime of the entire transaction. The classical TL2 transaction obtains a begin timestamp (bt) from the global timestamp counter at transaction begin. On transactional read, they sample the version of the data item, perform read, and then sample again. The transaction will abort if the second sample is locked, or if two versions disagree, or if versions agree but are larger than current bt. On transaction commit, the read set is validated again after all dirty items are locked. This is followed by a write back phase, in which the global commit counter is incremented, and all dirty values are written back. The incremented value of the commit counter is the commit timestamp ct. At the end of the write back phase, the version of data items are updated to be the value of ct.
One observation is that a monotonically increasing counter is too strong to identify concurrent commits within the read phase. In fact, what we need is a notation of ordering. We hope that all write operations on the read set of a transaction after it starts can be detected. Using a global counter, it is the ordering relation of integer, i.e. the ordering of commit counter increment to obtain ct and the read of the counter to obtain bt. The design of TLC follows exactly this observation. Each thread i keeps a thread local array of counters, representing the version of other threads the last time i sees them by validating a read operation. Each thread also keeps a thread local counter as its own source of commit timestamps. The version consists of three fields: The lock bit, the ct of the transaction that commits it, and the identify of the thread that runs the transaction. The validation rule changes as the following. On transactional load, two sampling are performed as well as the read operation. After that, the validation routine runs on transaction i extracts the identify of the thread j that lastly committed on the data item. Transaction i then compares the version stored in the counter against the local version of j. If the value of the local version is smaller than the version of j, then i updates its local version of j to the current version, and then aborts. Otherwise, read validation succeeds. On transaction commit, the committing transaction increments its local counter using normal store instruction, and use the after value as the ct. The value of ct, concatenated with the identity of the committing transaction, is stored into the item metadata as data items are unlocked.
This algorithm can detect all cases that transaction j overwrites transaction i’s read set, with possibilities of false positives. We prove its correctness by reasoning about the read-after-write dependency introduced by reading the value. The condition that i could successfully read the value written by j is that i’s local version of the counter is greater than or equal to the version stored in the item metadata. One step further, according to the algorithm specification, the only possibility that i could update its local counter of j is when i previously read another value written by j, and aborted. This implies that the current transaction t3 on i is after an older transaction t2 on i, which in turn sees the version written by an even earlyer transaction t1 on j. Since the update of data item versions happen at the end of the write back phase, we know that t1 must entered its write phase even before t3 starts, which implies that t1 entered its write phase before t3 does so. According to TL2’s correctness proof, transactions are serialized by the order they finishe locking the write sets. It is clear that t1 must logically occurred before t3, and it is legal that t3 read t1’s committed value.