An Efficient Software Transactional Memory Using Commit-Time Invalidation
- Hide Paper Summary
Link: https://dl.acm.org/citation.cfm?id=1772970
Year: CGO 2010
Keyword: STM; InvalSTM; FOCC
Conflict Resolution: FOCC
Version Management: Lazy
R/W Bookkeeping: Bloom Filter
This paper presents InvalSTM, an STM design based on forward optimistic co currency control (FOCC). The paper identifies a few problems with commonly used concurrency scheme, BOCC, which is based on validation. First, BOCC only validates the committing transaction against committed transactions that can overlap its execution. If the write set of any of the overlapping transactions overlaps with the read set of the committing transaction, the latter has no choice but abort, since otherwise we violate the commit order. One consequence of this process is that a writing transaction to address X can abort all overlapping reading transactions on address X, making this scheme unfriendly for readers. Second, BOCC buffers updates to memory in its private workspace, and delays the final validation step to the commit point. Such lazy validation increases parallelism compared with eager conflict detection in which conflicts are detected as updates are made. It is, however, possible for uncommitted transaction to access inconsistent data due to concurrent read and commit. In other words, in BOCC, even if transactions are guranteed to be aborted if they reach the commit point, there is even no guarantee that this commit point will be reached eventually due to undefined behavior such as infinite loops, corrupted control flow, etc.. To deal with inconsistent reads, BOCC based TM system must perform incremental validation to ensure that all reads are consistent with regard to transaction begin time (i.e. all reads can be logically considered to have happened at an atomic time when transaction begins). This incremental validation requires checking all reads in the current read set, resulting in an O(n^2) extra cost where n is the number of reads in a single transaction (O(n) elements in the set and n checks performed, one for each read). The last problem is that for read-only transactions, BOCC still requires a validation step, which may result in the abort of the transaction. Being able to support read-only transactions efficiently has been important to STM, since in fact many non-read-only transactions in its static form may turn out as read-only.
This paper is based on invalidiation, which performs read-write set intersection in the “forward” direction (i.e. checking with transactions that have not committed, looking forward into the future). In its simplest form, a FOCC scheme does not use any timestamp counter. Transactions are started by adding itself into a global list of active transactions. Reads and writes are performed in the same way as in a BOCC transaction, that is, wrirtes are buffered in a local hash table, while reads hitting the local write set will be forwarded from the write set, not read from shared memory. On transaction validation, which happens once before commit, the validating transaction checks its write set against the read set of current active transactions. A conflict is detected if the read-write set intersection test results in non-empty set. To resolve conflicts, either the currently validating transaction aborts, or the (potentially many) conflicting transactions abort, granting a degree of flexibility here. The contention manager can even let the validating transaction wait for the conflict transaction to commit or abort before committing it (the paper does not explore this possibility, though). After validation, the transaction commits by copying speculative writes into shared memory.
The implementation of InvalSTM is based on the above description of BOCC, with certain restrictions for simplicity. First, InvalSTM does not allow concurrent transaction commits. Doing so prevents committing transactions from aborting each other, since a committing transaction may not know whether it has been aborted by another concurrent committing transaction before the latter finishes validation, which effectively requires some form of synchronization. Second, when validation is being performed on read sets of active transactions, it is required that these transactions not insert any new element into their read sets. Because otherwise, a transaction may proceed to read shared memory value after validiation is done successfully on its read set, and before the new values are committed to shared memory. This way, the transaction is serialized before the committing transaction by not observing its update, but may later serialize after the same transaction by overwriting a value that it wrote, creating a dependency cycle. The second form of serialization cannot be detected as committed transactions do not save their write sets. Third, in order to validate, the validating transaction must know the complete list of transactions that are currently active at the time it enters validation. This paper suggests that a linked list of transaction descriptors be used. During validation, no new transaction may join this list, since a new transaction may not be included in the validating transaction’s list for read-write check.
The following data structures are used by InvalSTM. First, a global commit lock is added to serialize transaction validation and commit. This lock is acquired before transaction attempts to commit, and released after all updated values are written. Note that the lock cannot be released before actual value commit, because otherwise, a later committing transaction may violate the serialization order by overwriting a value written by the former transaction. Second, a in-flight transaction linked list and an associated lock work together to serialize transaction begin with transaction validation and commit. Both the commit lock and the in-flight lock are acquired before a transaction can proceed to validate. On the other hand, transactions acquire the in-flight lock when they begin and removes themselves when they abort or are committed. Third, each transaction has a private transaction descriptor, which contains the following: A “valid” flag to indicate whether the transaction has been aborted by a validating transaction; A read-set and write-set implemented as fixed-length bloom filters. Using bloom filter for conflict detection makes validation a constant time operation, while making transactions vulnerable to false conflicts induced by address aliasing. In addition, each transaction has a private lock which is used to synchronize between the validation transaction and transactional reads. A hash table buffers new addresses for the write set, which is copied to the intended address on transaction commit.
On transaction begin, the transaction first acquires the in-flight lock, and adds itself into the in-flight linked list. All metadata in the descriptor is initialized, and the “valid” bit is set. On a transactional write, values are buffered by the hash table, and the write-set bloom filter is set to reflect the change. On a transactional read, the write hash table is first probed (maybe after probing the bloom filter and receiving a positive result) to search for dirty value written by the same transaction. If such value is found, then it is forwarded directly to the read without accessing global memory. If, on the other hand, the read has to access memory, then the private lock of the transaction is first acquired, and then the read is performed after setting the bloom filter. The lock is released after the read operation. For both reads and writes, the transaction should also check its “valid” bit in the descriptor, and abort if this bit is cleared.
On transaction validation, the commit lock and in-flight lock are both acquired. The validating transaction first validates itself by checking whether the “valid” bit is still on. Then the validating transaction acquires all private locks of all active transactions in the in-flight list. This blocks all active transactions from making any progress until the commit or abort. Next, the validating transaction performs a bitwise AND between its write set and the read sets of all active transactions. Should any non-empty intersection occurs, the identity of conflicting transactions and the current transaction are sent to the contention manager for conflict resolution. The contention manager can choose to abort the validating transaction or to abort all conflicting transactions. In the latter case, the “valid” bit of these transactions will be cleared, such that they will abort in the next read/write or validate operation. On validation success, the transaction will iterate over its write set, and apply all changes made earlier to shared memory. The commit finishes by releasing all acquired locks.
For read-only transactions, no read-write test is necessary since its write set is empty. Still, the read transaction should acquire commit lock to check whether its “valid” bit is on, since it is possible that a validating transaction just invalidates this read-only transaction right before the latter decides to commit. Neither in-flight lock nor private locks are acquired in this case, which results in fast commit of read-only transactions.