ASAP: Architecture Support for Asynchronous Persistence

- Hide Paper Summary

Paper Title: ASAP: Architecture Support for Asynchronous Persistence
Link: https://dl.acm.org/doi/10.1145/3470496.3527413
Year: ISCA 2022
Keyword: NVM; ASAP; Asynchronous Commit; Undo Logging

Back

Highlights:

Persistent transactions should be committed to the NVM image following both control dependency and data dependency. The former occurs when the same thread starts transactions. The latter occurs on RAW and WAW sequences from two different transactions on different threads (Note: WAR is not observed). These dependencies are essential for the recovered state to remain valid.
Dependency can be tracked by a per-cache line owner tag to indicate the most recent writer of the line. All later readers or writers establish data dependency with the first writer, and should be only committed after the first writer commits.
When dependency is established, the memory controller stores the dependencies in a dependency list, essentially building an DAG indicating the commit order. Persistent transactions are committed following this DAG’s topological sort order.
Order or commits of undo-logging transactions can be enforced by controlling the order that the undo logs are deleted. A transaction with its undo log still not deleted will be undone during recovery regardless of its data and log persistence progress.

Comments:

I wonder whether control dependency should be strictly followed (as ASAP does). Imagine two transactions on the same thread, one of them writes location X, the other one writes location Y. These two transactions have no dependency and can be committed entirely out-of-order. It does not matter whether one of them gets lost while the other one is undone during recovery. On the other hand, the persistency model assumed by this paper might be that the recovered state should be one that conforms to the logical ordering, which indeed requires control dependency to be followed.
I also wonder whether WAW should be tracked, if (1) There is no aliasing, and (2) The log entry also stores ordering information such that the recovery process will not roll back the data item if the data item has been written by a later, successfully committed transaction. I guess this is possible, but is too difficult to be tracked.

This paper proposes ASAP (which stands for Architectural Support for Asynchronous Persistence), a persistent transaction framework combining undo-logging and asynchronous transaction commit. The paper is motivated by the relatively inefficient commit process of undo-logging based transactions. With undo-logging, two store orderings must be observed. The first is the ordering between log entry and dirty data item, such that undo log entries must be persisted before the corresponding dirty data item does. The second is the ordering between dirty data and transaction commit, such that all dirty data generated by the transaction should be persisted before the transaction logically commits. The paper observes that the second ordering is particularly harmful for transaction performance, since the transaction cannot commit before all dirty blocks are written back into the NVM, resulting in long waiting period at the commit point. As a solution, it is more favorable to allow transactions to commit in the background by the hardware, while overlapping normal execution with the commit process. This removes the long waiting period from the critical path, and improves performance for NVM transactions.

Committing transactions in the background, however, will raise more complications than committing transactions synchronously. The paper points out two types of dependencies that would be violated under such a setting. The first is control dependency, which is established between transactions started by the same thread. These transactions must be committed strictly in the order that they are started, because otherwise, the system state after recovery would be inconsistent as no execution history could lead to such a state. The second type of dependency is data dependency, which is established by read-after-write (RAW) and write-after-write (WAW) on the same data item from different transactions on different threads. Data dependency must also be observed for transaction commit, because otherwise anomaly will occur after recovery. For RAW, if the writer transaction is uncommitted, while the reader is committed, then during recovery, the data item will be rolled back to its previous value before the writer transaction modifies it, indicating that the reader transaction has accessed a value that do not exist in the recovered NVM image. For WAW, if the first writer is uncommitted, while the second writer is committed, then during recovery, the undo log will be applied to the data item using the first writer’s log entry, which rolls back the data item to the value before the first writer modifies it. This will also lead to inconsistent state, because the second writer transaction has logically committed, but one of its writes is rolled back.

ASAP avoids the above anomalies while still allowing background transaction commit by tracking both types of dependencies as transactions are being executed. From a high level, this is achieved using two major hardware extensions. First, each cache block in the cache hierarchy is extended with a ownership field indicating the last writer of that cache line. Every transaction in the system is assigned a unique transaction ID, which is stored in the ownership tag when the cache block is modified. When another transaction attempts to read or write the same cache block, it will see a different ownership tag than its own transaction ID, and will hence record the current owner as a dependent transaction. The transaction will not be able to logically commit before all its dependent transactions commit. Control dependencies are added when a thread starts a new transaction between the new transaction and the most recent transaction the thread has started.

Second, when a transaction is ready to commit, hardware will write all its dirty data from the cache hierarchy back to the NVM, as in the standard undo-logging commit protocol. When this completes, hardware will further wait for all dependent transactions to commit, before the current one commits. A transaction is committed by first atomically clearing its undo log, meaning that the transaction will not be undone after this point, and then removing the transaction as a dependent transaction from every other outstanding transactions.

We next describe the operational details as follows. In addition to metadata mentioned in the previous paragraph, ASAP also adds per-thread metadata for tracking undo log status and the dirty write working set in the hierarchy. The L1 cache is extended with the hardware structures for storing these metadata. Each thread also has a a corresponding dependency list on the memory controller that tracks its dependent transactions and the write log head pointer. The memory controller is assumed to be in the persistence domain, meaning that data maintained by the memory controller will be persistent. These metadata is part of the context, and will be spilled and reloaded on process switches.

On transaction begin, a new transaction ID is allocated by concatenating the thread ID with a per-thread, monotonically increasing counter. The memory controller checks whether the same thread has an uncommitted transaction (by checking the thread ID part of all uncommitted transactions), and if true, adds that transaction’s ID to the new transaction’s dependency list. The new transaction is also allocated a new undo log (the head pointer of which is also sent to the memory controller for later use), and its dirty write set is cleared.

On transactional writes, the hardware checks whether the ownership tag of the block match the transaction’s ID. If they match, then no additional operation is performed. Otherwise, if the ownership tag does not match the transaction’s ID, meaning that it is the first time the block is written by the transaction, and that a WAW dependency is to be established, the hardware (1) generates an undo log entry containing pre-image of the block, and flushes the log back to the NVM immediately, (2) informs the memory controller to add the previous owner of the block to its dependent list, (3) updates the ownership tag to its own transaction ID, and (4) adds the block address to the transaction’s dirty write working set.

On transaction reads, the hardware also checks the ownership tag against the current transaction’s ID. If they do not match, the hardware informs the memory controller to add the previous owner of the block to its dependent list.

To avoid costly commit-time data write back, the hardware will also eagerly write cache blocks in the dirty write working set back to the NVM. The paper suggests that a block should be written back after it is written four times. When a block is written back, it will be removed from the dirty write working set.

Note that the ownership tag should be preserved even when a block is evicted from the cache hierarchy. The paper proposes that a small region of DRAM should be reserved to store the ownership tag of a block when it is evicted. The small region will be accessed to fetch the ownership tag when a block is read. As an optimization, ownership tags can be maintained only for those that currently belong to an uncommitted transaction. When the system enters quiescent state, i.e., a state where no uncommitted transaction is outstanding, all ownership tags stored in the DRAM can be cleared, and future accesses do not need to access the ownership tag. To reduce DRAM traffic for accessing the ownership tag, the set of addresses that have an ownership tag in the DRAM is approximately summarized with a bloom filter, which is cleared when the system enter quiescent state.

On transaction end, the hardware generates requests to write all blocks in the dirty write working set back to the NVM, and clears all transaction metadata except the write log. Transaction end does not wait for commit, and is hence an instant operation. New transactions can be started on the same thread while the old transactions are being committed in the background. When the write back of the dirty write working set completes, the transaction is ready to commit. The memory controller then monitors the transaction’s dependency list, and logically commits the transaction by discarding its write log when the list becomes empty. On transaction commit, the transaction ID is removed from the dependency list of every other uncommitted transaction.

On recovery, the memory controller scans the dependency list of all uncommitted transactions (recall the list is in the persistence domain). For uncommitted transactions, a DAG is built with regard to the dependency relation stored in the lists. These transactions are then undone using their undo log entries in the reverse order.